Validity of the Internal-External Scale in its Relationship with Political Position
ERIC Educational Resources Information Center
Silvern, Louise
1975-01-01
Previous studies have shown a relationship between left-wing political beliefs and externality on Rotter's Scale. An examination of the validity of Rotter's Scale in relation to political position found no evidence relating political position to locus of control. (DEP)
An Experimental Study of the Internal Consistency of Judgments Made in Bookmark Standard Setting
ERIC Educational Resources Information Center
Clauser, Brian E.; Baldwin, Peter; Margolis, Melissa J.; Mee, Janet; Winward, Marcia
2017-01-01
Validating performance standards is challenging and complex. Because of the difficulties associated with collecting evidence related to external criteria, validity arguments rely heavily on evidence related to internal criteria--especially evidence that expert judgments are internally consistent. Given its importance, it is somewhat surprising…
Fooken, Jonas
2017-03-10
The present study investigates the external validity of emotional value measured in economic laboratory experiments by using a physiological indicator of stress, heart rate variability (HRV). While there is ample evidence supporting the external validity of economic experiments, there is little evidence comparing the magnitude of internal levels of emotional stress during decision making with external stress. The current study addresses this gap by comparing the magnitudes of decision stress experienced in the laboratory with the stress from outside the laboratory. To quantify a large change in HRV, measures observed in the laboratory during decision-making are compared to the difference between HRV during a university exam and other mental activity for the same individuals in and outside of the laboratory. The results outside the laboratory inform about the relevance of laboratory findings in terms of their relative magnitude. Results show that psychologically induced HRV changes observed in the laboratory, particularly in connection with social preferences, correspond to large effects outside. This underscores the external validity of laboratory findings and shows the magnitude of emotional value connected to pro-social economic decisions in the laboratory.
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy P.; Harris, Pamela J.; Menzies, Holly Mariah; Cox, Meredith; Lambert, Warren
2012-01-01
We report findings of an exploratory validation study of a revised instrument: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE). The SRSS-IE was modified to include seven additional items reflecting characteristics of internalizing behaviors, with proposed items generated from the current literature base, review of…
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Carter, Erik W.; Lambert, Warren E.; Jenkins, Abbie B.
2013-01-01
We reported findings of an exploratory validation study of a revised universal screening instrument: the Student Risk Screening Scale--Internalizing and Externalizing (SRSS-IE) for use with middle school students. Tested initially for use with elementary-age students, the SRSS-IE was adapted to include seven additional items reflecting…
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Menzies, Holly M.; Oakes, Wendy P.; Lambert, Warren; Cox, Meredith; Hankins, Katy
2012-01-01
We report findings of two studies, one conducted in a rural school district (N = 982) and a second conducted in an urban district (N = 1,079), offering additional evidence of the reliability and validity of a revised instrument, the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE), to accurately detect internalizing and…
López-Jáuregui, Alicia; Oliden, Paula Elosua
2009-11-01
The aim of this study is to adapt the ESPA29 scale of parental socialization styles in adolescence to the Basque language. The study of its psychometric properties is based on the search for evidence of internal and external validity. The first focuses on the assessment of the dimensionality of the scale by means of exploratory factor analysis. The relationship between the dimensions of parental socialization styles and gender and age guarantee the external validity of the scale. The study of the equivalence of the adapted and original versions is based on the comparisons of the reliability coefficients and on factor congruence. The results allow us to conclude the equivalence of the two scales.
Majumdar, Subhabrata; Basak, Subhash C
2018-04-26
Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but a large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has a high value of p but a small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out (LOO), K-fold, external and multi-split validation, applied to statistical models built with LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright © Bentham Science Publishers.
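The contrast drawn in this abstract between leave-one-out and single-split external validation can be illustrated with a minimal sketch. It rests on loud assumptions: the data are synthetic, the model is a one-predictor ordinary least squares line standing in for the paper's LASSO, and all function names (`fit_line`, `loo_mse`, `external_mse`) are hypothetical. It reproduces none of the paper's datasets or results.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in, not the paper's LASSO)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def mse(model, xs, ys):
    """Mean squared prediction error of a fitted (intercept, slope) pair."""
    a, b = model
    return statistics.fmean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])

def loo_mse(xs, ys):
    """Leave-one-out: each sample is predicted by a model fit on all the others."""
    errs = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        errs.append(mse(fit_line(train_x, train_y), [xs[i]], [ys[i]]))
    return statistics.fmean(errs)

def external_mse(xs, ys, rng, test_fraction=0.25):
    """One random train/test split; the held-out part plays the 'external' set."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = max(1, int(len(xs) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return mse(model, [xs[i] for i in test], [ys[i] for i in test])

# Synthetic small-sample data (assumption: linear signal plus Gaussian noise).
rng = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

loo = loo_mse(xs, ys)
splits = [external_mse(xs, ys, rng) for _ in range(50)]
print(f"LOO MSE: {loo:.3f}")
print(f"external MSE over 50 random splits: min={min(splits):.3f}, max={max(splits):.3f}")
```

For a fixed dataset the LOO estimate is a single deterministic number, whereas the split-based estimate changes with each random partition. The spread between the minimum and maximum printed on the second line shows the kind of split-to-split variation the abstract describes.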
The Utrecht questionnaire (U-CEP) measuring knowledge on clinical epidemiology proved to be valid.
Kortekaas, Marlous F; Bartelink, Marie-Louise E L; de Groot, Esther; Korving, Helen; de Wit, Niek J; Grobbee, Diederick E; Hoes, Arno W
2017-02-01
Knowledge on clinical epidemiology is crucial to practice evidence-based medicine. We describe the development and validation of the Utrecht questionnaire on knowledge on Clinical epidemiology for Evidence-based Practice (U-CEP); an assessment tool to be used in the training of clinicians. The U-CEP was developed in two formats: two sets of 25 questions and a combined set of 50. The validation was performed among postgraduate general practice (GP) trainees, hospital trainees, GP supervisors, and experts. Internal consistency, internal reliability (item-total correlation), item discrimination index, item difficulty, content validity, construct validity, responsiveness, test-retest reliability, and feasibility were assessed. The questionnaire was externally validated. Internal consistency was good with a Cronbach alpha of 0.8. The median item-total correlation and mean item discrimination index were satisfactory. Both sets were perceived as relevant to clinical practice. Construct validity was good. Both sets were responsive but failed on test-retest reliability. One set took 24 minutes and the other 33 minutes to complete, on average. External GP trainees had comparable results. The U-CEP is a valid questionnaire to assess knowledge on clinical epidemiology, which is a prerequisite for practicing evidence-based medicine in daily clinical practice. Copyright © 2016 Elsevier Inc. All rights reserved.
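The internal-consistency statistic reported here, Cronbach's alpha, follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is illustrative only: the response matrix is invented, the function name `cronbach_alpha` is hypothetical, and nothing in it comes from the U-CEP data.

```python
import statistics

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [statistics.variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Invented example: 5 respondents answering 4 right/wrong items (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values near 1 indicate that the items vary together. The abstract's reported value of 0.8 would conventionally be read as good internal consistency.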
Measuring Long-Distance Romantic Relationships: A Validity Study
ERIC Educational Resources Information Center
Pistole, M. Carole; Roberts, Amber
2011-01-01
This study investigated aspects of construct validity for the scores of a new long-distance romantic relationship measure. A single-factor structure of the long-distance romantic relationship index emerged, with convergent and discriminant evidence of external validity, high internal consistency reliability, and applied utility of the scores.…
Externalizing disorders: cluster 5 of the proposed meta-structure for DSM-V and ICD-11.
Krueger, R F; South, S C
2009-12-01
The extant major psychiatric classifications DSM-IV and ICD-10 are purportedly atheoretical and largely descriptive. Although this achieves good reliability, the validity of a medical diagnosis is greatly enhanced by an understanding of the etiology. In an attempt to group mental disorders on the basis of etiology, five clusters have been proposed. We consider the validity of the fifth cluster, externalizing disorders, within this proposal. We reviewed the literature in relation to 11 validating criteria proposed by the Study Group of the DSM-V Task Force, in terms of the extent to which these criteria support the idea of a coherent externalizing spectrum of disorders. This cluster distinguishes itself by the central role of disinhibitory personality in mental disorders spread throughout sections of the current classifications, including substance dependence, antisocial personality disorder and conduct disorder. Shared biomarkers, co-morbidity and course offer additional evidence for a valid cluster of externalizing disorders. Externalizing disorders meet many of the salient criteria proposed by the Study Group of the DSM-V Task Force to suggest a classification cluster.
Lee, Jin; Huang, Yueng-hsiang; Robertson, Michelle M; Murphy, Lauren A; Garabet, Angela; Chang, Wen-Ruey
2014-02-01
The goal of this study was to examine the external validity of a 12-item generic safety climate scale for lone workers in order to evaluate the appropriateness of generalized use of the scale in the measurement of safety climate across various lone work settings. External validity evidence was established by investigating the measurement equivalence (ME) across different industries and companies. Confirmatory factor analysis (CFA)-based and item response theory (IRT)-based perspectives were adopted to examine the ME of the generic safety climate scale for lone workers across 11 companies from the trucking, electrical utility, and cable television industries. Fairly strong evidence of ME was observed for both organization- and group-level generic safety climate sub-scales. Although significant invariance was observed in the item intercepts across the different lone work settings, absolute model fit indices remained satisfactory in the most robust step of CFA-based ME testing. IRT-based ME testing identified only one differentially functioning item from the organization-level generic safety climate sub-scale, but its impact was minimal and strong ME was supported. The generic safety climate scale for lone workers reported good external validity and supported the presence of a common feature of safety climate among lone workers. The scale can be used as an effective safety evaluation tool in various lone work situations. Copyright © 2013 Elsevier Ltd. All rights reserved.
Venables, Noah C.; Patrick, Christopher J.
2013-01-01
The Externalizing Spectrum Inventory (ESI; Krueger, Markon, Patrick, Benning, & Kramer, 2007) provides a self-report based method for indexing a range of correlated problem behaviors and traits in the domain of deficient impulse control. The ESI organizes lower-order behaviors and traits of this kind around higher-order factors encompassing general disinhibitory proneness, callous-aggression, and substance abuse. The current study used data from a male prisoner sample (N = 235) to evaluate the validity of ESI total and factor scores in relation to external criterion measures consisting of externalizing disorder symptoms (including child and adult antisocial deviance and substance-related problems) assessed via diagnostic interview, personality traits assessed by self-report, and psychopathic features as assessed by both interview and self-report. Results provide evidence for the validity of the ESI measurement model and point to its potential utility as a referent for research on the neurobiological correlates and etiological bases of externalizing proneness. PMID:21787091
Venables, Noah C; Patrick, Christopher J
2012-03-01
The Externalizing Spectrum Inventory (ESI; Krueger, Markon, Patrick, Benning, & Kramer, 2007) provides a self-report based method for indexing a range of correlated problem behaviors and traits in the domain of deficient impulse control. The ESI organizes lower order behaviors and traits of this kind around higher order factors encompassing general disinhibitory proneness, callous-aggression, and substance abuse. In the current study, we used data from a male prisoner sample (N = 235) to evaluate the validity of ESI total and factor scores in relation to external criterion measures consisting of externalizing disorder symptoms (including child and adult antisocial deviance and substance-related problems) assessed via diagnostic interviews, personality traits assessed with self-reports, and psychopathic features as assessed with both interviews and self-reports. Results provide evidence for the validity of the ESI measurement model and point to its potential usefulness as a referent for research on the neurobiological correlates and etiological bases of externalizing proneness.
The bottom-up approach to integrative validity: a new perspective for program evaluation.
Chen, Huey T
2010-08-01
The Campbellian validity model and the traditional top-down approach to validity have had a profound influence on research and evaluation. That model includes the concepts of internal and external validity and within that model, the preeminence of internal validity as demonstrated in the top-down approach. Evaluators and researchers have, however, increasingly recognized that in an evaluation, the over-emphasis on internal validity reduces that evaluation's usefulness and contributes to the gulf between academic and practical communities regarding interventions. This article examines the limitations of the Campbellian validity model and the top-down approach and provides a comprehensive, alternative model, known as the integrative validity model for program evaluation. The integrative validity model includes the concept of viable validity, which is predicated on a bottom-up approach to validity. This approach better reflects stakeholders' evaluation views and concerns, makes external validity workable, and becomes therefore a preferable alternative for evaluation of health promotion/social betterment programs. The integrative validity model and the bottom-up approach enable evaluators to meet scientific and practical requirements, facilitate in advancing external validity, and gain a new perspective on methods. The new perspective also furnishes a balanced view of credible evidence, and offers an alternative perspective for funding. Copyright (c) 2009 Elsevier Ltd. All rights reserved.
Instrumental and statistical methods for the comparison of class evidence
NASA Astrophysics Data System (ADS)
Liszewski, Elisa Anne
Trace evidence is a major field within forensic science. Association of trace evidence samples can be problematic due to sample heterogeneity and a lack of quantitative criteria for comparing spectra or chromatograms. The aim of this study is to evaluate different types of instrumentation for their ability to discriminate among samples of various types of trace evidence. Chemometric analysis, including techniques such as Agglomerative Hierarchical Clustering, Principal Components Analysis, and Discriminant Analysis, was employed to evaluate instrumental data. First, automotive clear coats were analyzed by using microspectrophotometry to collect UV absorption data. In total, 71 samples were analyzed with classification accuracy of 91.61%. An external validation was performed, resulting in a prediction accuracy of 81.11%. Next, fiber dyes were analyzed using UV-Visible microspectrophotometry. While several physical characteristics of cotton fiber can be identified and compared, fiber color is considered to be an excellent source of variation, and thus was examined in this study. Twelve dyes were employed, some being visually indistinguishable. Several different analyses and comparisons were done, including an inter-laboratory comparison and external validations. Lastly, common plastic samples and other polymers were analyzed using pyrolysis-gas chromatography/mass spectrometry, and their pyrolysis products were then analyzed using multivariate statistics. The classification accuracy varied dependent upon the number of classes chosen, but the plastics were grouped based on composition. The polymers were used as an external validation and misclassifications occurred with chlorinated samples all being placed into the category containing PVC.
Identifying and Evaluating External Validity Evidence for Passing Scores
ERIC Educational Resources Information Center
Davis-Becker, Susan L.; Buckendahl, Chad W.
2013-01-01
A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1)…
Interaction of Theory and Practice to Assess External Validity.
Leviton, Laura C; Trujillo, Mathew D
2016-01-18
Variations in local context bedevil the assessment of external validity: the ability to generalize about effects of treatments. For evaluation, the challenges of assessing external validity are intimately tied to the translation and spread of evidence-based interventions. This makes external validity a question for decision makers, who need to determine whether to endorse, fund, or adopt interventions that were found to be effective and how to ensure high quality once they spread. This article presents the rationale for using theory to assess external validity and the value of more systematic interaction of theory and practice. We review advances in external validity, program theory, practitioner expertise, and local adaptation. Examples are provided for program theory, its adaptation to diverse contexts, and generalizing to contexts that have not yet been studied. The often critical role of practitioner experience is illustrated in these examples. Work is described that the Robert Wood Johnson Foundation is supporting to study treatment variation and context more systematically. Researchers and developers generally see a limited range of contexts in which the intervention is implemented. Individual practitioners see a different and often a wider range of contexts, albeit not a systematic sample. Organized and taken together, however, practitioner experiences can inform external validity by challenging the developers and researchers to consider a wider range of contexts. Researchers have developed a variety of ways to adapt interventions in light of such challenges. In systematic programs of inquiry, as opposed to individual studies, the problems of context can be better addressed. Evaluators have advocated an interaction of theory and practice for many years, but the process can be made more systematic and useful.
Systematic interaction can set priorities for assessment of external validity by examining the prevalence and importance of context features and treatment variations. Practitioner interaction with researchers and developers can assist in sharpening program theory, reducing uncertainty about treatment variations that are consistent or inconsistent with the theory, inductively ruling out the ones that are harmful or irrelevant, and helping set priorities for more rigorous study of context and treatment variation. © The Author(s) 2016.
Miciak, Jeremy; Fletcher, Jack M.; Stuebing, Karla; Vaughn, Sharon; Tolar, Tammy D.
2014-01-01
Purpose: Few empirical investigations have evaluated LD identification methods based on a pattern of cognitive strengths and weaknesses (PSW). This study investigated the reliability and validity of two proposed PSW methods: the concordance/discordance method (C/DM) and cross battery assessment (XBA) method. Methods: Cognitive assessment data for 139 adolescents demonstrating inadequate response to intervention were used to empirically classify participants as meeting or not meeting PSW LD identification criteria using the two approaches, permitting an analysis of: (1) LD identification rates; (2) agreement between methods; and (3) external validity. Results: LD identification rates varied between the two methods depending upon the cut point for low achievement, with low agreement for LD identification decisions. Comparisons of groups that met and did not meet LD identification criteria on external academic variables were largely null, raising questions of external validity. Conclusions: This study found low agreement and little evidence of validity for LD identification decisions based on PSW methods. An alternative may be to use multiple measures of academic achievement to guide intervention. PMID: 24274155
Psychopathy in Bulgaria: The cross-cultural generalizability of the Hare Psychopathy Checklist
Wilson, Michael J.; Abramowitz, Carolyn; Vasilev, Georgi; Bozgunov, Kiril; Vassileva, Jasmin
2014-01-01
The generalizability of the psychopathy construct to Eastern European cultures has not been well-studied, and no prior studies have evaluated psychopathy in non-offender samples from this population. The current validation study examines the factor structure, internal consistency, and external validity of the Bulgarian translation of the Hare Psychopathy Checklist: Screening Version. Two hundred sixty-two Bulgarian adults from the general community were assessed, of which 185 had a history of substance dependence. Confirmatory factor analysis indicated good fit for the two-, three-, and four-factor models of psychopathy. Zero-order and partial correlation analyses were conducted between the two factors of psychopathy and criterion measures of antisocial behavior, internalizing and externalizing psychopathology, personality traits, addictive disorders and demographic characteristics. Relationships to external variables provided evidence for the convergent and discriminant validity of the psychopathy construct in a Bulgarian community sample. PMID:25313268
Battisti, Nicolò Matteo Luca; Sehovic, Marina; Extermann, Martine
2017-09-01
Non-small-cell lung cancer (NSCLC) is a disease of the elderly, who are under-represented in clinical trials. This challenges the external validity of the evidence base for its management and of current guidelines, which we evaluated in a population of older patients. We retrieved randomized clinical trials (RCTs) supporting the guidelines and identified 18 relevant topics. We matched a cohort of NSCLC patients aged older than 80 years from the Moffitt Cancer Center database with the studies' eligibility criteria to check their qualification for at least 2 studies. Eligibility > 60% was rated full validity, 30% to 60% partial validity, and < 30% limited validity. We obtained data from 760 elderly patients in stage-adjusted groups and collected 244 RCTs from the National Comprehensive Cancer Network (NCCN) and 148 from the European Society for Medical Oncology (ESMO) guidelines. External validity was deemed insufficient for neoadjuvant chemotherapy in stage III disease (27.37% and 25.26% of patients eligible for NCCN and ESMO guidelines, respectively) and use of bevacizumab (13.86% and 16.27% of patients eligible). For ESMO guidelines, it was inadequate regarding double-agent chemotherapy (25.90% of patients eligible), its duration (24.10%) and therapy for Eastern Cooperative Oncology Group performance status 2 patients (17.74%). For NCCN guidelines, external validity was lacking for neoadjuvant chemoradiotherapy in stage IIIA disease (25.86% of patients eligible). Our analysis highlighted the effect of RCT eligibility criteria on guidelines' external validity in elderly patients. Eligibility criteria should be carefully considered in trial design and more studies that do not exclude elderly patients should be included in guidelines. Copyright © 2017 Elsevier Inc. All rights reserved.
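The rating rule stated in this abstract (eligibility above 60% is full validity, 30% to 60% partial, below 30% limited) can be written as a small function. This is a sketch, not the authors' code: the function name `validity_rating` is hypothetical, and treating exactly 30% and exactly 60% as "partial" is an assumption about how the boundaries are handled.

```python
def validity_rating(percent_eligible):
    """Map the share of patients eligible for the supporting trials to a rating.

    Thresholds follow the abstract; boundary handling at exactly 30 and 60
    is an assumption made for this sketch.
    """
    if percent_eligible > 60:
        return "full"
    if percent_eligible >= 30:
        return "partial"
    return "limited"

# Eligibility percentages quoted in the abstract.
reported = {
    "neoadjuvant chemotherapy, stage III (NCCN)": 27.37,
    "neoadjuvant chemotherapy, stage III (ESMO)": 25.26,
    "bevacizumab (NCCN)": 13.86,
    "bevacizumab (ESMO)": 16.27,
}
for topic, pct in reported.items():
    print(f"{topic}: {pct}% eligible, {validity_rating(pct)} validity")
```

All four quoted figures fall below 30%, which matches the abstract's description of external validity as insufficient for these topics.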
Additional Evidence of Convergent Validity between SRSS-IE and SSiS-PSG Scores
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Ennis, Robin Parks; Royer, David James
2015-01-01
We report findings of a validity study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG; Elliott & Gresham, 2007). Participants were 1,680 kindergarten through sixth-grade elementary students from three…
Ghorbani, Nima; Watson, P J
2005-06-01
This study examined the incremental validity of Hardiness scales in a sample of Iranian managers. Along with measures of the Five Factor Model and of Organizational and Psychological Adjustment, Hardiness scales were administered to 159 male managers (M age = 39.9, SD = 7.5) who had worked in their organizations for a mean of 7.9 years (SD = 5.4). Hardiness predicted greater Job Satisfaction, higher Organization-based Self-esteem, and perceptions of the work environment as being less stressful and constraining. Hardiness also correlated positively with Assertiveness, Emotional Stability, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness and negatively with Depression, Anxiety, Perceived Stress, Chance External Control, and Powerful Others External Control. Evidence of incremental validity was obtained when the Hardiness scales supplemented the Five Factor Model in predicting organizational and psychological adjustment. These data documented the incremental validity of the Hardiness scales in a non-Western sample and thus confirmed once again that Hardiness has a relevance that extends beyond the culture in which it was developed.
Valid and Reliable Science Content Assessments for Science Teachers
NASA Astrophysics Data System (ADS)
Tretter, Thomas R.; Brown, Sherri L.; Bush, William S.; Saderholm, Jon C.; Holmes, Vicki-Lynn
2013-03-01
Science teachers' content knowledge is an important influence on student learning, highlighting an ongoing need for programs, and assessments of those programs, designed to support teacher learning of science. Valid and reliable assessments of teacher science knowledge are needed for direct measurement of this crucial variable. This paper describes multiple sources of validity and reliability (Cronbach's alpha greater than 0.8) evidence for physical, life, and earth/space science assessments, which are part of the Diagnostic Teacher Assessments of Mathematics and Science (DTAMS) project. Validity was strengthened by systematic synthesis of relevant documents, extensive use of external reviewers, and field tests with 900 teachers during the assessment development process. Subsequent results from 4,400 teachers, analyzed with Rasch IRT modeling techniques, offer construct and concurrent validity evidence.
Testing the role of external debt in environmental degradation: empirical evidence from Turkey.
Katircioglu, Salih; Celebi, Aysem
2018-03-01
This study investigates the role of external debt stock in Turkey, which has suffered from heavy (external and domestic) debt stock for many years. Annual data from 1960 to 2013 were analyzed using time series analysis. The results confirm the validity of the conventional environmental Kuznets curve (EKC) in the case of Turkey. However, this study also found that Turkey's external debt stock did not influence the Turkish economy's long-term EKC behavior. Fortunately, the results suggest that there are important interactions among external debt stock, CO2 emissions, energy consumption, and real income; that is, changes in external debt volume precede changes in these aggregates' volumes.
ERIC Educational Resources Information Center
Lewis, Scott E.
2014-01-01
Validity of educational research instruments and student assessments has appropriately become a growing interest in the chemistry education research community. Of particular concern is an attention to the consequences to students that result from the interpretation of assessment scores and whether those consequences are swayed by invalidity within…
Validation of the Seating and Mobility Script Concordance Test
ERIC Educational Resources Information Center
Cohen, Laura J.; Fitzgerald, Shirley G.; Lane, Suzanne; Boninger, Michael L.; Minkel, Jean; McCue, Michael
2009-01-01
The purpose of this study was to develop the scoring system for the Seating and Mobility Script Concordance Test (SMSCT), obtain and appraise internal and external structure evidence, and assess the validity of the SMSCT. The SMSCT purpose is to provide a method for testing knowledge of seating and mobility prescription. A sample of 106 therapists…
Behrens, Johann
2010-01-01
Evidence-based Medicine (EbM) is the ongoing self-reflection of an individualised approach to medicine in terms of a science that originates from and focuses on clinical decision-making (pragmatic science="Handlungswissenschaft"). EbM is particularly suitable for self-reflecting individualised medicine on the basis of decision-oriented pragmatic science because it consistently distinguishes between external evidence (i.e., other subjects' experience gained through "qualitative" and "quantitative" scientific methods) and internal evidence, i.e., the individual user's, or patient's, own experience manifesting and developing in the individual contact between therapist and patient. Therefore, internal evidence is completely different from the individual clinical experience, expertise, and conviction which therapists contribute to the encounter with clients. A deeper understanding of internal evidence as a result of this encounter has emerged only in the past 15 years. However, it is an integral part of the logic of evidence-based professional decision-making. Scientifically justified beneficial and effective treatment in the individual case cannot be deduced from external evidence but can only be gathered from internal evidence for which the best external evidence available has been utilised. In the past 15 years nursing science has not only carved out the decision-oriented scientific core of evidence-based practice but has also tried to increase the validity of studies on external evidence by employing a combination of 'qualitative' social science studies and clinical epidemiological methods. Copyright © 2010. Published by Elsevier GmbH.
Phillips, Kaye; Müller-Clemm, Werner; Ysselstein, Margaretha; Sachs, Jonathan
2013-02-01
Including context in the measurement and evaluation of health inequity interventions is critical to understanding how events that occur in an intervention's environment might contribute to or impede its success. This study adapted and piloted a contextual validity assessment framework on a selection of health inequity-related programs funded by the Canadian Health Services Research Foundation (CHSRF) between 1998 and 2006. The two overarching objectives of this study were (1) to determine the relative amount and quality of attention given to conceptualizing, measuring and validating context within CHSRF-funded research final reports related to health inequity; and (2) to contribute evaluative evidence towards the incorporation of context into the assessment and measurement of health inequity interventions. The study found that, of the 42 of 146 CHSRF programs and projects judged to be related to health inequity, 20 adequately reported on the conceptualization, measurement and validation of context. Amongst these health inequity-related project reports, the greatest emphasis was placed on describing the socio-political and economic context rather than on actually measuring and validating contextual evidence. Applying a contextual validity assessment framework was useful for distinguishing between the descriptive (conceptual) and the empirical (measurement and validation) inclusion of documented contextual evidence. Although contextual validity measurement frameworks need further development, this study contributes insight into identifying funded research related to health inequities and preliminary criteria for assessing interventions targeted at specific populations and jurisdictions. This study also feeds a larger critical dialogue (albeit beyond the scope of this study) regarding the relevance and utility of using evaluative techniques for understanding how specific external conditions support or impede the successful implementation of health inequity interventions.
Copyright © 2012 Elsevier Ltd. All rights reserved.
Achieving external validity in home advantage research: generalizing crowd noise effects
Myers, Tony D.
2014-01-01
Different factors have been postulated to explain the home advantage phenomenon in sport. One plausible explanation investigated has been the influence of a partisan home crowd on sports officials' decisions. Different types of studies have tested the crowd influence hypothesis, including purposefully designed experiments. However, while experimental studies investigating crowd influences have high levels of internal validity, they suffer from a lack of external validity, with decision-making in a laboratory setting bearing little resemblance to decision-making in live sports settings. This focused review initially considers threats to external validity in applied and theoretical experimental research. It then discusses how such threats can be addressed using representative design, focusing on a recently published study that arguably provides the first experimental evidence of the impact of live crowd noise on officials in sport. The findings of this controlled experiment conducted in a real tournament setting offer a level of confirmation of the findings of laboratory studies in the area. Finally, directions for future research and the future conduct of crowd noise studies are discussed. PMID:24917839
Olbert, Charles M.
2013-01-01
It is unknown whether measures adapted from social neuroscience linked to specific neural systems will demonstrate relationships to external variables. Four paradigms adapted from social neuroscience were administered to 173 clinically stable outpatients with schizophrenia to determine their relationships to functionally meaningful variables and to investigate their incremental validity beyond standard measures of social and nonsocial cognition. The 4 paradigms included 2 that assess perception of nonverbal social and action cues (basic biological motion and emotion in biological motion) and 2 that involve higher level inferences about self and others’ mental states (self-referential memory and empathic accuracy). Overall, social neuroscience paradigms showed significant relationships to functional capacity but weak relationships to community functioning; the paradigms also showed weak correlations to clinical symptoms. Evidence for incremental validity beyond standard measures of social and nonsocial cognition was mixed with additional predictive power shown for functional capacity but not community functioning. Of the newly adapted paradigms, the empathic accuracy task had the broadest external validity. These results underscore the difficulty of translating developments from neuroscience into clinically useful tasks with functional significance. PMID:24072806
Olbert, Charles M; Penn, David L; Kern, Robert S; Lee, Junghee; Horan, William P; Reise, Steven P; Ochsner, Kevin N; Marder, Stephen R; Green, Michael F
2013-11-01
It is unknown whether measures adapted from social neuroscience linked to specific neural systems will demonstrate relationships to external variables. Four paradigms adapted from social neuroscience were administered to 173 clinically stable outpatients with schizophrenia to determine their relationships to functionally meaningful variables and to investigate their incremental validity beyond standard measures of social and nonsocial cognition. The 4 paradigms included 2 that assess perception of nonverbal social and action cues (basic biological motion and emotion in biological motion) and 2 that involve higher level inferences about self and others' mental states (self-referential memory and empathic accuracy). Overall, social neuroscience paradigms showed significant relationships to functional capacity but weak relationships to community functioning; the paradigms also showed weak correlations to clinical symptoms. Evidence for incremental validity beyond standard measures of social and nonsocial cognition was mixed with additional predictive power shown for functional capacity but not community functioning. Of the newly adapted paradigms, the empathic accuracy task had the broadest external validity. These results underscore the difficulty of translating developments from neuroscience into clinically useful tasks with functional significance.
Assessing the generalizability of randomized trial results to target populations.
Stuart, Elizabeth A; Bradshaw, Catherine P; Leaf, Philip J
2015-04-01
Recent years have seen increasing interest in and attention to evidence-based practices, where the "evidence" generally comes from well-conducted randomized trials. However, while those trials yield accurate estimates of the effect of the intervention for the participants in the trial (known as "internal validity"), they do not always yield relevant information about the effects in a particular target population (known as "external validity"). This may be due to a lack of specification of a target population when designing the trial, difficulties recruiting a sample that is representative of a prespecified target population, or to interest in considering a target population somewhat different from the population directly targeted by the trial. This paper first provides an overview of existing design and analysis methods for assessing and enhancing the ability of a randomized trial to estimate treatment effects in a target population. It then provides a case study using one particular method, which weights the subjects in a randomized trial to match the population on a set of observed characteristics. The case study uses data from a randomized trial of school-wide positive behavioral interventions and supports (PBIS); our interest is in generalizing the results to the state of Maryland. In the case of PBIS, after weighting, estimated effects in the target population were similar to those observed in the randomized trial. The paper illustrates that statistical methods can be used to assess and enhance the external validity of randomized trials, making the results more applicable to policy and clinical questions. However, there are also many open research questions; future research should focus on questions of treatment effect heterogeneity and further developing these methods for enhancing external validity. 
Researchers should think carefully about the external validity of randomized trials and be cautious about extrapolating results to specific populations unless they are confident of the similarity between the trial sample and that target population.
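The weighting idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a single categorical covariate and uses simple post-stratification weights (population share divided by trial share), and every name and number in it (`stratum_weights`, `weighted_effect`, the "urban"/"rural" strata, the outcomes) is hypothetical.

```python
from collections import Counter


def stratum_weights(trial_strata, population_strata):
    """Weight per stratum = population share of stratum / trial share of stratum."""
    trial_counts = Counter(trial_strata)
    pop_counts = Counter(population_strata)
    n_trial = len(trial_strata)
    n_pop = len(population_strata)
    return {
        s: (pop_counts[s] / n_pop) / (trial_counts[s] / n_trial)
        for s in trial_counts
    }


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_effect(records, weights_by_stratum):
    """Weighted difference in mean outcome, treated minus control.

    records: list of (stratum, treated, outcome) tuples.
    """
    treated = [(y, weights_by_stratum[s]) for s, t, y in records if t]
    control = [(y, weights_by_stratum[s]) for s, t, y in records if not t]
    mean_t = weighted_mean([y for y, _ in treated], [w for _, w in treated])
    mean_c = weighted_mean([y for y, _ in control], [w for _, w in control])
    return mean_t - mean_c


# Hypothetical trial: urban subjects are over-represented (6 of 8),
# and the treatment effect is larger in the rural stratum.
records = (
    [("urban", True, 12.0)] * 3
    + [("urban", False, 10.0)] * 3
    + [("rural", True, 16.0)]
    + [("rural", False, 10.0)]
)
# Hypothetical target population: half urban, half rural.
population = ["urban"] * 50 + ["rural"] * 50

weights = stratum_weights([s for s, _, _ in records], population)
effect_unweighted = weighted_effect(records, {"urban": 1.0, "rural": 1.0})
effect_weighted = weighted_effect(records, weights)
print(effect_unweighted, effect_weighted)
```

In this toy data the weighted estimate moves toward the rural stratum's larger effect because rural subjects are up-weighted. With several covariates, weights are more commonly derived from a model of trial participation rather than from raw stratum shares.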
The Nature of Science Instrument-Elementary (NOSI-E): the end of the road?
Peoples, Shelagh M; O'Dwyer, Laura M
2014-01-01
This research continues prior work published in this journal (Peoples, O'Dwyer, Shields and Wang, 2013). The first paper described the scale development, psychometric analyses and part-validation of a theoretically-grounded Rasch-based instrument, the Nature of Science Instrument-Elementary (NOSI-E). The NOSI-E was designed to measure elementary students' understanding of the Nature of Science (NOS). In the first paper, evidence was provided for three of the six validity aspects (content, substantive and generalizability) needed to support the construct validity of the NOSI-E. The research described in this paper examines two additional validity aspects (structural and external). The purpose of this study was to determine which of three competing internal models provides reliable, interpretable, and responsive measures of students' understanding of NOS. One postulate is that the NOS construct is unidimensional; alternatively, the NOS construct is composed of five independent unidimensional constructs (the consecutive approach). Lastly, the NOS construct is multidimensional and composed of five inter-related but separate dimensions. The vast body of evidence supported the claim that the NOS construct is multidimensional. Measures from the multidimensional model were positively related to student science achievement and students' perceptions of their classroom environment; this provided supporting evidence for the external validity aspect of the NOS construct. As US science education moves toward students learning science through engaging in authentic scientific practices and building learning progressions (NRC, 2012), it will be important to assess whether this new approach to teaching science is effective, and the NOSI-E may be used as a measure of the impact of this reform.
McGoey, Tara; Root, Zach; Bruner, Mark W; Law, Barbi
2016-01-01
Existing reviews of physical activity (PA) interventions designed to increase PA behavior exclusively in children (ages 5 to 11 years) focus primarily on the efficacy (e.g., internal validity) of the interventions without addressing the applicability of the results in terms of generalizability and translatability (e.g., external validity). This review used the RE-AIM (Reach, Efficacy/Effectiveness, Adoption, Implementation, Maintenance) framework to measure the degree to which randomized and non-randomized PA interventions in children report on internal and external validity factors. A systematic search for controlled interventions conducted within the past 12 years identified 78 studies that met the inclusion criteria. Based on the RE-AIM criteria, most of the studies focused on elements of internal validity (e.g., sample size, intervention location and efficacy/effectiveness) with minimal reporting of external validity indicators (e.g., representativeness of participants, start-up costs, protocol fidelity and sustainability). Results of this RE-AIM review emphasize the need for future PA interventions in children to report on real-world challenges and limitations, and to highlight considerations for translating evidence-based results into health promotion practice. Copyright © 2015 Elsevier Inc. All rights reserved.
Evidence-based medicine for every day, everyone, and every therapeutic study.
Govindarajan, Raghav; Narayanaswami, Pushpa
2018-04-17
The rapid growth in published medical literature makes it difficult for clinicians to keep up with advances in their fields. This may result in a cursory scan of the abstract and conclusion of a study without critically evaluating study quality. The application of evidence-based medicine (EBM) is the process of converting the abstract task of reading the literature into a practical method of using the literature to inform care in a specific clinical context while simultaneously expanding one's knowledge. EBM involves 4 steps: (1) stating the clinical problem in a defined question; (2) searching the literature for the evidence; (3) critically appraising the evidence for its validity; and (4) applying the evidence in the context of the patient's situation, preferences, and values. In this review, we use the recently published trial of thymectomy in myasthenia gravis as an example and systematically go through the steps of assessing internal validity, precision, and external validity. Muscle Nerve, 2018. © 2018 Wiley Periodicals, Inc.
Bianchi, Lorenzo; Schiavina, Riccardo; Borghesi, Marco; Bianchi, Federico Mineo; Briganti, Alberto; Carini, Marco; Terrone, Carlo; Mottrie, Alex; Gacci, Mauro; Gontero, Paolo; Imbimbo, Ciro; Marchioro, Giansilvio; Milanese, Giulio; Mirone, Vincenzo; Montorsi, Francesco; Morgia, Giuseppe; Novara, Giacomo; Porreca, Angelo; Volpe, Alessandro; Brunocilla, Eugenio
2018-04-06
To assess the predictive accuracy and the clinical value of a recent nomogram predicting cancer-specific mortality-free survival after surgery in pN1 prostate cancer patients through an external validation. We evaluated 518 prostate cancer patients treated with radical prostatectomy and pelvic lymph node dissection with evidence of nodal metastases at final pathology, at 10 tertiary centers. External validation was carried out using regression coefficients of the previously published nomogram. The performance characteristics of the model were assessed by quantifying predictive accuracy, according to the area under the curve in the receiver operating characteristic curve and model calibration. Furthermore, we systematically analyzed the specificity, sensitivity, positive predictive value and negative predictive value for each nomogram-derived probability cut-off. Finally, we implemented decision curve analysis, in order to quantify the nomogram's clinical value in routine practice. External validation showed inferior predictive accuracy compared with the internal validation (65.8% vs 83.3%, respectively). The discrimination (area under the curve) of the multivariable model was 66.7% (95% CI 60.1-73.0%) by testing with receiver operating characteristic curve analysis. The calibration plot showed an overestimation throughout the range of predicted cancer-specific mortality-free survival probabilities. However, in decision curve analysis, the nomogram's use showed a net benefit when compared with the scenarios of treating all patients or none. In an external setting, the nomogram showed inferior predictive accuracy and suboptimal calibration characteristics compared with those reported in the original population. However, decision curve analysis showed a clinical net benefit, suggesting a clinical implication to correctly manage pN1 prostate cancer patients after surgery. © 2018 The Japanese Urological Association.
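The "net benefit" compared against treat-all and treat-none in decision curve analysis can be sketched as follows. This is an illustration only, assuming a binary outcome rather than the survival outcome studied above; the function names and the ten hypothetical patients are not taken from the study. At a threshold probability p_t, net benefit is TP/n minus (FP/n) times p_t/(1 - p_t), and treating no one has a net benefit of zero.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = event) and model-predicted risks for 10 patients.
y_true = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
y_prob = [0.8, 0.6, 0.3, 0.1, 0.2, 0.4, 0.7, 0.1, 0.2, 0.3]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
nb_none = 0.0  # treating no one yields neither true nor false positives
print(nb_model, nb_all, nb_none)
```

A decision curve repeats this calculation over a range of thresholds. A model shows clinical value where its curve lies above both the treat-all and treat-none lines.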
The Interpersonal Shame Inventory for Asian Americans: Scale Development and Psychometric Properties
Wong, Y. Joel; Kim, Bryan S. K.; Nguyen, Chi P.; Cheng, Janice Ka Yan; Saw, Anne
2016-01-01
This article reports the development and psychometric properties of the Interpersonal Shame Inventory (ISI), a culturally salient and clinically relevant measure of interpersonal shame for Asian Americans. Across 4 studies involving Asian American college students, the authors provided evidence for this new measure’s validity and reliability. Exploratory factor analyses and confirmatory factor analyses provided support for a model with 2 correlated factors: external shame (arising from concerns about others’ negative evaluations) and family shame (arising from perceptions that one has brought shame to one’s family), corresponding to 2 subscales: ISI-E and ISI-F, respectively. Evidence for criterion-related, concurrent, discriminant, and incremental validity was demonstrated by testing the associations between external shame and family shame and immigration/international status, generic state shame, face concerns, thwarted belongingness, perceived burdensomeness, self-esteem, depressive symptoms, and suicide ideation. External shame and family shame also exhibited differential relations with other variables. Mediation findings were consistent with a model in which family shame mediated the effects of thwarted belongingness on suicide ideation. Further, the ISI subscales demonstrated high alpha coefficients and test–retest reliability. These findings are discussed in light of the conceptual, methodological, and clinical contributions of the ISI. PMID:24188650
Ranapurwala, Shabbar I; Naumann, Rebecca B; Austin, Anna E; Dasgupta, Nabarun; Marshall, Stephen W
2018-06-03
The ongoing opioid epidemic has claimed more than a quarter million Americans' lives over the past 15 years. The epidemic began with an escalation of prescription opioid deaths and has now evolved to include secondary waves of illicit heroin and fentanyl deaths, while the deaths due to prescription opioid overdoses are still increasing. In response, the Centers for Disease Control and Prevention (CDC) moved to limit opioid prescribing with the release of opioid prescribing guidelines for chronic noncancer pain in March 2016. The guidelines represent a logical and timely federal response to this growing crisis. However, CDC acknowledged that the evidence base linking opioid prescribing to opioid use disorders and overdose was grades 3 and 4. Motivated by the need to strengthen the evidence base, this review details limitations of the opioid safety studies cited in the CDC guidelines with a focus on methodological limitations related to internal and external validity. Internal validity concerns were related to poor confounding control, variable misclassification, selection bias, competing risks, and potential competing interventions. External validity concerns arose from the use of limited source populations, historical data (in a fast-changing epidemic), and issues with handling of cancer and acute pain patients' data. We provide a nonexhaustive list of 7 recommendations to address these limitations in future opioid safety studies. Strengthening the opioid safety evidence base will aid any future revisions of the CDC guidelines and enhance their prevention impact. Copyright © 2018 John Wiley & Sons, Ltd.
Preliminary Validity of the Eyberg Child Behavior Inventory With Filipino Immigrant Parents
Coffey, Dean M.; Javier, Joyce R.; Schrager, Sheree M.
2016-01-01
Filipinos are an understudied minority affected by significant behavioral health disparities. We evaluate evidence for the reliability, construct validity, and convergent validity of the Eyberg Child Behavior Inventory (ECBI) in 6- to 12-year-old Filipino children (N = 23). ECBI scores demonstrated high internal consistency, supporting a single-factor model (pre-intervention α = .91; post-intervention α = .95). Results document convergent validity with the Child Behavior Checklist Externalizing scale at pretest (r = .54, p < .01) and posttest (r = .71, p < .001). We conclude that the ECBI is a promising tool to measure behavior problems in Filipino children. PMID:27087739
Gaus, Wilhelm; Muche, Rainer
2013-05-01
Clinical studies provide formalised experience for evidence-based medicine (EBM). Many people consider a controlled randomised trial (CRT, identical to a randomised controlled trial, RCT) to be the non-plus-ultra design. However, CRTs also have limitations. The problem is not randomisation itself but informed consent for randomisation and masking of therapies according to today's legal and ethical standards. We do not want to de-rate CRTs, but we would like to contribute to the discussion on clinical research methodology. Informed consent to a CRT and masking of therapies plainly select patients. The excellent internal validity of CRTs can be counterbalanced by poor external validity, because internal and external validity act as antagonists. In a CRT, patients may feel like guinea pigs; this can decrease compliance, cause protocol violations, reduce self-healing properties, suppress unspecific therapeutic effects and possibly even modify specific efficacy. A control group (comparative study) is most important for the degree of evidence achieved by a trial. Study control by detailed protocol and good clinical practice (controlled study) is second in importance and randomisation and masking is third (thus the sequence CRT instead of RCT). Controlled non-randomised trials are just as ambitious and detailed as CRTs. We recommend that clinicians and biometricians take high quality controlled non-randomised trials into consideration more often. They combine good internal and external validity, better suit daily medical practice, show better patient compliance and fewer protocol violations, deliver estimators unbiased by alienated patients, and perhaps provide a clearer explanation of the achieved success. Copyright © 2013 Elsevier Inc. All rights reserved.
Houdek, Petr
2017-01-01
The aim of this perspective article is to show that current experimental evidence on factors influencing dishonesty has limited external validity. Most experimental studies are built on random assignments, in which control/experimental groups of subjects face varied sizes of the expected reward for behaving dishonestly, opportunities for cheating, means of rationalizing dishonest behavior etc., and mean groups' reactions are observed. The studies have internal validity in assessing the causal influence of these and other factors, but they lack external validity in organizational, market and other environments. If people can opt into or out of diverse real-world environments, an experiment aimed at studying factors influencing the real-life degree of dishonesty should permit such an option. The behavior of such self-selected groups of marginal subjects would probably contain a larger level of (non)deception than the behavior of average people. The article warns that there are not many studies that would enable self-selection or sorting of participants into varying environments, and that limits current knowledge of the extent and dynamics of dishonest and fraudulent behavior. The article focuses on suggestions for how to improve dishonesty research, especially how to avoid the experimenter demand bias. PMID:28955279
Quasi-experimental study designs series-paper 4: uses and value.
Bärnighausen, Till; Tugwell, Peter; Røttingen, John-Arne; Shemilt, Ian; Rockers, Peter; Geldsetzer, Pascal; Lavis, John; Grimshaw, Jeremy; Daniels, Karen; Brown, Annette; Bor, Jacob; Tanner, Jeffery; Rashidian, Arash; Barreto, Mauricio; Vollmer, Sebastian; Atun, Rifat
2017-09-01
Quasi-experimental studies are increasingly used to establish causal relationships in epidemiology and health systems research. Quasi-experimental studies offer important opportunities to increase and improve evidence on causal effects: (1) they can generate causal evidence when randomized controlled trials are impossible; (2) they typically generate causal evidence with a high degree of external validity; (3) they avoid the threats to internal validity that arise when participants in nonblinded experiments change their behavior in response to the experimental assignment to either intervention or control arm (such as compensatory rivalry or resentful demoralization); (4) they are often well suited to generate causal evidence on long-term health outcomes of an intervention, as well as nonhealth outcomes such as economic and social consequences; and (5) they can often generate evidence faster and at lower cost than experiments and other intervention studies. Copyright © 2017 Elsevier Inc. All rights reserved.
Longitudinal Stability of Phonological and Surface Subtypes of Developmental Dyslexia
ERIC Educational Resources Information Center
Peterson, Robin L.; Pennington, Bruce F.; Olson, Richard K.; Wadsworth, Sally J.
2014-01-01
Limited evidence supports the external validity of the distinction between developmental phonological and surface dyslexia. We previously identified children ages 8 to 13 meeting criteria for these subtypes (Peterson, Pennington, & Olson, 2013) and now report on their reading and related skills approximately 5 years later. Longitudinal…
Reconceptualising the external validity of discrete choice experiments.
Lancsar, Emily; Swait, Joffre
2014-10-01
External validity is a crucial but under-researched topic when considering using discrete choice experiment (DCE) results to inform decision making in clinical, commercial or policy contexts. We present the theory and tests traditionally used to explore external validity that focus on a comparison of final outcomes and review how this traditional definition has been empirically tested in health economics and other sectors (such as transport, environment and marketing) in which DCE methods are applied. While an important component, we argue that the investigation of external validity should be much broader than a comparison of final outcomes. In doing so, we introduce a new and more comprehensive conceptualisation of external validity, closely linked to process validity, that moves us from the simple characterisation of a model as being or not being externally valid on the basis of predictive performance, to the concept that external validity should be an objective pursued from the initial conceptualisation and design of any DCE. We discuss how such a broader definition of external validity can be fruitfully used and suggest innovative ways in which it can be explored in practice.
Lefering, R; Tecic, T; Schmidt, Y; Pirente, N; Bouillon, B; Neugebauer, E
2012-08-01
Due to an increasing number of survivors after multiple injuries in Western countries, the health-related quality of life (QoL) is considered to be an important outcome parameter. Up to now, measuring instruments used in this field lacked validity and comparability. Within 6 years, our working group developed a new modular instrument, called the Polytrauma Outcome (POLO) chart. This study documents the validation of the trauma-specific module specifically designed for trauma patients, the Trauma Outcome Profile (TOP). A total of 172 multiply injured patients (mean Injury Severity Score [ISS] 26.7) recruited from eight trauma centres participating in the German Trauma Registry were compared with 166 marginally injured patients (mean ISS 3.9). The mean follow-up was 24.2 and 26.4 months, respectively. The validation questionnaires used were the Beck Depression Inventory (BDI), the State-Trait Anxiety Inventory (STAI), Impact of Event Scale-Revised (IES-R), Social Support Questionnaire (F-SOZU-K-22), Barthel Index of Activities of Daily Living (ADL) and the Short Form Health Survey (SF-36). The internal consistency of the different dimensions of QoL assessed with the TOP was good. Factor analysis provides evidence of the construct validity of the questionnaire. Correlation with external measures gives evidence of criterion validity for the various dimensions of QoL and similar exceedance of proposed cut-off points within TOP and external measures is verified. The TOP module is a reliable and valid instrument to assess health-related QoL in patients with multiple injuries. It can be used stand-alone or as part of the POLO chart together with the Glasgow Outcome Scale (GOS), the EuroQoL and the SF-36 as a regular systematic follow-up instrument.
Female orgasm(s): one, two, several.
Jannini, Emmanuele A; Rubio-Casillas, Alberto; Whipple, Beverly; Buisson, Odile; Komisaruk, Barry R; Brody, Stuart
2012-04-01
There is general agreement that it is possible to have an orgasm through the direct stimulation of the external clitoris. In contrast, the possibility of achieving climax during penetration has been controversial. Six scientists with different experimental evidence debate the existence of the vaginally activated orgasm (VAO). To give readers of The Journal of Sexual Medicine sufficient data to form her/his own opinion on an important topic of female sexuality. Expert #1, the Controversy's section Editor, together with Expert #2, reviewed data from the literature demonstrating the anatomical possibility for the VAO. Expert #3 presents validating women's reports of pleasurable sexual responses and adaptive significance of the VAO. Echographic dynamic evidence induced Expert #4 to describe one single orgasm, obtained from stimulation of either the external or internal clitoris, during penetration. Expert #5 reviewed his elegant experiments showing the uniquely different sensory responses to clitoral, vaginal, and cervical stimulation. Finally, the last Expert presented findings on the psychological scenario behind VAO. The assumption that women may experience only the clitoral, external orgasm is not based on the best available scientific evidence. © 2012 International Society for Sexual Medicine.
[Spanish adaptation of the Stress Manifestations Scale of the Student Stress Inventory (SSI-SM)].
Escobar Espejo, Milagros; Blanca, María J; Fernández-Baena, F Javier; Trianes Torres, María Victoria
2011-08-01
The aim of the present study was to translate into Spanish and to describe the psychometric properties of the Stress Manifestations Scale of the Student Stress Inventory (SSI-SM), developed by Fimian, Fastenau, Tashner and Cross to identify the main manifestations of stress in adolescents. The scale was applied to a sample of 1,002 pupils from years one and two of Secondary Education. The paper reports the factor structure, an item analysis, the internal consistency, differences by sex and academic year, external evidence of validity, and norms for scoring the scale. The results reveal a factor structure based on three first-order factors (emotional manifestations, physiological manifestations and behavioural manifestations) and one second-order factor (indicative of stress manifestations). In terms of external validity, there was a positive association with measures of perceived stress, aggressiveness, internalized/externalized symptoms, and a negative association with life satisfaction. The results show that the scale is an adequate tool for evaluating stress manifestations in adolescents.
The Preschool Learning Behaviors Scale: Dimensionality and External Validity in Head Start
ERIC Educational Resources Information Center
McDermott, Paul A.; Rikoon, Samuel H.; Waterman, Clare; Fantuzzo, John W.
2012-01-01
Given the importance of accurately gauging early childhood approaches to learning, this study reports evidence for the dimensionality and utility of the Preschool Learning Behaviors Scale for use with disadvantaged preschool children. Data from a large (N = 1,666) sample representative of urban Head Start classrooms revealed three reliable…
Psychometric Evidence of SRSS-IE Scores in Middle and High Schools
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Cantwell, Emily D.; Menzies, Holly Mariah; Schatschneider, Christopher; Lambert, Warren; Common, Eric Alan
2017-01-01
We report results of an exploratory validation study of the "Student Risk Screening Scale-Internalizing and Externalizing" (SRSS-IE) applied with the first sample of middle and high school students from nine middle and three high schools from three states. The "Student Risk Screening Scale" (SRSS) was modified to broaden the…
What Works Clearinghouse Standards and Generalization of Single-Case Design Evidence
ERIC Educational Resources Information Center
Hitchcock, John H.; Kratochwill, Thomas R.; Chezan, Laura C.
2015-01-01
A recent review of existing rubrics designed to help researchers evaluate the internal and external validity of single-case design (SCD) studies found that the various options yield consistent results when examining causal arguments. The authors of the review, however, noted considerable differences across the rubrics when addressing the…
Measuring Perceived Barriers to Physical Activity in Adolescents.
Gunnell, Katie E; Brunet, Jennifer; Wing, Erin K; Bélanger, Mathieu
2015-05-01
Perceived barriers to moderate-to-vigorous physical activity (PA) may contribute to the low rates of moderate-to-vigorous PA in adolescents. We examined the psychometric properties of scores from the perceived barriers to moderate-to-vigorous PA scale (PB-MVPA) by examining composite reliability and validity evidence based on the internal structure of the PB-MVPA and relations with other variables. This study was a cross-sectional analysis of data collected in 2013 from adolescents (N = 507; mean age = 12.40, SD = .62) via self-report scales. Using exploratory and confirmatory factor analyses, we found that perceived barriers were best represented as two factors representing internal (e.g., "I am not interested in physical activity") and external (e.g., "I need equipment I don't have") dimensions. Composite reliability was over .80. Using multiple regression to examine the relationship between perceived barriers and moderate-to-vigorous PA, we found that perceived internal barriers were inversely related to moderate-to-vigorous PA (β = -.32, p < .05). Based on the results of analyses of variance, there were no known-group sex differences for perceived internal and external barriers (p > .26). The PB-MVPA scale demonstrated evidence of score reliability and validity. To improve the understanding of the impact of perceived barriers on moderate-to-vigorous PA in adolescents, researchers should examine internal and external barriers separately.
Walach, Harald; Falkenberg, Torkel; Fønnebø, Vinjar; Lewith, George; Jonas, Wayne B
2006-01-01
Background: The reasoning behind evaluating medical interventions is that a hierarchy of methods exists which successively produce improved and therefore more rigorous evidence based medicine upon which to make clinical decisions. At the foundation of this hierarchy are case studies, retrospective and prospective case series, followed by cohort studies with historical and concomitant non-randomized controls. Open-label randomized controlled studies (RCTs) and, finally, blinded, placebo-controlled RCTs, which offer the most internal validity, are considered the most reliable evidence. Rigorous RCTs remove bias. Evidence from RCTs forms the basis of meta-analyses and systematic reviews. This hierarchy, founded on a pharmacological model of therapy, is generalized to other interventions which may be complex and non-pharmacological (healing, acupuncture and surgery). Discussion: The hierarchical model is valid for limited questions of efficacy, for instance for regulatory purposes and newly devised products and pharmacological preparations. It is inadequate for the evaluation of complex interventions such as physiotherapy, surgery and complementary and alternative medicine (CAM). This has to do with the essential tension between internal validity (rigor and the removal of bias) and external validity (generalizability). Summary: Instead of an Evidence Hierarchy, we propose a Circular Model. This would imply a multiplicity of methods, using different designs, counterbalancing their individual strengths and weaknesses to arrive at pragmatic but equally rigorous evidence which would provide significant assistance in clinical and health systems innovation. Such evidence would better inform national health care technology assessment agencies and promote evidence based health reform. PMID:16796762
ERIC Educational Resources Information Center
Hartman, Kelsey; Gresham, Frank M.; Byrd, Shelby
2017-01-01
Universal screening for emotional and behavioral risk in schools facilitates early identification and intervention for students as part of multitiered systems of support. Early identification has the potential to mitigate adverse outcomes of emotional and behavioral disorders. The purpose of this study was to extend existing research on the…
Study design elements for rigorous quasi-experimental comparative effectiveness research.
Maciejewski, Matthew L; Curtis, Lesley H; Dowd, Bryan
2013-03-01
Quasi-experiments are likely to be the workhorse study design used to generate evidence about the comparative effectiveness of alternative treatments, because of their feasibility, timeliness, affordability and external validity compared with randomized trials. In this review, we outline potential sources of discordance in results between quasi-experiments and experiments, review study design choices that can improve the internal validity of quasi-experiments, and outline innovative data linkage strategies that may be particularly useful in quasi-experimental comparative effectiveness research. There is an urgent need to resolve the debate about the evidentiary value of quasi-experiments since equal consideration of rigorous quasi-experiments will broaden the base of evidence that can be brought to bear in clinical decision-making and governmental policy-making.
Strikwerda-Brown, Cherie; Mothakunnel, Annu; Hodges, John R; Piguet, Olivier; Irish, Muireann
2018-04-24
Autobiographical memory (ABM) is typically held to comprise episodic and semantic elements, with the vast majority of studies to date focusing on profiles of episodic details in health and disease. In this context, 'non-episodic' elements are often considered to reflect semantic processing or are discounted from analyses entirely. Mounting evidence suggests that rather than reflecting one unitary entity, semantic autobiographical information may contain discrete subcomponents, which vary in their relative degree of semantic or episodic content. This study aimed to (1) review the existing literature to formally characterize the variability in analysis of 'non-episodic' content (i.e., external details) on the Autobiographical Interview and (2) use these findings to create a theoretically grounded framework for coding external details. Our review exposed discrepancies in the reporting and interpretation of external details across studies, reinforcing the need for a new, consistent approach. We validated our new external details scoring protocol (the 'NExt' taxonomy) in patients with Alzheimer's disease (n = 18) and semantic dementia (n = 13), and 20 healthy older Control participants and compared profiles of the NExt subcategories across groups and time periods. Our results revealed increased sensitivity of the NExt taxonomy in discriminating between ABM profiles of patient groups, when compared to traditionally used internal and external detail metrics. Further, remote and recent autobiographical memories displayed distinct compositions of the NExt detail types. This study is the first to provide a fine-grained and comprehensive taxonomy to parse external details into intuitive subcategories and to validate this protocol in neurodegenerative disorders. © 2018 The British Psychological Society.
Nour, Monica; Chen, Juliana; Allman-Farinelli, Margaret
2016-04-08
Young adults (18-35 years) remain among the lowest vegetable consumers in many western countries. The digital era offers opportunities to engage this age group in interventions in new and appealing ways. This systematic review evaluated the efficacy and external validity of electronic (eHealth) and mobile phone (mHealth)-based interventions that promote vegetable intake in young adults. We searched several electronic databases for studies published between 1990 and 2015, and 2 independent authors reviewed the quality and risk of bias of the eligible papers and extracted data for analyses. The primary outcome of interest was the change in vegetable intake postintervention. Where possible, we calculated effect sizes (Cohen d and 95% CIs) for comparison. A random effects model was applied to the data for meta-analysis. Reach and representativeness of participants, intervention implementation, and program maintenance were assessed to establish external validity. Published validation studies were consulted to determine the validity of tools used to measure intake. We applied the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system to evaluate the overall quality of the body of evidence. Of the 14 studies that met the selection criteria, we included 12 in the meta-analysis. In the meta-analysis, 7 studies found positive effects postintervention for fruit and vegetable intake, Cohen d 0.14-0.56 (pooled effect size 0.22, 95% CI 0.11-0.33, I² = 68.5%, P = .002), and 4 recorded positive effects on vegetable intake alone, Cohen d 0.11-0.40 (pooled effect size 0.15, 95% CI 0.04-0.28, I² = 31.4%, P = .2). These findings should be interpreted with caution due to variability in intervention design and outcome measures. With the majority of outcomes documented as a change in combined fruit and vegetable intake, it was difficult to determine intervention effects on vegetable consumption specifically.
Measurement of intake was most commonly by self-report, with 5 studies using nonvalidated tools. Longer-term follow-up was lacking from most studies (n=12). Risk of bias was high among the included studies, and the overall body of evidence was rated as low quality. The applicability of interventions to the broader young adult community was unclear due to poor description of external validity components. Preliminary evidence suggests that eHealth and mHealth strategies may be effective in improving vegetable intake in young adults; whether these small effects have clinical or nutritional significance remains questionable. With studies predominantly reporting outcomes as fruit and vegetable intake combined, we suggest that interventions report vegetables separately. Furthermore, to confidently establish the efficacy of these strategies, better-quality interventions are needed for young adults, using valid measures of intake, with improved reporting on costs, sustainability and long-term effects of programs. PROSPERO International Prospective Register of Systematic Reviews: CRD42015017763; http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42015017763 (Archived by WebCite at http://www.webcitation.org/6fLhMgUP4).
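The random-effects pooling and the I² heterogeneity statistic mentioned in the abstract above can be sketched as follows. This is a minimal illustration only: it assumes the DerSimonian-Laird estimator, which the abstract does not name, and the effect sizes and variances are hypothetical values chosen for the example, not data from the review.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effect sizes with the DerSimonian-Laird random-effects estimator.

    effects   -- per-study effect sizes (e.g. Cohen d)
    variances -- per-study sampling variances of those effect sizes
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance tau^2, truncated at zero
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # I^2: share of total variation attributed to heterogeneity, in percent
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical Cohen d values and sampling variances (not from the review)
example_effects = [0.14, 0.30, 0.56, 0.20]
example_variances = [0.010, 0.020, 0.040, 0.015]
example_result = random_effects_pool(example_effects, example_variances)
```

Truncating tau² at zero means that, when studies agree closely, the result reduces to the ordinary fixed-effect inverse-variance estimate.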
Musical Preferences Predict Personality: Evidence From Active Listening and Facebook Likes.
Nave, Gideon; Minxha, Juri; Greenberg, David M; Kosinski, Michal; Stillwell, David; Rentfrow, Jason
2018-03-01
Research over the past decade has shown that various personality traits are communicated through musical preferences. One limitation of that research is external validity, as most studies have assessed individual differences in musical preferences using self-reports of music-genre preferences. Are personality traits communicated through behavioral manifestations of musical preferences? We addressed this question in two large-scale online studies with demographically diverse populations. Study 1 (N = 22,252) shows that reactions to unfamiliar musical excerpts predicted individual differences in personality, most notably openness and extraversion, above and beyond demographic characteristics. Moreover, these personality traits were differentially associated with particular music-preference dimensions. The results from Study 2 (N = 21,929) replicated and extended these findings by showing that an active measure of naturally occurring behavior, Facebook Likes for musical artists, also predicted individual differences in personality. In general, our findings establish the robustness and external validity of the links between musical preferences and personality.
ERIC Educational Resources Information Center
Schünemann, Holger J.; Tugwell, Peter; Reeves, Barnaby C.; Akl, Elie A.; Santesso, Nancy; Spencer, Frederick A.; Shea, Beverley; Wells, George; Helfand, Mark
2013-01-01
The terms applicability, generalizability, external validity and transferability are related, sometimes used interchangeably and have in common that they lack a clear and consistent definition in the classic epidemiological literature. However, all of these terms generally describe one overarching theme: whether or not available research evidence…
ERIC Educational Resources Information Center
Jonsson, Ulf; Olsson, Nora Choque; Bölte, Sven
2016-01-01
Systematic reviews have traditionally focused on internal validity, while external validity often has been overlooked. In this study, we systematically reviewed determinants of external validity in the accumulated randomized controlled trials of social skills group interventions for children and adolescents with autism spectrum disorder. We…
ERIC Educational Resources Information Center
Steinfatt, Thomas M.
1991-01-01
Responds to an article in the same issue of this journal which defends the applied value of laboratory studies to managers. Agrees that external validity is often irrelevant, and maintains that the problem of making inferences from any subject sample in management communication is one that demands internal, not external, validity. (SR)
Kim, Jeong-Eon; Park, Eun-Jun
2015-04-01
The purpose of this study was to validate the Korean version of the Ethical Leadership at Work questionnaire (K-ELW) that measures RNs' perceived ethical leadership of their nurse managers. The strong validation process suggested by Benson (1998), including translation and cultural adaptation stage, structural stage, and external stage, was used. Participants were 241 RNs who reported their perceived ethical leadership using both the pre-version of K-ELW and a previously known Ethical Leadership Scale, and interactional justice of their managers, as well as their own demographics, organizational commitment and organizational citizenship behavior. Data analyses included descriptive statistics, Pearson correlation coefficients, reliability coefficients, exploratory factor analysis, and confirmatory factor analysis. SPSS 19.0 and Amos 18.0 versions were used. A modified K-ELW was developed from construct validity evidence and included 31 items in 7 domains: People orientation, task responsibility fairness, relationship fairness, power sharing, concern for sustainability, ethical guidance, and integrity. Convergent validity, discriminant validity, and concurrent validity were supported according to the correlation coefficients of the 7 domains with other measures. The results of this study provide preliminary evidence that the modified K-ELW can be adopted in Korean nursing organizations, and reliable and valid ethical leadership scores can be expected.
Hickey, Graeme L; Blackstone, Eugene H
2016-08-01
Clinical risk-prediction models serve an important role in healthcare. They are used for clinical decision-making and measuring the performance of healthcare providers. To establish confidence in a model, external model validation is imperative. When designing such an external model validation study, thought must be given to patient selection, risk factor and outcome definitions, missing data, and the transparent reporting of the analysis. In addition, there are a number of statistical methods available for external model validation. Execution of a rigorous external validation study rests in proper study design, application of suitable statistical methods, and transparent reporting. Copyright © 2016 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
External Standards or Standard Addition? Selecting and Validating a Method of Standardization
NASA Astrophysics Data System (ADS)
Harvey, David T.
2002-05-01
A common feature of many problem-based laboratories in analytical chemistry is a lengthy independent project involving the analysis of "real-world" samples. Students research the literature, adapting and developing a method suitable for their analyte, sample matrix, and problem scenario. Because these projects encompass the complete analytical process, students must consider issues such as obtaining a representative sample, selecting a method of analysis, developing a suitable standardization, validating results, and implementing appropriate quality assessment/quality control practices. Most textbooks and monographs suitable for an undergraduate course in analytical chemistry, however, provide only limited coverage of these important topics. The need for short laboratory experiments emphasizing important facets of method development, such as selecting a method of standardization, is evident. The experiment reported here, which is suitable for an introductory course in analytical chemistry, illustrates the importance of matrix effects when selecting a method of standardization. Students also learn how a spike recovery is used to validate an analytical method, and obtain a practical experience in the difference between performing an external standardization and a standard addition.
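The difference between an external standardization and a standard addition, and the use of a spike recovery, can be illustrated with a short calculation. This is a sketch under stated assumptions, not the procedure from the experiment: the function names are hypothetical, the response is assumed linear, added concentrations are expressed in the final measured solution, and the numbers are synthetic (a matrix that lowers sensitivity from 0.10 to 0.08 signal units per concentration unit).

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def external_standard(sample_signal, std_conc, std_signal):
    """Read the sample concentration off a calibration curve built from
    standards prepared outside the sample matrix."""
    slope, intercept = fit_line(std_conc, std_signal)
    return (sample_signal - intercept) / slope


def standard_addition(added_conc, signals):
    """Estimate the sample concentration from signals measured after spiking
    the sample itself; the estimate is intercept / slope, i.e. the magnitude
    of the x-intercept of the fitted line."""
    slope, intercept = fit_line(added_conc, signals)
    return intercept / slope


def spike_recovery(spiked_result, unspiked_result, spike_added):
    """Percent of a known spike that the method recovers."""
    return 100.0 * (spiked_result - unspiked_result) / spike_added


# Synthetic data: true sample concentration is 2.0 (arbitrary units).
std_conc = [0.0, 1.0, 2.0, 3.0, 4.0]
std_signal = [0.10 * c for c in std_conc]              # matrix-free sensitivity
added = [0.0, 1.0, 2.0, 3.0]
matrix_signal = [0.08 * (2.0 + a) for a in added]      # suppressed sensitivity

by_external = external_standard(matrix_signal[0], std_conc, std_signal)
by_addition = standard_addition(added, matrix_signal)

# Spike recovery for the external-standard method: spike of 1.0 added
spiked_by_external = external_standard(matrix_signal[1], std_conc, std_signal)
recovery = spike_recovery(spiked_by_external, by_external, 1.0)
```

With these synthetic numbers the external standardization underestimates the sample because the calibration slope does not match the in-matrix slope, while the standard addition recovers the true value; the spike recovery falls below 100% and flags the matrix effect.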
Underreporting on the MMPI-2-RF in a high-demand police officer selection context: an illustration.
Detrick, Paul; Chibnall, John T
2014-09-01
Positive response distortion is common in the high-demand context of employment selection. This study examined positive response distortion, in the form of underreporting, on the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF). Police officer job applicants completed the MMPI-2-RF under high-demand and low-demand conditions, once during the preemployment psychological evaluation and once without contingencies after completing the police academy. Demand-related score elevations were evident on the Uncommon Virtues (L-r) and Adjustment Validity (K-r) scales. Underreporting was evident on the Higher-Order scales Emotional/Internalizing Dysfunction and Behavioral/Externalizing Dysfunction; 5 of 9 Restructured Clinical scales; 6 of 9 Internalizing scales; 3 of 4 Externalizing scales; and 3 of 5 Personality Psychopathology 5 scales. Regression analyses indicated that L-r predicted demand-related underreporting on behavioral/externalizing scales, and K-r predicted underreporting on emotional/internalizing scales. Select scales of the MMPI-2-RF are differentially associated with different types of underreporting among police officer applicants. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Myers, Tony; Balmer, Nigel
2012-01-01
Numerous factors have been proposed to explain the home advantage in sport. Several authors have suggested that a partisan home crowd enhances home advantage and that this is at least in part a consequence of their influence on officiating. However, while experimental studies examining this phenomenon have high levels of internal validity (since only the “crowd noise” intervention is allowed to vary), they suffer from a lack of external validity, with decision-making in a laboratory setting typically bearing little resemblance to decision-making in live sports settings. Conversely, observational and quasi-experimental studies with high levels of external validity suffer from low levels of internal validity as countless factors besides crowd noise vary. The present study provides a unique opportunity to address these criticisms, by conducting a controlled experiment on the impact of crowd noise on officiating in a live tournament setting. Seventeen qualified judges officiated on thirty Thai boxing bouts in a live international tournament setting featuring “home” and “away” boxers. In each bout, judges were randomized into a “noise” (live sound) or “no crowd noise” (noise-canceling headphones and white noise) condition, resulting in 59 judgments in the “no crowd noise” and 61 in the “crowd noise” condition. The results provide the first experimental evidence of the impact of live crowd noise on officials in sport. A cross-classified statistical model indicated that crowd noise had a statistically significant impact, equating to just over half a point per bout (in the context of five round bouts with the “10-point must” scoring system shared with professional boxing). The practical significance of the findings, their implications for officiating and for the future conduct of crowd noise studies are discussed. PMID:23049520
Motor Imagery and Tennis Serve Performance: The External Focus Efficacy
Guillot, Aymeric; Desliens, Simon; Rouyer, Christelle; Rogowski, Isabelle
2013-01-01
There is now ample evidence that motor imagery (MI) helps enhance motor performance. Previous research also demonstrated that directing athletes’ attention to the effects of their movements on the environment is more effective than focusing on the action per se. The present study therefore aimed to evaluate whether adopting an external focus during MI helps enhance tennis serve performance. Twelve high-level young tennis players were included in a test-retest procedure. The effects of regular training were first evaluated. Then, players were subjected to a MI intervention during which they mentally focused on ball trajectory and specifically visualized the space above the net where the serve can be successfully hit. Serve performance was evaluated during both a validated serve test and a real match. The main results showed a significant increase in accuracy and velocity during the ecological serve test after MI practice, as well as a significant improvement in successful first serves and won points during the match. Present data therefore confirmed the efficacy of MI in combination with physical practice to improve tennis serve performance, and further provided evidence that it is feasible to adopt an external attentional focus during MI. Practical applications are discussed. Key Points: Motor imagery helps enhance tennis serve performance. Data provided evidence of the benefits of adopting an external focus of attention during imagery. Results showed significant improvement in successful first serves and won points during a real match. PMID:24149813
Causadias, José M.; Salvatore, Jessica E.; Sroufe, L. Alan
2012-01-01
The present study examines two childhood markers of self-regulation, ego-control and ego-resiliency, as promotive factors for the development of global adjustment and as risk factors for the development of internalizing and externalizing behavior problems in a high-risk sample. Teachers and observers rated ego-control and ego-resiliency when participants (n = 136) were in preschool and elementary school. Ratings showed evidence for convergent and discriminant validity and stability over time. Ego-resiliency, but not ego-control, emerged as a powerful predictor of adaptive functioning at ages 19 and 26, as well as internalizing and externalizing problems at 16, 23, 26, and 32 years. We interpret these findings as evidence that flexibility and adaptability (measured with ego-resiliency) may reduce risk and promote successful adaptation in low-SES environments. PMID:23155299
Adderley, N J; Mallett, S; Marshall, T; Ghosh, S; Rayman, G; Bellary, S; Coleman, J; Akiboye, F; Toulis, K A; Nirantharakumar, K
2018-06-01
To temporally and externally validate our previously developed prediction model, which used data from University Hospitals Birmingham to identify inpatients with diabetes at high risk of adverse outcome (mortality or excessive length of stay), in order to demonstrate its applicability to other hospital populations within the UK. Temporal validation was performed using data from University Hospitals Birmingham and external validation was performed using data from both the Heart of England NHS Foundation Trust and Ipswich Hospital. All adult inpatients with diabetes were included. Variables included in the model were age, gender, ethnicity, admission type, intensive therapy unit admission, insulin therapy, albumin, sodium, potassium, haemoglobin, C-reactive protein, estimated GFR and neutrophil count. Adverse outcome was defined as excessive length of stay or death. Model discrimination in the temporal and external validation datasets was good. In temporal validation using data from University Hospitals Birmingham, the area under the curve was 0.797 (95% CI 0.785-0.810), sensitivity was 70% (95% CI 67-72) and specificity was 75% (95% CI 74-76). In external validation using data from Heart of England NHS Foundation Trust, the area under the curve was 0.758 (95% CI 0.747-0.768), sensitivity was 73% (95% CI 71-74) and specificity was 66% (95% CI 65-67). In external validation using data from Ipswich, the area under the curve was 0.736 (95% CI 0.711-0.761), sensitivity was 63% (95% CI 59-68) and specificity was 69% (95% CI 67-72). These results were similar to those for the internally validated model derived from University Hospitals Birmingham. The prediction model to identify patients with diabetes at high risk of developing an adverse event while in hospital performed well in temporal and external validation. The externally validated prediction model is a novel tool that can be used to improve care pathways for inpatients with diabetes. 
Further research to assess clinical utility is needed. © 2018 Diabetes UK.
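The area under the curve reported for each validation dataset above is a measure of discrimination: the probability that a randomly chosen patient with the adverse outcome receives a higher predicted risk than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the function name and the predicted risks are hypothetical and are not taken from the study.

```python
def auc_from_scores(scores_with_outcome, scores_without_outcome):
    """Area under the ROC curve computed from its pairwise definition.

    Each (outcome, no-outcome) pair counts 1 if the patient with the outcome
    has the higher predicted risk, 0.5 for a tie, and 0 otherwise.
    """
    wins = 0.0
    for pos in scores_with_outcome:
        for neg in scores_without_outcome:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_with_outcome) * len(scores_without_outcome))


# Hypothetical predicted risks for a small validation sample
risks_adverse = [0.8, 0.6, 0.4]          # patients who had the adverse outcome
risks_no_adverse = [0.7, 0.3, 0.2, 0.1]  # patients who did not
example_auc = auc_from_scores(risks_adverse, risks_no_adverse)
```

The double loop is quadratic in sample size, which is fine for illustration; for datasets the size of a hospital cohort a rank-based computation would be the usual choice.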
Debray, Thomas P A; Vergouwe, Yvonne; Koffijberg, Hendrik; Nieboer, Daan; Steyerberg, Ewout W; Moons, Karel G M
2015-03-01
It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. The proposed framework enhances the interpretation of findings at external validation of prediction models. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Walach, Harald; Loef, Martin
2015-11-01
The hierarchy of evidence presupposes linearity and additivity of effects, as well as commutativity of knowledge structures. It thereby implicitly assumes a classical theoretical model. This is an argumentative article that uses theoretical analysis based on pertinent literature and known facts to examine the standard view of methodology. We show that the assumptions of the hierarchical model are wrong. The knowledge structures gained by various types of studies are not sequentially indifferent, that is, do not commute. External validity and internal validity are at least partially incompatible concepts. Therefore, one needs a different theoretical structure, typical of quantum-type theories, to model this situation. The consequence of this situation is that the implicit assumptions of the hierarchical model are wrong, if generalized to the concept of evidence in total. The problem can be solved by using a matrix-analytical approach to synthesizing evidence. Here, research methods that produce different types of evidence that complement each other are synthesized to yield the full knowledge. We show by an example how this might work. We conclude that the hierarchical model should be complemented by a broader reasoning in methodology. Copyright © 2015 Elsevier Inc. All rights reserved.
Validation of psychoanalytic theories: towards a conceptualization of references.
Zachrisson, Anders; Zachrisson, Henrik Daae
2005-10-01
The authors discuss criteria for the validation of psychoanalytic theories and develop a heuristic and normative model of the references needed for this. Their core question in this paper is: can psychoanalytic theories be validated exclusively from within psychoanalytic theory (internal validation), or are references to sources of knowledge other than psychoanalysis also necessary (external validation)? They discuss aspects of the classic truth criteria correspondence and coherence, both from the point of view of contemporary psychoanalysis and of contemporary philosophy of science. The authors present arguments for both external and internal validation. Internal validation has to deal with the problems of subjectivity of observations and circularity of reasoning, external validation with the problem of relevance. They recommend a critical attitude towards psychoanalytic theories, which, by carefully scrutinizing weak points and invalidating observations in the theories, reduces the risk of wishful thinking. The authors conclude by sketching a heuristic model of validation. This model combines correspondence and coherence with internal and external validation into a four-leaf model for references for the process of validating psychoanalytic theories.
External validation of the NUn score for predicting anastomotic leakage after oesophageal resection.
Paireder, Matthias; Jomrich, Gerd; Asari, Reza; Kristo, Ivan; Gleiss, Andreas; Preusser, Matthias; Schoppmann, Sebastian F
2017-08-29
Early detection of anastomotic leakage (AL) after oesophageal resection for malignancy is crucial. This retrospective study validates a risk score, predicting AL, which includes C-reactive protein, albumin and white cell count in patients undergoing oesophageal resection between 2003 and 2014. For validation of the NUn score a receiver operating characteristic (ROC) curve is estimated. Area under the ROC curve (AUC) is reported with 95% confidence interval (CI). Among 258 patients (79.5% male) 32 patients showed signs of anastomotic leakage (12.4%). NUn score in our data has a median of 9.3 (range 6.2-17.6). The odds ratio for AL was 1.31 (CI 1.03-1.67; p = 0.028). AUC for AL was 0.59 (CI 0.47-0.72). Using the original cutoff value of 10, the sensitivity was 45.2% and the specificity was 73.8%. This results in a positive predictive value of 19.4% and a negative predictive value of 90.6%. The proportion of variation in AL occurrence, which is explained by the NUn score, was 2.5% (PEV = 0.025). This study provides evidence for an external validation of a simple risk score for AL after oesophageal resection. In this cohort, the NUn score is not useful due to its poor discrimination.
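As a reading aid for the record above, the following minimal sketch shows how predictive values follow from sensitivity, specificity and prevalence via Bayes' rule. The function name is illustrative, not taken from the study, and the inputs are the rounded figures quoted in the abstract, so the result should only approximate the reported 19.4% and 90.6%.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values from the abstract: 32 leaks among 258 patients.
ppv, npv = predictive_values(0.452, 0.738, 32 / 258)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With a prevalence near 12%, even a moderately specific test yields a low positive predictive value, which is consistent with the authors' conclusion about poor discrimination.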
Raji, Olaide Y.; Duffy, Stephen W.; Agbaje, Olorunshola F.; Baker, Stuart G.; Christiani, David C.; Cassidy, Adrian; Field, John K.
2013-01-01
Background: External validation of existing lung cancer risk prediction models is limited. Using such models in clinical practice to guide the referral of patients for computed tomography (CT) screening for lung cancer depends on external validation and evidence of predicted clinical benefit. Objective: To evaluate the discrimination of the Liverpool Lung Project (LLP) risk model and demonstrate its predicted benefit for stratifying patients for CT screening by using data from 3 independent studies from Europe and North America. Design: Case–control and prospective cohort study. Setting: Europe and North America. Patients: Participants in the European Early Lung Cancer (EUELC) and Harvard case–control studies and the LLP population-based prospective cohort (LLPC) study. Measurements: 5-year absolute risks for lung cancer predicted by the LLP model. Results: The LLP risk model had good discrimination in both the Harvard (area under the receiver-operating characteristic curve [AUC], 0.76 [95% CI, 0.75 to 0.78]) and the LLPC (AUC, 0.82 [CI, 0.80 to 0.85]) studies and modest discrimination in the EUELC (AUC, 0.67 [CI, 0.64 to 0.69]) study. The decision utility analysis, which incorporates the harms and benefit of using a risk model to make clinical decisions, indicates that the LLP risk model performed better than smoking duration or family history alone in stratifying high-risk patients for lung cancer CT screening. Limitations: The model cannot assess whether including other risk factors, such as lung function or genetic markers, would improve accuracy. Lack of information on asbestos exposure in the LLPC limited the ability to validate the complete LLP risk model. Conclusion: Validation of the LLP risk model in 3 independent external data sets demonstrated good discrimination and evidence of predicted benefits for stratifying patients for lung cancer CT screening. Further studies are needed to prospectively evaluate model performance and evaluate the optimal population risk thresholds for initiating lung cancer screening. Primary Funding Source: Roy Castle Lung Cancer Foundation. PMID:22910935
James, Jack E
2017-09-01
Throughout the quarter century since the advent of evidence-based medicine (EBM), medical research has prioritized 'efficacy' (i.e. internal validity) using randomized controlled trials. EBM has consistently neglected 'effectiveness' and 'cost-effectiveness', identified in the pioneering work of Archie Cochrane as essential for establishing the external (i.e. clinical) validity of health care interventions. Neither Cochrane nor other early pioneers appear to have foreseen the extent to which EBM would be appropriated by the pharmaceutical and medical devices industries, which are responsible for extensive biases in clinical research due to selective reporting, exaggeration of benefits, minimization of risks, and misrepresentation of data. The promise of EBM to effect transformational change in health care will remain unfulfilled until (i) studies of effectiveness and cost-effectiveness are pursued with some of the same fervour that previously succeeded in elevating the status of the randomized controlled trial, and (ii) ways are found to defeat threats to scientific integrity posed by commercial conflicts of interest. © 2017 Stichting European Society for Clinical Investigation Journal Foundation.
Graham, Jesse; Nosek, Brian A.; Haidt, Jonathan; Iyer, Ravi; Koleva, Spassena; Ditto, Peter H.
2010-01-01
The moral domain is broader than the empathy and justice concerns assessed by existing measures of moral competence, and it is not just a subset of the values assessed by value inventories. To fill the need for reliable and theoretically-grounded measurement of the full range of moral concerns, we developed the Moral Foundations Questionnaire (MFQ) based on a theoretical model of five universally available (but variably developed) sets of moral intuitions: Harm/care, Fairness/reciprocity, Ingroup/loyalty, Authority/respect, and Purity/sanctity. We present evidence for the internal and external validity of the scale and the model, and in doing so present new findings about morality: 1. Comparative model fitting of confirmatory factor analyses provides empirical justification for a five-factor structure of moral concerns. 2. Convergent/discriminant validity evidence suggests that moral concerns predict personality features and social group attitudes not previously considered morally relevant. 3. We establish pragmatic validity of the measure in providing new knowledge and research opportunities concerning demographic and cultural differences in moral intuitions. These analyses provide evidence for the usefulness of Moral Foundations Theory in simultaneously increasing the scope and sharpening the resolution of psychological views of morality. PMID:21244182
Waldman, Irwin D; Poore, Holly E; van Hulle, Carol; Rathouz, Paul J; Lahey, Benjamin B
2016-11-01
Several recent studies of the hierarchical phenotypic structure of psychopathology have identified a General psychopathology factor in addition to the more expected specific Externalizing and Internalizing dimensions in both youth and adult samples and some have found relevant unique external correlates of this General factor. We used data from 1,568 twin pairs (599 MZ & 969 DZ) age 9 to 17 to test hypotheses for the underlying structure of youth psychopathology and the external validity of the higher-order factors. Psychopathology symptoms were assessed via structured interviews of caretakers and youth. We conducted phenotypic analyses of competing structural models using Confirmatory Factor Analysis and used Structural Equation Modeling and multivariate behavior genetic analyses to understand the etiology of the higher-order factors and their external validity. We found that both a General factor and specific Externalizing and Internalizing dimensions are necessary for characterizing youth psychopathology at both the phenotypic and etiologic levels, and that the 3 higher-order factors differed substantially in the magnitudes of their underlying genetic and environmental influences. Phenotypically, the specific Externalizing and Internalizing dimensions were slightly negatively correlated when a General factor was included, which reflected a significant inverse correlation between the nonshared environmental (but not genetic) influences on Internalizing and Externalizing. We estimated heritability of the general factor of psychopathology for the first time. Its moderate heritability suggests that it is not merely an artifact of measurement error but a valid construct. The General, Externalizing, and Internalizing factors differed in their relations with 3 external validity criteria: mother's smoking during pregnancy, parent's harsh discipline, and the youth's association with delinquent peers. 
Multivariate behavior genetic analyses supported the external validity of the 3 higher-order factors by suggesting that the General, Externalizing, and Internalizing factors were correlated with peer delinquency and parent's harsh discipline for different etiologic reasons. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
ERIC Educational Resources Information Center
Martel, Michelle M.; Roberts, Bethan; Gremillion, Monica; von Eye, Alexander; Nigg, Joel T.
2011-01-01
The current paper provides external validation of the bifactor model of ADHD by examining associations between ADHD latent factor/profile scores and external validation indices. 548 children (321 boys; 302 with ADHD), 6 to 18 years old, recruited from the community participated in a comprehensive diagnostic procedure. Mothers completed the Child…
Selecting and Improving Quasi-Experimental Designs in Effectiveness and Implementation Research.
Handley, Margaret A; Lyles, Courtney R; McCulloch, Charles; Cattamanchi, Adithya
2018-04-01
Interventional researchers face many design challenges when assessing intervention implementation in real-world settings. Intervention implementation requires holding fast on internal validity needs while incorporating external validity considerations (such as uptake by diverse subpopulations, acceptability, cost, and sustainability). Quasi-experimental designs (QEDs) are increasingly employed to achieve a balance between internal and external validity. Although these designs are often referred to and summarized in terms of logistical benefits, there is still uncertainty about (a) selecting from among various QEDs and (b) developing strategies to strengthen the internal and external validity of QEDs. We focus here on commonly used QEDs (prepost designs with nonequivalent control groups, interrupted time series, and stepped-wedge designs) and discuss several variants that maximize internal and external validity at the design, execution and implementation, and analysis stages.
An empirical assessment of validation practices for molecular classifiers
Castaldi, Peter J.; Dahabreh, Issa J.
2011-01-01
Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21–61%) and 29% (IQR, 15–65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04–5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n = 758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice. PMID:21300697
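The diagnostic odds ratio (DOR) mentioned in the record above combines sensitivity and specificity into a single number. The sketch below is only an illustration of that definition; the function name is an assumption. It plugs in the median sensitivities and specificities quoted in the abstract, so the resulting ratio is not expected to reproduce the pooled relative DOR of 3.26, which the authors estimated across studies rather than from medians.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Odds of a positive test in cases divided by the odds in non-cases."""
    return (sensitivity / (1 - sensitivity)) * (specificity / (1 - specificity))


# Median figures quoted in the abstract (illustrative use only).
dor_cv = diagnostic_odds_ratio(0.94, 0.98)   # internal cross-validation
dor_ext = diagnostic_odds_ratio(0.88, 0.81)  # independent validation
print(f"DOR, cross-validation: {dor_cv:.1f}")
print(f"DOR, independent validation: {dor_ext:.1f}")
```

Because the DOR grows very quickly as sensitivity or specificity approaches 100%, small optimistic biases in cross-validation can translate into large apparent differences on this scale.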
Validation of a scenario-based assessment of critical thinking using an externally validated tool.
Buur, Jennifer L; Schmidt, Peggy; Smylie, Dean; Irizarry, Kris; Crocker, Carlos; Tyler, John; Barr, Margaret
2012-01-01
With medical education transitioning from knowledge-based curricula to competency-based curricula, critical thinking skills have emerged as a major competency. While there are validated external instruments for assessing critical thinking, many educators have created their own custom assessments of critical thinking. However, the face validity of these assessments has not been challenged. The purpose of this study was to compare results from a custom assessment of critical thinking with the results from a validated external instrument of critical thinking. Students from the College of Veterinary Medicine at Western University of Health Sciences were administered a custom assessment of critical thinking (ACT) examination and the externally validated instrument, California Critical Thinking Skills Test (CCTST), in the spring of 2011. Total scores and sub-scores from each exam were analyzed for significant correlations using Pearson correlation coefficients. Significant correlations between ACT Blooms 2 and deductive reasoning and total ACT score and deductive reasoning were demonstrated with correlation coefficients of 0.24 and 0.22, respectively. No other statistically significant correlations were found. The lack of significant correlation between the two examinations illustrates the need in medical education to externally validate internal custom assessments. Ultimately, the development and validation of custom assessments of non-knowledge-based competencies will produce higher quality medical professionals.
Choo, Min Soo; Jeong, Seong Jin; Cho, Sung Yong; Yoo, Changwon; Jeong, Chang Wook; Ku, Ja Hyeon; Oh, Seung-June
2017-04-01
We aimed to externally validate the prediction model we developed for having bladder outlet obstruction (BOO) and requiring prostatic surgery using 2 independent data sets from tertiary referral centers, and also aimed to validate a mobile app for using this model through usability testing. Formulas and nomograms predicting whether a subject has BOO and needs prostatic surgery were validated with an external validation cohort from Seoul National University Bundang Hospital and Seoul Metropolitan Government-Seoul National University Boramae Medical Center between January 2004 and April 2015. A smartphone-based app was developed, and 8 young urologists were enrolled for usability testing to identify any human factor issues of the app. A total of 642 patients were included in the external validation cohort. No significant differences were found in the baseline characteristics of major parameters between the original (n=1,179) and the external validation cohort, except for the maximal flow rate. Predictions of requiring prostatic surgery in the validation cohort showed a sensitivity of 80.6%, a specificity of 73.2%, a positive predictive value of 49.7%, a negative predictive value of 92.0%, and an area under the receiver operating characteristic curve of 0.84. The calibration plot indicated that the predictions have good correspondence. The decision curve also showed a high net benefit. Similar evaluation results using the external validation cohort were seen in the predictions of having BOO. Overall results of the usability test demonstrated that the app was user-friendly with no major human factor issues. External validation of this newly developed prediction model demonstrated a moderate level of discrimination, adequate calibration, and high net benefit gains for predicting both having BOO and requiring prostatic surgery. In addition, a smartphone app implementing the prediction model was user-friendly with no major human factor issues.
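The net benefit reported from a decision curve is usually computed with the standard formula NB = TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability at which a clinician would act. The abstract above does not give the underlying counts or threshold, so the numbers in this minimal sketch are hypothetical and serve only to show the arithmetic.

```python
def net_benefit(true_pos, false_pos, n, threshold):
    """Net benefit at a given threshold probability (decision curve analysis)."""
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return true_pos / n - (false_pos / n) * weight


# Hypothetical cohort: 400 patients, 100 of whom truly need surgery.
n, events = 400, 100
nb_model = net_benefit(true_pos=80, false_pos=60, n=n, threshold=0.20)
# "Treat all" strategy: everyone is classified positive.
nb_treat_all = net_benefit(true_pos=events, false_pos=n - events, n=n, threshold=0.20)
print(f"model: {nb_model:.4f}, treat all: {nb_treat_all:.4f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.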
A review of how to conduct a surgical survey using a questionnaire.
Hing, C B; Smith, T O; Hooper, L; Song, F; Donell, S T
2011-08-01
Health surveys using questionnaires facilitate the acquisition of information on the knowledge, behaviour, attitudes, perceptions and clinical history of a selected population. Their internal and external validities are threatened by poor design and low response rates. Numerous studies have investigated survey design and administration but care should be taken when generalising findings in different clinical and cultural settings. The current evidence-base suggests that no single mode of survey administration, such as postal, electronic or telephone, is superior to another. Whilst there is no evidence of an ideal response rate relationship to survey validity, response rates can be enhanced by including monetary incentives, providing a time cue, and repeat contact with non-responders. Unlike other modes of experimental data collection, few guidelines currently exist for survey and questionnaire design and response rate should not be considered a direct measure of a survey's quality. Copyright © 2010 Elsevier B.V. All rights reserved.
Validation Evidence of the Motivation for Teaching Scale in Secondary Education.
Abós, Ángel; Sevil, Javier; Martín-Albo, José; Aibar, Alberto; García-González, Luis
2018-04-10
Grounded in self-determination theory, the aim of this study was to develop a scale with adequate psychometric properties to assess motivation for teaching and to explain some outcomes of secondary education teachers at work. The sample comprised 584 secondary education teachers. Analyses supported the five-factor model (intrinsic motivation, identified regulation, introjected regulation, external regulation and amotivation) and indicated the presence of a continuum of self-determination. Evidence of reliability was provided by Cronbach's alpha, composite reliability and average variance extracted. Multigroup confirmatory factor analyses supported the partial invariance (configural and metric) of the scale in different sub-samples, in terms of gender and type of school. Concurrent validity was analyzed by structural equation modeling, which explained 71% of the work dedication variance and 69% of the boredom at work variance. Work dedication was positively predicted by intrinsic motivation (β = .56, p < .001) and external regulation (β = .29, p < .001) and negatively predicted by introjected regulation (β = -.22, p < .001) and amotivation (β = -.49, p < .001). Boredom at work was negatively predicted by intrinsic motivation (β = -.28, p < .005) and positively predicted by amotivation (β = .68, p < .001). The Motivation for Teaching Scale in Secondary Education (Spanish acronym EME-ES, Escala de Motivación por la Enseñanza en Educación Secundaria) is discussed as a valid and reliable instrument. This is the first specific scale in the work context of secondary teachers that has integrated the five-factor structure together with their dedication and boredom at work.
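Cronbach's alpha, one of the reliability indices named in the record above, can be computed directly from item-level scores. The following is a minimal sketch of the standard formula; the response data are invented for illustration and have nothing to do with the EME-ES sample.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, each with one entry per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical answers of 5 respondents to 3 items of one subscale.
responses = [
    [4, 5, 3, 5, 2],
    [4, 4, 3, 5, 1],
    [5, 5, 2, 4, 2],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as evidence of internal consistency rather than of validity.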
Developing an index to measure the voluntariness of consent to research.
Dugosh, Karen L; Festinger, David S; Marlowe, Douglas B; Clements, Nicolle T
2014-10-01
The goals of the current study were to expand the content domain and further validate the Coercion Assessment Scale (CAS), a measure of perceived coercion for criminally involved substance abusers being recruited into research. Unlike the few existing measures of this construct, the CAS identifies specific external sources of pressure that may influence one's decision to participate. In Phase 1, we conducted focus groups with criminal justice clients and stakeholders to expand the instrument by identifying additional sources of pressure. In Phase 2, we evaluated the expanded measure (i.e., endorsement rates, reliability, validity) in an ongoing research trial. Results identified new sources of pressure and provided evidence supporting the CAS's utility and reliability over time as well as convergent and discriminative validity. © The Author(s) 2014.
External validation of a Cox prognostic model: principles and methods
2013-01-01
Background: A prognostic model should not enter clinical practice unless it has been demonstrated that it performs a useful role. External validation denotes evaluation of model performance in a sample independent of that used to develop the model. Unlike for logistic regression models, external validation of Cox models is sparsely treated in the literature. Successful validation of a model means achieving satisfactory discrimination and calibration (prediction accuracy) in the validation sample. Validating Cox models is not straightforward because event probabilities are estimated relative to an unspecified baseline function. Methods: We describe statistical approaches to external validation of a published Cox model according to the level of published information, specifically (1) the prognostic index only, (2) the prognostic index together with Kaplan-Meier curves for risk groups, and (3) the first two plus the baseline survival curve (the estimated survival function at the mean prognostic index across the sample). The most challenging task, requiring level 3 information, is assessing calibration, for which we suggest a method of approximating the baseline survival function. Results: We apply the methods to two comparable datasets in primary breast cancer, treating one as derivation and the other as validation sample. Results are presented for discrimination and calibration. We demonstrate plots of survival probabilities that can assist model evaluation. Conclusions: Our validation methods are applicable to a wide range of prognostic studies and provide researchers with a toolkit for external validation of a published Cox model. PMID:23496923
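To make concrete why the baseline survival curve is needed for calibration, the sketch below applies the usual proportional-hazards relation S(t | PI) = S0(t) ** exp(PI - mean PI), where S0 is the baseline survival at the mean prognostic index as defined in the record above. The function name and the numbers are assumptions chosen for illustration; this is not the approximation method proposed by the authors.

```python
import math


def predicted_survival(baseline_survival, prognostic_index, mean_index):
    """Survival probability for one patient under a Cox model.

    baseline_survival is S0(t) at the mean prognostic index of the
    derivation sample; a higher prognostic index means higher hazard.
    """
    return baseline_survival ** math.exp(prognostic_index - mean_index)


# Hypothetical values: 5-year baseline survival of 0.80, mean index 1.2.
s_average = predicted_survival(0.80, prognostic_index=1.2, mean_index=1.2)
s_high_risk = predicted_survival(0.80, prognostic_index=1.7, mean_index=1.2)
print(f"average patient: {s_average:.3f}, high-risk patient: {s_high_risk:.3f}")
```

Calibration in a validation sample can then be judged by comparing such predicted probabilities, averaged within risk groups, against the observed Kaplan-Meier estimates for the same groups.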
[Modeling in value-based medicine].
Neubauer, A S; Hirneiss, C; Kampik, A
2010-03-01
Modeling plays an important role in value-based medicine (VBM). It supports decisions by predicting potential clinical and economic consequences, frequently combining different sources of evidence. Based on relevant publications and examples focusing on ophthalmology, the key economic modeling methods are explained and definitions are given. The most frequently applied model types are decision trees, Markov models, and discrete event simulation (DES) models. Besides verifying internal validity, model validation includes comparison with other models (external validity) and, ideally, validation of the model's predictive properties. The uncertainty inherent in any modeling should be clearly stated. This is true for economic modeling in VBM as well as when using disease risk models to support clinical decisions. In economic modeling, uni- and multivariate sensitivity analyses are usually applied; the key concepts here are tornado plots and cost-effectiveness acceptability curves. Given the existing uncertainty, modeling helps to make better informed decisions than would be possible without this additional information.
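As an illustration of the Markov model type named in the record above, here is a minimal cohort simulation with three health states. The states, transition probabilities and utility weights are all hypothetical and are not drawn from the ophthalmology examples in the article; discounting is omitted to keep the sketch short.

```python
def run_markov(start, transition, cycles):
    """Propagate a cohort distribution through a Markov chain; return the full trace."""
    trace = [start]
    state = start
    size = len(start)
    for _ in range(cycles):
        state = [sum(state[i] * transition[i][j] for i in range(size))
                 for j in range(size)]
        trace.append(state)
    return trace


# Hypothetical annual transition probabilities; rows are well, sick, dead.
P = [
    [0.85, 0.10, 0.05],
    [0.00, 0.70, 0.30],
    [0.00, 0.00, 1.00],
]
trace = run_markov([1.0, 0.0, 0.0], P, cycles=10)

# Hypothetical quality-of-life weights per state, accumulated per cycle.
utilities = [1.0, 0.6, 0.0]
qalys = sum(sum(p * u for p, u in zip(state, utilities)) for state in trace[1:])
print(f"undiscounted QALYs over 10 cycles: {qalys:.2f}")
```

A univariate sensitivity analysis of the kind summarized in a tornado plot would rerun this simulation while varying one input at a time, for example the well-to-sick probability, and record the range of the outcome.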
Clinical audit project in undergraduate medical education curriculum: an assessment validation study
Tor, Elina; Steketee, Carole; Mak, Donna
2016-09-24
To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011- 2014). The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students' and examiners' response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built-in to the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach's alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation>0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole. PMID:27716612
First-in-human Phase 1 CRISPR Gene Editing Cancer Trials: Are We Ready?
Baylis, Francoise; McLeod, Marcus
2017-01-01
A prospective first-in-human Phase 1 CRISPR gene editing trial in the United States for patients with melanoma, synovial sarcoma, and multiple myeloma offers hope that gene editing tools may usefully treat human disease. An overarching ethical challenge with first-in-human Phase 1 clinical trials, however, is knowing when it is ethically acceptable to initiate such trials on the basis of safety and efficacy data obtained from pre-clinical studies. If the pre-clinical studies that inform trial design are themselves poorly designed - as a result of which the quality of pre-clinical evidence is deficient - then the ethical requirement of scientific validity for clinical research may not be satisfied. In turn, this could mean that the Phase 1 clinical trial will be unsafe and that trial participants will be exposed to risk for no potential benefit. To assist sponsors, researchers, clinical investigators and reviewers in deciding when it is ethically acceptable to initiate first-in-human Phase 1 CRISPR gene editing clinical trials, structured processes have been developed to assess and minimize translational distance between pre-clinical and clinical research. These processes draw attention to various features of internal validity, construct validity, and external validity. As well, the credibility of supporting evidence is to be critically assessed with particular attention to optimism bias, financial conflicts of interest and publication bias. We critically examine the pre-clinical evidence used to justify the first-in-human Phase 1 CRISPR gene editing cancer trial in the United States using these tools. We conclude that the proposed trial cannot satisfy the ethical requirement of scientific validity because the supporting pre-clinical evidence used to inform trial design is deficient. Copyright © Bentham Science Publishers.
Bray, Benjamin D; Campbell, James; Cloud, Geoffrey C; Hoffman, Alex; James, Martin; Tyrrell, Pippa J; Wolfe, Charles D A; Rudd, Anthony G
2014-11-01
Case mix adjustment is required to allow valid comparison of outcomes across care providers. However, there is a lack of externally validated models suitable for use in unselected stroke admissions. We therefore aimed to develop and externally validate prediction models to enable comparison of 30-day post-stroke mortality outcomes using routine clinical data. Models were derived (n=9000 patients) and internally validated (n=18 169 patients) using data from the Sentinel Stroke National Audit Program, the national register of acute stroke in England and Wales. External validation (n=1470 patients) was performed in the South London Stroke Register, a population-based longitudinal study. Models were fitted using generalized estimating equations. Discrimination and calibration were assessed using receiver operating characteristic curve analysis and correlation plots. Two final models were derived. Model A included age (<60, 60-69, 70-79, 80-89, and ≥90 years), National Institutes of Health Stroke Severity Score (NIHSS) on admission, presence of atrial fibrillation on admission, and stroke type (ischemic versus primary intracerebral hemorrhage). Model B was similar but included only the consciousness component of the NIHSS in place of the full NIHSS. Both models showed excellent discrimination and calibration in internal and external validation. The c-statistics in external validation were 0.87 (95% confidence interval, 0.84-0.89) and 0.86 (95% confidence interval, 0.83-0.89) for models A and B, respectively. We have derived and externally validated 2 models to predict mortality in unselected patients with acute stroke using commonly collected clinical variables. In settings where the ability to record the full NIHSS on admission is limited, the level of consciousness component of the NIHSS provides a good approximation of the full NIHSS for mortality prediction. © 2014 American Heart Association, Inc.
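For a binary outcome such as 30-day mortality, the c-statistic quoted in the record above equals the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison; the predicted risks and outcomes are made up for illustration and are not data from either stroke register.

```python
def c_statistic(predicted_risk, died):
    """Concordance for a binary outcome; tied predictions count as half."""
    events = [p for p, d in zip(predicted_risk, died) if d]
    non_events = [p for p, d in zip(predicted_risk, died) if not d]
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed 30-day deaths (1 = died).
risks = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
deaths = [0, 0, 1, 0, 1, 1]
c = c_statistic(risks, deaths)
print(f"c-statistic = {c:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation; the pairwise loop is quadratic in sample size, which is acceptable for a sketch but not for large registers.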
Maier, Jürgen; Hampe, J. Felix; Jahn, Nico
2016-01-01
Real-time response (RTR) measurement is an important technique for analyzing human processing of electronic media stimuli. Although it has been demonstrated that RTR data are reliable and internally valid, some argue that they lack external validity. The reason for this is that RTR measurement is restricted to a laboratory environment due to its technical requirements. This paper introduces a smartphone app that 1) captures real-time responses using the dial technique and 2) provides a solution for one of the most important problems in RTR measurement, the (automatic) synchronization of RTR data. In addition, it explores the reliability and validity of mobile RTR measurement by comparing the real-time reactions of two samples of young and well-educated voters to the 2013 German televised debate. Whereas the first sample participated in a classical laboratory study, the second sample was equipped with our mobile RTR system and watched the debate at home. Results indicate that the mobile RTR system yields similar results to the lab-based RTR measurement, providing evidence that laboratory studies using RTR are externally valid. In particular, the argument that the artificial reception situation creates artificial results has to be questioned. In addition, we conclude that RTR measurement outside the lab is possible. Hence, mobile RTR opens the door for large-scale studies to better understand the processing and impact of electronic media content. PMID:27274577
Analysis of model development strategies: predicting ventral hernia recurrence.
Holihan, Julie L; Li, Linda T; Askenasy, Erik P; Greenberg, Jacob A; Keith, Jerrod N; Martindale, Robert G; Roth, J Scott; Liang, Mike K
2016-11-01
There have been many attempts to identify variables associated with ventral hernia recurrence; however, it is unclear which statistical modeling approach results in models with greatest internal and external validity. We aim to assess the predictive accuracy of models developed using five common variable selection strategies to determine variables associated with hernia recurrence. Two multicenter ventral hernia databases were used. Database 1 was randomly split into "development" and "internal validation" cohorts. Database 2 was designated "external validation". The dependent variable for model development was hernia recurrence. Five variable selection strategies were used: (1) "clinical"-variables considered clinically relevant, (2) "selective stepwise"-all variables with a P value <0.20 were assessed in a step-backward model, (3) "liberal stepwise"-all variables were included and step-backward regression was performed, (4) "restrictive internal resampling," and (5) "liberal internal resampling." Variables were included with P < 0.05 for the Restrictive model and P < 0.10 for the Liberal model. A time-to-event analysis using Cox regression was performed using these strategies. The predictive accuracy of the developed models was tested on the internal and external validation cohorts using Harrell's C-statistic where C > 0.70 was considered "reasonable". The recurrence rate was 32.9% (n = 173/526; median/range follow-up, 20/1-58 mo) for the development cohort, 36.0% (n = 95/264, median/range follow-up 20/1-61 mo) for the internal validation cohort, and 12.7% (n = 155/1224, median/range follow-up 9/1-50 mo) for the external validation cohort. Internal validation demonstrated reasonable predictive accuracy (C-statistics = 0.772, 0.760, 0.767, 0.757, 0.763), while on external validation, predictive accuracy dipped precipitously (C-statistic = 0.561, 0.557, 0.562, 0.553, 0.560). 
Predictive accuracy was equally adequate on internal validation among models; however, on external validation, all five models failed to demonstrate utility. Future studies should report multiple variable selection techniques and demonstrate predictive accuracy on external data sets for model validation. Copyright © 2016 Elsevier Inc. All rights reserved.
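Harrell's C-statistic, used above to compare the five models, can be sketched for right-censored follow-up data as follows. This is a simplified illustration, not the authors' code: it ignores pairs with tied follow-up times, and all names and data are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C for right-censored data.

    A pair (i, j) is usable when subject i has the shorter follow-up time and
    experienced the event (events[i] == 1). The pair is concordant when
    subject i also has the higher predicted risk; tied risk scores count 0.5.
    Pairs with tied follow-up times are skipped in this simplified version.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical follow-up (months), recurrence indicator, and model risk score.
example_times = [5, 10, 15, 20]
example_events = [1, 0, 1, 0]
example_scores = [0.9, 0.4, 0.1, 0.2]
example_harrell = harrell_c(example_times, example_events, example_scores)
```

Computing the same quantity on a development cohort and on an external cohort is one way to expose the kind of drop in accuracy the abstract describes.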
Evidence for a relationship between trait gratitude and prosocial behaviour.
Yost-Dubrow, Rachel; Dunham, Yarrow
2018-03-01
Prosocial behaviour towards unrelated others is communally beneficial but can be individually costly. The emotion of gratitude mitigates this cost by encouraging direct as well as "upstream" reciprocity, thereby facilitating cooperation. A widely used method for measuring trait gratitude is the Gratitude Questionnaire (GQ6) [McCullough, M., Emmons, R., & Tsang, J. (2002). The grateful disposition: A conceptual and empirical topography. Journal of Personality and Social Psychology, 82, 112-127. Retrieved from https://doi.org/10.1037/0022-3514.82.1.112 ]. Here we undertake an assessment of the external validity of the GQ6 by examining its relationship with two incentivized economic games that serve as face valid indices of generosity and reciprocity. In two studies (total N = 501) we find that trait gratitude as measured by the GQ6 predicts greater donations in a charity donation task as well as greater transfers and returns in an incentivized trust game. These results support the hypothesis that individuals with higher trait gratitude are more generous and trusting on average, and provide initial evidence as to the predictive validity of the GQ6.
Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong
2015-08-01
The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
On framing the research question and choosing the appropriate research design.
Parfrey, Patrick S; Ravani, Pietro
2015-01-01
Clinical epidemiology is the science of human disease investigation with a focus on diagnosis, prognosis, and treatment. The generation of a reasonable question requires definition of patients, interventions, controls, and outcomes. The goal of research design is to minimize error, to ensure adequate samples, to measure input and output variables appropriately, to consider external and internal validities, to limit bias, and to address clinical as well as statistical relevance. The hierarchy of evidence for clinical decision-making places randomized controlled trials (RCT) or systematic review of good quality RCTs at the top of the evidence pyramid. Prognostic and etiologic questions are best addressed with longitudinal cohort studies.
On framing the research question and choosing the appropriate research design.
Parfrey, Patrick; Ravani, Pietro
2009-01-01
Clinical epidemiology is the science of human disease investigation with a focus on diagnosis, prognosis, and treatment. The generation of a reasonable question requires the definition of patients, interventions, controls, and outcomes. The goal of research design is to minimize error, ensure adequate samples, measure input and output variables appropriately, consider external and internal validities, limit bias, and address clinical as well as statistical relevance. The hierarchy of evidence for clinical decision making places randomized controlled trials (RCT) or systematic review of good quality RCTs at the top of the evidence pyramid. Prognostic and etiologic questions are best addressed with longitudinal cohort studies.
Development and Validation of the Escala de Actitudes Emprendedoras para Estudiantes (EAEE).
Oliver, Amparo; Galiana, Laura
2015-03-17
During the last few years, entrepreneurship has gained an important role in many economic and social policies, with the consequent growth of entrepreneurial research in many social areas. However, in the Spanish psychometric context, there is no updated scale that includes recent contributions to the literature on entrepreneurial attitudes. The aim of this study is to present and validate a new scale named Escala de Actitudes Emprendedoras para Estudiantes (EAEE; Entrepreneurial Attitudes Scale for Students, EASS), in two samples of Spanish high school and university students. Data come from a cross-sectional survey of 524 high school and undergraduate students from Valencia (Spain). Two confirmatory factor analyses (CFAs) were estimated, together with reliability and validity evidence of the scale. Results offered evidence of the adequate psychometric properties of the EASS. The CFAs showed overall and analytical adequate fit indexes (χ²(120) = 163.19 (p < .01), GFI = .906, CFI = .959, SRMR = .044, RMSEA = .040 [CI .022-.054]); reliability indices were appropriate for most of the entrepreneurial attitudes (α ranged from .63 to .87 across the different dimensions); and external evidence relating entrepreneurial dimensions to personality traits was similar to that in previous studies. The scale could be a useful instrument both for previous diagnosis and effectiveness assessment of programs on entrepreneurship promotion.
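The α values quoted above are Cronbach's alpha coefficients. A minimal sketch of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total score), is given below; the function name and the response matrix are hypothetical and unrelated to the EAEE data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one scale dimension.

    item_scores: list of respondents, each a list of k item scores.
    Uses sample variances for the items and for the total score.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*item_scores))                # one tuple per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses of four students to a three-item dimension.
example_responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
]
example_alpha = cronbach_alpha(example_responses)
```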
Cannon, Joanna E; Guardino, Caroline; Antia, Shirin D; Luckner, John L
2016-01-01
The field of education of deaf and hard of hearing (DHH) students has a paucity of evidence-based practices (EBPs) to guide instruction. The authors discussed how the research methodology of single-case design (SCD) can be used to build EBPs through direct and systematic replication of studies. An overview of SCD research methods is presented, including an explanation of how internal and external validity issues are addressed, and why SCD is appropriate for intervention research with DHH children. The authors then examine the SCD research in the field according to quality indicators (QIs; at the individual level and as a body of evidence) to determine the existing evidence base. Finally, future replication areas are recommended to fill the gaps in SCD research with students who are DHH in order to add to the evidence base in the field.
External validation of anti-Müllerian hormone based prediction of live birth in assisted conception
2013-01-01
Background: Chronological age and oocyte yield are independent determinants of live birth in assisted conception. Anti-Müllerian hormone (AMH) is strongly associated with oocyte yield after controlled ovarian stimulation. We have previously assessed the ability of AMH and age to independently predict live birth in an Italian assisted conception cohort. Herein we report the external validation of the nomogram in 822 UK first in vitro fertilization (IVF) cycles. Methods: Retrospective cohort consisting of 822 patients undergoing their first IVF treatment cycle at Glasgow Centre for Reproductive Medicine. Analyses were restricted to women aged between 25 and 42 years of age. All women had an AMH measured prior to commencing their first IVF cycle. The performance of the model was assessed; discrimination by the area under the receiver operator curve (ROCAUC) and model calibration by the predicted probability versus observed probability. Results: Live births occurred in 29.4% of the cohort. The observed and predicted outcomes showed no evidence of miscalibration (p = 0.188). The ROCAUC was 0.64 (95% CI: 0.60, 0.68), suggesting moderate and similar discrimination to the original model. The ROCAUC for a continuous model of age and AMH was 0.65 (95% CI 0.61, 0.69), suggesting that the original categories of AMH were appropriate. Conclusions: We confirm by external validation that AMH and age are independent predictors of live birth. Although the confidence intervals for each category are wide, our results support the assessment of AMH in larger cohorts with detailed baseline phenotyping for live birth prediction. PMID:23294733
Mapping the MMPI-2-RF Specific Problems Scales Onto Extant Psychopathology Structures.
Sellbom, Martin
2017-01-01
A main objective in developing the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008) was to link the hierarchical structure of the instrument's scales to contemporary psychopathology and personality models for greater enhancement of construct validity. Initial evidence published with the Restructured Clinical scales has indicated promising results in that the higher order structure of these measures maps onto those reported in the extant psychopathology literature. This study focused on evaluating the internal structure of the Specific Problems and Interest scales, which have not yet been examined in this manner. Two large, mixed-gender outpatient and correctional samples were used. Exploratory factor analyses revealed consistent evidence for a 4-factor structure representing somatization, negative affect, externalizing, and social detachment. Convergent and discriminant validity analyses in the outpatient sample yielded a pattern of results consistent with expectations. These findings add further evidence to indicate that the MMPI-2-RF hierarchy of scales maps onto extant psychopathology literature, and also add support to the notion that somatization and detachment should be considered important higher order domains in the psychopathology literature.
Fernandez-Hermida, Jose Ramon; Calafat, Amador; Becoña, Elisardo; Tsertsvadze, Alexander; Foxcroft, David R
2012-09-01
To assess external validity characteristics of studies from two Cochrane Systematic Reviews of the effectiveness of universal family-based prevention of alcohol misuse in young people. Two reviewers used an a priori developed external validity rating form and independently assessed three external validity dimensions of generalizability, applicability and predictability (GAP) in randomized controlled trials. The majority (69%) of the included 29 studies were rated 'unclear' on the reporting of sufficient information for judging generalizability from sample to study population. Ten studies (35%) were rated 'unclear' on the reporting of sufficient information for judging applicability to other populations and settings. No study provided an assessment of the validity of the trial end-point measures for subsequent mortality, morbidity, quality of life or other economic or social outcomes. Similarly, no study reported on the validity of surrogate measures using established criteria for assessing surrogate end-points. Studies evaluating the benefits of family-based prevention of alcohol misuse in young people are generally inadequate at reporting information relevant to generalizability of the findings or implications for health or social outcomes. Researchers, study authors, peer reviewers, journal editors and scientific societies should take steps to improve the reporting of information relevant to external validity in prevention trials. © 2012 The Authors. Addiction © 2012 Society for the Study of Addiction.
Beauger, Davy; Fruit, Dorothée; Villeneuve, Claire; Laroche, Marie-Laure; Jouve, Elisabeth; Rousseau, Annick; Boyer, Laurent; Gentile, Stéphanie
2016-09-01
Renal transplantation is considered the treatment of choice for patients with end-stage renal disease. Health-related quality of life (HRQoL) of renal transplant recipients (RTR) is very important to assess, especially during the first year after transplantation. To provide new evidence about the suitability of HRQoL measures in RTR during the first post-transplant year, we explored the internal structure, reliability and external validity of a French specific HRQoL instrument, the Renal Transplant Quality of life Questionnaire Second Version (RTQ V2). The data were drawn from the French multicenter cohort of renal transplant patients followed for 4 years (EPIGREN). The HRQoL of RTR was assessed five times (at 1, 3, 6, 9 and 12 months after transplantation) with the RTQ V2, a specific instrument consisting of 32 items describing five dimensions. Socio-demographic information, clinical characteristics and HRQoL (i.e., RTQ V2 and SF-36) were collected. At each of the five time points, psychometric properties of the RTQ V2 were compared to those reported from the reference population assessed in the validation study. Three hundred and thirty-four patients were enrolled. The proportions of well-projected items, item-internal consistency, item-discriminant validity, floor and ceiling effects, Cronbach's alpha coefficients and item goodness-of-fit statistics were satisfactory for each dimension at the five time points of the study. The suitability indices of construct validity were higher than 90 % at each time point (minimum-maximum: 90.8-97.4 %). The external validity was less satisfactory, with suitability indices ranging from 46.7 % at M1 to 66.7 % at M12. However, the discrepancies with the reference population (mainly for gender) appeared logical considering the scientific literature on HRQoL of RTR during the first post-transplant year and may not compromise the external validity. 
These results support the validity and reliability of the RTQ V2 for evaluating HRQoL in RTR during the first post-transplant year, and confirm that the RTQ V2 is a useful tool to assess HRQoL early after transplant.
Developing a Brief Cross-Culturally Validated Screening Tool for Externalizing Disorders in Children
ERIC Educational Resources Information Center
Zwirs, Barbara W. C.; Burger, Huibert; Schulpen, Tom W. J.; Buitelaar, Jan K.
2008-01-01
The study aims at developing and validating a brief, easy-to-use screening instrument for teachers to predict externalizing disorders in children and recommending them for timely referral. The scores are compared between Dutch and non-Dutch immigrant children and a significant amount of cases for externalizing disorders were identified but sex and…
Tijssen, J G
1999-01-01
The Domestic/international Gastroenterology Surveillance Study (DIGEST) examined the prevalence of upper gastrointestinal symptoms among the general population in 10 countries, and the impact of these symptoms on healthcare usage and quality of life. This report discusses the validation of the DIGEST sample and reviews the response rates from the survey. External validation of the DIGEST sample was conducted by comparing the age, age by gender and annual household incomes of the sample with census-derived data. A comparison was also made between Psychological General Well-Being Index (PGWBI) scores from study subjects in the Scandinavian countries and the USA and the total sample population norms. Under- and oversampling, defined as ≥5% difference from the population norms, was evident in eight out of 10 countries, but no systematic bias was evident. The final distribution of the sample by gender was 51% female and 49% male. Although differences in PGWBI scores were noted between DIGEST subjects and population norms, these differences were <0.30 standard deviations--markedly below the difference considered as relevant for the PGWBI. Response for the survey in individual countries ranged from 17% in the USA to 61% in Norway, with a survey-wide rate of 27%. The overall response rate, including primary non-respondents, was 13.4%. The majority of nonresponse (51.4%) was attributed to failure to establish contact with the subjects, with 41.7% of subjects declining to be interviewed and the remaining 6.9% of subjects not meeting the age and sex criteria used for the survey. The DIGEST sample exhibited good external validity, providing a foundation for comparison between data derived from individual countries in the survey.
Meertens, Linda J E; van Montfort, Pim; Scheepers, Hubertina C J; van Kuijk, Sander M J; Aardenburg, Robert; Langenveld, Josje; van Dooren, Ivo M A; Zwaan, Iris M; Spaanderman, Marc E A; Smits, Luc J M
2018-04-17
Prediction models may contribute to personalized risk-based management of women at high risk of spontaneous preterm delivery. Although prediction models are published frequently, often with promising results, external validation generally is lacking. We performed a systematic review of prediction models for the risk of spontaneous preterm birth based on routine clinical parameters. Additionally, we externally validated and evaluated the clinical potential of the models. Prediction models based on routinely collected maternal parameters obtainable during first 16 weeks of gestation were eligible for selection. Risk of bias was assessed according to the CHARMS guidelines. We validated the selected models in a Dutch multicenter prospective cohort study comprising 2614 unselected pregnant women. Information on predictors was obtained by a web-based questionnaire. Predictive performance of the models was quantified by the area under the receiver operating characteristic curve (AUC) and calibration plots for the outcomes spontaneous preterm birth <37 weeks and <34 weeks of gestation. Clinical value was evaluated by means of decision curve analysis and calculating classification accuracy for different risk thresholds. Four studies describing five prediction models fulfilled the eligibility criteria. Risk of bias assessment revealed a moderate to high risk of bias in three studies. The AUC of the models ranged from 0.54 to 0.67 and from 0.56 to 0.70 for the outcomes spontaneous preterm birth <37 weeks and <34 weeks of gestation, respectively. A subanalysis showed that the models discriminated poorly (AUC 0.51-0.56) for nulliparous women. Although we recalibrated the models, two models retained evidence of overfitting. The decision curve analysis showed low clinical benefit for the best performing models. This review revealed several reporting and methodological shortcomings of published prediction models for spontaneous preterm birth. 
Our external validation study indicated that none of the models had the ability to predict spontaneous preterm birth adequately in our population. Further improvement of prediction models, using recent knowledge about both model development and potential risk factors, is necessary to provide an added value in personalized risk assessment of spontaneous preterm birth. © 2018 The Authors Acta Obstetricia et Gynecologica Scandinavica published by John Wiley & Sons Ltd on behalf of Nordic Federation of Societies of Obstetrics and Gynecology (NFOG).
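The decision curve analysis mentioned in this abstract rests on the net benefit at a chosen risk threshold p_t: true positives per patient minus false positives per patient weighted by p_t / (1 - p_t). The sketch below illustrates that formula only; the function names and the data are hypothetical and do not reproduce the study's analysis.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of acting on everyone whose predicted risk is at or above
    the threshold: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(risks)
    true_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks and observed outcomes (1 = preterm birth).
example_risks = [0.05, 0.10, 0.20, 0.30, 0.60]
example_outcomes = [0, 0, 1, 0, 1]
example_nb_model = net_benefit(example_risks, example_outcomes, 0.2)
example_nb_all = net_benefit_treat_all(example_outcomes, 0.2)
```

A model adds clinical value at a given threshold only when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).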
Demonstrating Experimenter "Ineptitude" as a Means of Teaching Internal and External Validity
ERIC Educational Resources Information Center
Treadwell, Kimberli R.H.
2008-01-01
Internal and external validity are key concepts in understanding the scientific method and fostering critical thinking. This article describes a class demonstration of a "botched" experiment to teach validity to undergraduates. Psychology students (N = 75) completed assessments at the beginning of the semester, prior to and immediately following…
Olsen, L R; Jensen, D V; Noerholm, V; Martiny, K; Bech, P
2003-02-01
We have developed the Major Depression Inventory (MDI), consisting of 10 items, covering the DSM-IV as well as the ICD-10 symptoms of depressive illness. We aimed to evaluate this as a scale measuring severity of depressive states with reference to both internal and external validity. Patients representing the score range from no depression to marked depression on the Hamilton Depression Scale (HAM-D) completed the MDI. Both classical and modern psychometric methods were applied for the evaluation of validity, including the Rasch analysis. In total, 91 patients were included. The results showed that the MDI had an adequate internal validity in being a unidimensional scale (the total score being an appropriate or sufficient statistic). The external validity of the MDI was also confirmed as the total score of the MDI correlated significantly with the HAM-D (Pearson's coefficient 0.86, P ≤ 0.01, Spearman 0.80, P ≤ 0.01). When used in a sample of patients with different states of depression the MDI has an adequate internal and external validity.
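The two coefficients reported for the MDI and HAM-D can be sketched as follows. Spearman's coefficient is computed here as Pearson's coefficient on average ranks. The function names and scores are hypothetical, chosen only to show the calculation, and are not patient data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical total scores on a self-rating scale and a clinician rating.
example_self = [5, 12, 20, 31, 44]
example_clinician = [3, 8, 15, 14, 27]
example_rho = spearman(example_self, example_clinician)
```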
Wilson, R; Abbott, J H
2018-04-01
To describe the construction and preliminary validation of a new population-based microsimulation model developed to analyse the health and economic burden and cost-effectiveness of treatments for knee osteoarthritis (OA) in New Zealand (NZ). We developed the New Zealand Management of Osteoarthritis (NZ-MOA) model, a discrete-time state-transition microsimulation model of the natural history of radiographic knee OA. In this article, we report on the model structure, derivation of input data, validation of baseline model parameters against external data sources, and validation of model outputs by comparison of the predicted population health loss with previous estimates. The NZ-MOA model simulates both the structural progression of radiographic knee OA and the stochastic development of multiple disease symptoms. Input parameters were sourced from NZ population-based data where possible, and from international sources where NZ-specific data were not available. The predicted distributions of structural OA severity and health utility detriments associated with OA were externally validated against other sources of evidence, and uncertainty resulting from key input parameters was quantified. The resulting lifetime and current population health-loss burden was consistent with estimates of previous studies. The new NZ-MOA model provides reliable estimates of the health loss associated with knee OA in the NZ population. The model structure is suitable for analysis of the effects of a range of potential treatments, and will be used in future work to evaluate the cost-effectiveness of recommended interventions within the NZ healthcare system. Copyright © 2018 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
Methodological review of the quality of reach out and read: does it "work"?
Yeager Pelatti, Christina; Pentimonti, Jill M; Justice, Laura M
2014-04-01
A considerable percentage of American children and adults fail to learn adequate literacy skills and read below a third grade level. Shared book reading is perhaps the single most important activity to prepare young children for success in reading. The primary objective of this manuscript was to critically review the methodological quality of Reach Out and Read (ROR), a clinically based literacy program/intervention that teaches parents strategies to incorporate while sharing books with children as a method of preventing reading difficulties and academic struggles. A PubMed search was conducted. Articles that met three criteria were considered. First, the study must be clinically based and include parent contact with a pediatrician. Second, parental counseling ("anticipatory guidance") about the importance of parent-child book reading must be included. Third, only experimental or quasi-experimental studies were included; no additional criteria were used. Published articles from any year and peer-reviewed journal were considered. Study quality was determined using a modified version of the Downs and Black (1998) checklist assessing four categories: (1) Reporting, (2) External Validity, (3) Internal Validity-Bias, and (4) Internal Validity-Confounding. We were also interested in whether quality differed based on study design, children's age, sample size, and study outcome. Eleven studies met the inclusion criteria. The overall quality of evidence was variable across all studies; Reporting and External Validity categories were relatively strong while methodological concerns were found in the area of internal validity. Quality scores differed on the four study characteristics. Implications related to clinical practice and future studies are discussed.
Baumstarck, Karine; Boyer, Laurent; Boucekine, Mohamed; Aghababian, Valérie; Parola, Nathalie; Lançon, Christophe; Auquier, Pascal
2013-06-01
Impaired executive functions are among the most widely observed deficits in patients suffering from schizophrenia. The use of self-reported outcomes for evaluating treatment and managing care of these patients has been questioned. The aim of this study was to provide new evidence about the suitability of self-reported outcome for use in this specific population by exploring the internal structure, reliability and external validity of a specific quality of life (QoL) instrument, the Schizophrenia Quality of Life questionnaire (SQoL18). Design: cross-sectional study. Inclusion criteria: age over 18 years, diagnosis of schizophrenia according to the DSM-IV criteria. Data collection: sociodemographic (age, gender, and education level) and clinical data (duration of illness, Positive and Negative Syndrome Scale, Calgary Depression Scale for Schizophrenia); QoL (SQoL18); and executive performance (Stroop test, lexical and verbal fluency, and trail-making test). Non-impaired and impaired populations were defined for each of the three tests. For the six groups, psychometric properties were compared to those reported from the reference population assessed in the validation study. One hundred and thirteen consecutive patients were enrolled. The factor analysis performed in the impaired groups showed that the questionnaire structure adequately matched the initial structure of the SQoL18. The unidimensionality of the dimensions was preserved, and the internal/external validity indices were close to those of the non-impaired groups and the reference population. Our study suggests that executive dysfunction did not compromise the reliability or validity of a self-reported disease-specific QoL questionnaire. Copyright © 2013 Elsevier B.V. All rights reserved.
Ullrich-French, Sarah; González Hernández, Juan; Hidalgo Montesinos, María D
2017-02-01
Mindfulness is an increasingly popular construct with promise in enhancing multiple positive health outcomes. Physical activity is an important behavior for enhancing overall health, but no Spanish language scale exists to test how mindfulness during physical activity may facilitate physical activity motivation or behavior. This study examined the validity of a Spanish adaptation of a new scale, the State Mindfulness Scale for Physical Activity, to assess mindfulness during a specific experience of physical activity. Spanish youths (N = 502) completed a cross-sectional survey of state mindfulness during physical activity and physical activity motivation regulations based on Self-Determination Theory. A high-order model fit the data well and supports the use of one general state mindfulness factor or the use of separate subscales of mindfulness of mental (e.g., thoughts, emotions) and body (physical movement, muscles) aspects of the experience. Internal consistency reliability was good for the general scale and both sub-scales. The pattern of correlations with motivation regulations provides further support for construct validity, with significant and positive correlations with self-determined forms of motivation and significant and negative correlations with external regulation and amotivation. Initial validity evidence is promising for the use of the adapted measure.
Assessing the Generalizability of Randomized Trial Results to Target Populations
Stuart, Elizabeth A.; Bradshaw, Catherine P.; Leaf, Philip J.
2014-01-01
Recent years have seen increasing interest in and attention to evidence-based practices, where the “evidence” generally comes from well-conducted randomized trials. However, while those trials yield accurate estimates of the effect of the intervention for the participants in the trial (known as “internal validity”), they do not always yield relevant information about the effects in a particular target population (known as “external validity”). This may be due to a lack of specification of a target population when designing the trial, difficulties recruiting a sample that is representative of a pre-specified target population, or to interest in considering a target population somewhat different from the population directly targeted by the trial. This paper first provides an overview of existing design and analysis methods for assessing and enhancing the ability of a randomized trial to estimate treatment effects in a target population. It then provides a case study using one particular method, which weights the subjects in a randomized trial to match the population on a set of observed characteristics. The case study uses data from a randomized trial of School-wide Positive Behavioral Interventions and Supports (PBIS); our interest is in generalizing the results to the state of Maryland. In the case of PBIS, after weighting, estimated effects in the target population were similar to those observed in the randomized trial. The paper illustrates that statistical methods can be used to assess and enhance the external validity of randomized trials, making the results more applicable to policy and clinical questions. However, there are also many open research questions; future research should focus on questions of treatment effect heterogeneity and further developing these methods for enhancing external validity. 
Researchers should think carefully about the external validity of randomized trials and be cautious about extrapolating results to specific populations unless they are confident of the similarity between the trial sample and that target population. PMID:25307417
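The weighting idea described in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method: it assumes a single categorical covariate and uses plain post-stratification (population share divided by trial share) rather than a model-based weight. The strata, outcomes, and population shares below are hypothetical toy values.

```python
from collections import Counter


def poststratification_weights(trial_strata, population_shares):
    """Weight each trial subject by population share / trial share of its stratum."""
    n = len(trial_strata)
    trial_shares = {s: c / n for s, c in Counter(trial_strata).items()}
    return [population_shares[s] / trial_shares[s] for s in trial_strata]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_effect(outcomes, treated, weights):
    """Weighted mean outcome of treated subjects minus that of controls."""
    arms = {True: ([], []), False: ([], [])}
    for y, z, w in zip(outcomes, treated, weights):
        arms[bool(z)][0].append(y)
        arms[bool(z)][1].append(w)
    return weighted_mean(*arms[True]) - weighted_mean(*arms[False])


# Hypothetical toy trial: 6 urban and 2 rural subjects; the target population is 50/50.
strata = ["urban"] * 6 + ["rural"] * 2
treated = [1, 1, 1, 0, 0, 0, 1, 0]
outcomes = [12, 12, 12, 10, 10, 10, 16, 10]

weights = poststratification_weights(strata, {"urban": 0.5, "rural": 0.5})
unweighted = weighted_effect(outcomes, treated, [1.0] * len(outcomes))
reweighted = weighted_effect(outcomes, treated, weights)
print(unweighted, reweighted)
```

In this toy data the effect is larger in the under-represented rural stratum, so the reweighted estimate moves toward the rural effect. When effects are homogeneous across strata, the weighted and unweighted estimates coincide.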
External Validity in the Study of Human Development: Theoretical and Methodological Issues
ERIC Educational Resources Information Center
Hultsch, David F.; Hickey, Tom
1978-01-01
An examination of the concept of external validity from two theoretical perspectives: a traditional mechanistic approach and a dialectical organismic approach. Examines the theoretical and methodological implications of these perspectives. (BD)
Vachon, David D; Lynam, Donald R
2016-04-01
Low empathy is a criterion for most externalizing disorders, and empathy training is a regular component of treatment for aggressive people, from school bullies to sex offenders. However, recent meta-analytic evidence suggests that current measures of empathy explain only 1% of the variance in aggressive behavior. A new assessment of empathy was developed to more fully represent the empathy construct and better predict important outcomes--particularly aggressive behavior and externalizing psychopathology. Across three independent samples (N = 210-708), the 36-item Affective and Cognitive Measure of Empathy (ACME) was internally consistent, structurally reliable, and invariant across sex. The ACME bore significant associations to important outcomes, which were incremental relative to other measures of empathy and generalizable across sex. Importantly, the affective scales of the ACME--particularly a new "Affective Dissonance" scale--yielded moderate to strong associations with aggressive behavior and externalizing disorders. The ACME is a short, reliable, and useful measure of empathy. © The Author(s) 2015.
An evaluation of the construct of earned security in adolescents: evidence from an inpatient sample.
Venta, Amanda; Sharp, Carla; Shmueli-Goetz, Yael; Newlin, Elizabeth
2015-01-01
In adult attachment research, a group of individuals who convey secure attachments despite recalling difficult early caregiver relationships has been identified. The term earned security refers to individuals in this group, whereas continuous security refers to individuals who convey secure attachments and describe caring early relationships. Evidence on the validity of earned security in adults is mixed--with one longitudinal study showing that earned secure adults, despite contrary recollections, are actually more likely to have experienced positive caregiving than continuous secure adults. There is currently no evidence of earned security in adolescence, and exploring it in this age group may help shed light on the overall problem of the validity of this construct. Therefore, the broad aim of this study was to examine the construct of earned security in a group of inpatient adolescents. First, the authors aimed to identify a group of adolescents with secure attachments and memories of difficult caregiver relationships (i.e., proposed earned secure group) in a sample of 240 inpatient adolescents. Next, to explore external validity, the authors examined whether this group differed from others with regard to internalizing distress and emotion regulation. Findings indicated that a subset of secure adolescents recall difficult caregiving, as has been noted in adults, and that they differ from others with regard to emotion regulation. Despite this preliminary evidence that earned security can be identified in adolescents, the authors conclude with a discussion of the caveats of applying this construct in adolescents as well as adults.
Assessing Discriminative Performance at External Validation of Clinical Prediction Models
Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.
2016-01-01
Introduction: External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods: We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results: The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion: The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population.
To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
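The c-statistic referred to here can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that definition (ties counted as one half); the predicted risks and outcomes are hypothetical and the function name is an illustrative choice, not taken from the study.

```python
def c_statistic(predicted_risks, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher risk.

    Tied predictions count as half a concordant pair.
    """
    event_risks = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    nonevent_risks = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    concordant = 0.0
    for pe in event_risks:
        for pn in nonevent_risks:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(event_risks) * len(nonevent_risks))


# Hypothetical predictions for four patients, two of whom had the outcome.
risks = [0.10, 0.40, 0.35, 0.80]
events = [0, 0, 1, 1]
c_value = c_statistic(risks, events)
print(c_value)
```

Because the statistic depends on how far apart the risks of events and non-events lie, a narrower case-mix in a validation set can lower it even when the regression coefficients are correct, which is the distinction the abstract emphasizes.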
Evidence-Based School Behavior Assessment of Externalizing Behavior in Young Children.
Bagner, Daniel M; Boggs, Stephen R; Eyberg, Sheila M
2010-02-01
This study examined the psychometric properties of the Revised Edition of the School Observation Coding System (REDSOCS). Participants were 68 children ages 3 to 6 who completed parent-child interaction therapy for Oppositional Defiant Disorder as part of a larger efficacy trial. Interobserver reliability on REDSOCS categories was moderate to high, with percent agreement ranging from 47% to 90% (M = 67%) and Cohen's kappa coefficients ranging from .69 to .95 (M = .82). Convergent validity of the REDSOCS categories was supported by significant correlations with the Intensity Scale of the Sutter-Eyberg Student Behavior Inventory-Revised and related subscales of the Conners' Teacher Rating Scale-Revised: Long Version (CTRS-R: L). Divergent validity was indicated by nonsignificant correlations between REDSOCS categories and scales on the CTRS-R: L expected not to relate to disruptive classroom behavior. Treatment sensitivity was demonstrated for two of the three primary REDSOCS categories by significant pre to posttreatment changes. This study provides psychometric support for the designation of REDSOCS as an evidence-based assessment procedure for young children.
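For readers unfamiliar with the two interobserver indices reported above, here is a small sketch of how percent agreement and Cohen's kappa are computed for two observers coding the same intervals. The codes and data are hypothetical and unrelated to the REDSOCS categories.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of intervals on which both observers gave the same code."""
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers over ten observation intervals.
observer_a = ["on"] * 6 + ["off"] * 4
observer_b = ["on"] * 5 + ["off"] * 4 + ["on"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
print(agreement, round(kappa, 3))
```

Kappa is lower than raw agreement here because two observers who each code "on" 60% of the time would agree on about half of the intervals by chance alone.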
The use of mechanistic evidence in drug approval.
Aronson, Jeffrey K; La Caze, Adam; Kelly, Michael P; Parkkinen, Veli-Pekka; Williamson, Jon
2018-06-11
The role of mechanistic evidence tends to be under-appreciated in current evidence-based medicine (EBM), which focusses on clinical studies, tending to restrict attention to randomized controlled studies (RCTs) when they are available. The EBM+ programme seeks to redress this imbalance, by suggesting methods for evaluating mechanistic studies alongside clinical studies. Drug approval is a problematic case for the view that mechanistic evidence should be taken into account, because RCTs are almost always available. Nevertheless, we argue that mechanistic evidence is central to all the key tasks in the drug approval process: in drug discovery and development; assessing pharmaceutical quality; devising dosage regimens; assessing efficacy, harms, external validity, and cost-effectiveness; evaluating adherence; and extending product licences. We recommend that, when preparing for meetings in which any aspect of drug approval is to be discussed, mechanistic evidence should be systematically analysed and presented to the committee members alongside analyses of clinical studies. © 2018 The Authors Journal of Evaluation in Clinical Practice Published by John Wiley & Sons Ltd.
Balluerka, Nekane; Gorostiaga, Arantxa; Ulacia, Imanol
2014-11-14
Personal initiative characterizes people who are proactive, persistent and self-starting when facing the difficulties that arise in achieving goals. Despite its importance in the educational field, there is a scarcity of measures to assess students' personal initiative. Thus, the aim of the present study was to develop a questionnaire to assess this variable in the academic environment and to validate it for adolescents and young adults. The sample comprised 244 vocational training students. The questionnaire showed a factor structure including three factors (Proactivity-Prosocial behavior, Persistence and Self-Starting) with acceptable indices of internal consistency (ranging between α = .57 and α = .73) and good convergent validity with respect to the Self-Reported Initiative scale. Evidence of external validity was also obtained based on the relationships between personal initiative and variables such as self-efficacy, enterprising attitude, responsibility and control aspirations, conscientiousness, and academic achievement. The results indicate that this new measure is very useful for assessing personal initiative among vocational training students.
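The internal consistency coefficients (α) quoted above are Cronbach's alpha values. As a hedged illustration of the formula only, the sketch below computes alpha from per-item score lists using sample variances; the item scores are hypothetical and far smaller than any real questionnaire sample.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores)
    totals = [sum(respondent) for respondent in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: three items answered by four respondents.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why short subscales with loosely related items tend to produce lower values.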
Walenkamp, Monique M J; Bentohami, Abdelali; Slaar, Annelie; Beerekamp, M S H Suzan; Maas, Mario; Jager, L C Cara; Sosef, Nico L; van Velde, Romuald; Ultee, Jan M; Steyerberg, Ewout W; Goslings, J C Carel; Schep, Niels W L
2016-01-01
Although only 39% of patients with wrist trauma have sustained a fracture, the majority of patients are routinely referred for radiography. The purpose of this study was to derive and externally validate a clinical decision rule that selects patients with acute wrist trauma in the Emergency Department (ED) for radiography. This multicenter prospective study consisted of three components: (1) derivation of a clinical prediction model for detecting wrist fractures in patients following wrist trauma; (2) external validation of this model; and (3) design of a clinical decision rule. The study was conducted in the EDs of five Dutch hospitals: one academic hospital (derivation cohort) and four regional hospitals (external validation cohort). We included all adult patients with acute wrist trauma. The main outcome was fracture of the wrist (distal radius, distal ulna or carpal bones) diagnosed on conventional X-rays. A total of 882 patients were analyzed; 487 in the derivation cohort and 395 in the validation cohort. We derived a clinical prediction model with eight variables: age; sex; swelling of the wrist; swelling of the anatomical snuffbox; visible deformation; distal radius tender to palpation; pain on radial deviation; and painful axial compression of the thumb. The Area Under the Curve at external validation of this model was 0.81 (95% CI: 0.77-0.85). The sensitivity and specificity of the Amsterdam Wrist Rules (AWR) in the external validation cohort were 98% (95% CI: 95-99%) and 21% (95% CI: 15-28%). The negative predictive value was 90% (95% CI: 81-99%). The Amsterdam Wrist Rules is a clinical prediction rule with a high sensitivity and negative predictive value for fractures of the wrist. Although external validation showed low specificity and 100% sensitivity could not be achieved, the Amsterdam Wrist Rules can provide physicians in the Emergency Department with a useful screening tool to select patients with acute wrist trauma for radiography.
The upcoming implementation study will further reveal the impact of the Amsterdam Wrist Rules on the anticipated reduction of X-rays requested, missed fractures, Emergency Department waiting times and health care costs.
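Sensitivity, specificity, and negative predictive value, as reported for the decision rule above, all derive from the same two-by-two table of rule result against fracture status. The sketch below shows the arithmetic; the counts are hypothetical and are not the study's data.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of rule result versus true status.

    tp: rule positive, fracture present    fp: rule positive, no fracture
    fn: rule negative, fracture present    tn: rule negative, no fracture
    """
    return {
        "sensitivity": tp / (tp + fn),  # fractures correctly flagged
        "specificity": tn / (tn + fp),  # non-fractures correctly cleared
        "npv": tn / (tn + fn),          # rule-negative patients truly fracture-free
    }


# Hypothetical counts for 100 patients, 50 of whom have a fracture.
summary = diagnostic_summary(tp=45, fp=30, fn=5, tn=20)
print(summary)
```

A screening rule of this kind trades specificity for sensitivity: few fractures are missed, at the cost of still referring many patients without a fracture for radiography.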
Witt, Edward A.; Donnellan, M. Brent; Blonigen, Daniel M.; Krueger, Robert F.; Conger, Rand D.
2009-01-01
This report provides evidence for the reliability, validity, and developmental course of the psychopathic personality traits of Fearless Dominance (FD) and Impulsive Antisociality (IA) as assessed by items from Multidimensional Personality Questionnaire (MPQ; Patrick, Curtin, & Tellegen, 2002). In Study 1, MPQ-based measures of FD and IA were strongly correlated with their corresponding composite scores from the Psychopathic Personality Inventory-Revised (Lilienfeld & Widows, 2005). In Study 2, FD and IA had relatively distinct associations with measures of normal and maladaptive personality traits. In Study 3, FD and IA had substantial retest coefficients during the transition to adulthood and both traits showed average declines with an especially substantial drop in IA. In Study 4, FD and IA were correlated with measures of internalizing and externalizing problems in ways consistent with previous research and theory. Collectively, these results provide important information about the assessment of FD and IA. PMID:19365767
Beliefs about language development: construct validity evidence.
Donahue, Mavis L; Fu, Qiong; Smith, Everett V
2012-01-01
Understanding language development is incomplete without recognizing children's sociocultural environments, including adult beliefs about language development. Yet there is a need for data supporting valid inferences to assess these beliefs. The current study investigated the psychometric properties of data from a survey (MODeL) designed to explore beliefs in the popular culture, and their alignment with more formal theories. Support for the content, substantive, structural, generalizability, and external aspects of construct validity of the data was investigated. Subscales representing Behaviorist, Cognitive, Nativist, and Sociolinguistic models were identified as dimensions of beliefs. More than half of the items showed a high degree of consensus, suggesting culturally-transmitted beliefs. Behaviorist ideas were most popular. Bilingualism and ethnicity were related to Cognitive and Sociolinguistic beliefs. Identifying these beliefs may clarify the nature of child-directed speech, and enable the design of language intervention programs that are congruent with family and cultural expectations.
Assessing child and adolescent pragmatic language competencies: toward evidence-based assessments.
Russell, Robert L; Grizzle, Kenneth L
2008-06-01
Using language appropriately and effectively in social contexts requires pragmatic language competencies (PLCs). Increasingly, deficits in PLCs are linked to child and adolescent disorders, including autism spectrum, externalizing, and internalizing disorders. As the role of PLCs expands in diagnosis and treatment of developmental psychopathology, psychologists and educators will need to appraise and select clinical and research PLC instruments for use in assessments and/or studies. To assist in this appraisal, 24 PLC instruments, containing 1,082 items, are assessed by addressing four questions: (1) Can PLC domains targeted by assessment items be reliably identified?, (2) What are the core PLC domains that emerge across the 24 instruments?, (3) Do PLC questionnaires and tests assess similar PLC domains?, and (4) Do the instruments achieve content, structural, diagnostic, and ecological validity? Results indicate that test and questionnaire items can be reliably categorized into PLC domains, that PLC domains featured in questionnaires and tests significantly differ, and that PLC instruments need empirical confirmation of their dimensional structure, content validity across all developmental age bands, and ecological validity. Progress in building a better evidence base for PLC assessments should be a priority in future research.
Martin, Kevin D; Amendola, Annunziato; Phisitkul, Phinit
2016-01-01
Purpose: Orthopedic education continues to move towards an evidence-based curriculum in order to comply with new residency accreditation mandates. There are currently three high fidelity arthroscopic virtual reality (VR) simulators available, each with multiple instructional modules and simulated arthroscopic procedures. The aim of the current study is to assess face validity, defined as the degree to which a procedure appears effective in terms of its stated aims, of three available VR simulators. Methods: Thirty subjects were recruited from a single orthopedic residency training program. Each subject completed one training session on each of the three leading VR arthroscopic simulators (ARTHRO mentor-Symbionix, ArthroS-Virtamed, and ArthroSim-Toltech). Each arthroscopic session involved simulator-specific modules. After training sessions, subjects completed a previously validated simulator questionnaire for face validity. Results: The median external appearances for the ARTHRO Mentor (9.3, range 6.7-10.0; p=0.0036) and ArthroS (9.3, range 7.3-10.0; p=0.0003) were statistically higher than for ArthroSim (6.7, range 3.3-9.7). There was no statistical difference in intraarticular appearance, instrument appearance, or user friendliness between the three groups. Most simulators reached an adequate proportion of sufficient scores for each category (≥70%), except for ARTHRO Mentor (intraarticular appearance, 50%; instrument appearance, 61.1%) and ArthroSim (external appearance, 50%; user friendliness, 68.8%). Conclusion: These results demonstrate that ArthroS has the highest overall face validity of the three current arthroscopic VR simulators. However, only external appearance for ArthroS reached statistical significance when compared to the other simulators. Additionally, each simulator had satisfactory intraarticular quality. This study helps further the understanding of VR simulation and necessary features for accurate arthroscopic representation.
These results also provide objective data for educators when selecting equipment that will best facilitate residency training. PMID:27528830
Zhang, Xin; Wu, Yuxia; Ren, Pengwei; Liu, Xueting; Kang, Deying
2015-10-30
To explore the relationship between the external validity and the internal validity of hypertension RCTs conducted in China. Comprehensive literature searches were performed in Medline, Embase, Cochrane Central Register of Controlled Trials (CCTR), CBMdisc (Chinese biomedical literature database), CNKI (China National Knowledge Infrastructure/China Academic Journals Full-text Database) and VIP (Chinese scientific journals database), and advanced search strategies were used to locate hypertension RCTs. The risk of bias in RCTs was assessed with a modified Jadad scale, and studies scoring 3 or more were included for the purpose of evaluating external validity. A data extraction form including 4 domains and 25 items was used to explore the relationship between external validity and internal validity. Statistical analyses were performed using SPSS software, version 21.0 (SPSS, Chicago, IL). A total of 226 hypertension RCTs were included in the final analysis. RCTs conducted in university affiliated hospitals (P < 0.001) or secondary/tertiary hospitals (P < 0.001) scored higher on internal validity. Multi-center studies (median = 4.0, IQR = 2.0) had higher internal validity scores than single-center studies (median = 3.0, IQR = 1.0) (P < 0.001). Funding-supported trials had better methodological quality (P < 0.001). In addition, the reporting of inclusion criteria was also associated with better internal validity (P = 0.004). Multivariate regression indicated that sample size, industry funding, quality of life (QOL) as an outcome measure, and a university affiliated hospital as the trial setting were statistically significant (P < 0.001, P < 0.001, P = 0.001, P = 0.006 respectively). Several components related to the external validity of RCTs are associated with internal validity, although the two do not stand in a simple relationship to each other.
Given the poor reporting, other possible links between the two need to be traced in future methodological research.
Paquet, Y; Scoffier, S; d'Arripe-Longueville, F
2016-10-01
In the field of health psychology, control has consistently been considered a protective factor. This protective role has also been highlighted in the domain of eating attitudes. However, current studies use the one-dimensional scale of Rotter or the multidimensional health locus of control scale, and no specific eating attitudes scale exists for the sport context. Moreover, social influence is given limited attention in previous scales. In line with recent work, the purpose of this study was to test the internal and external validity of a multidimensional locus of control scale of eating attitudes for athletes. One hundred and seventy-nine participants were recruited. A confirmatory factor analysis was conducted in order to test the internal validity of the scale. The external validity of the scale was tested in relation to eating attitudes. The internal validity of the scale was verified, as was the external validity, which confirmed the importance of taking social influences into consideration. Indeed, the 2 subscales "Trainers, friends" and "Parents, family" are respectively positively and negatively related to eating disorders. Copyright © 2016 L'Encéphale, Paris. Published by Elsevier Masson SAS. All rights reserved.
Hu, Alan Shiun Yew; Donohue, Peter O'; Gunnarsson, Ronny K; de Costa, Alan
2018-03-14
Valid and user-friendly prediction models for conversion to open cholecystectomy allow for proper planning prior to surgery. The Cairns Prediction Model (CPM) has been in use clinically in the original study site for the past three years, but has not been tested at other sites. A retrospective, single-centre study collected ultrasonic measurements and clinical variables along with conversion status from consecutive patients who underwent laparoscopic cholecystectomy from 2013 to 2016 in The Townsville Hospital, North Queensland, Australia. An area under the curve (AUC) was calculated to externally validate the CPM. Conversion was necessary in 43 (4.2%) out of 1035 patients. External validation showed an area under the curve of 0.87 (95% CI 0.82-0.93, p = 1.1 × 10⁻¹⁴). In comparison with most previously published models, which have an AUC of approximately 0.80 or less, the CPM has the highest AUC of all published prediction models both for internal and external validation. Crown Copyright © 2018. Published by Elsevier Inc. All rights reserved.
Wang, Wenyi; Kim, Marlene T.; Sedykh, Alexander
2015-01-01
Purpose: Experimental Blood–Brain Barrier (BBB) permeability models for drug molecules are expensive and time-consuming. As alternative methods, several traditional Quantitative Structure-Activity Relationship (QSAR) models have been developed previously. In this study, we aimed to improve the predictivity of traditional QSAR BBB permeability models by employing relevant public bio-assay data in the modeling process. Methods: We compiled a BBB permeability database consisting of 439 unique compounds from various resources. The database was split into a modeling set of 341 compounds and a validation set of 98 compounds. A consensus QSAR modeling workflow was employed on the modeling set to develop various QSAR models. A five-fold cross-validation approach was used to validate the developed models, and the resulting models were used to predict the external validation set compounds. Furthermore, we used previously published membrane transporter models to generate relevant transporter profiles for target compounds. The transporter profiles were used as additional biological descriptors to develop hybrid QSAR BBB models. Results: The consensus QSAR models have R² = 0.638 for five-fold cross-validation and R² = 0.504 for external validation. The consensus model developed by pooling chemical and transporter descriptors showed better predictivity (R² = 0.646 for five-fold cross-validation and R² = 0.526 for external validation). Moreover, several external bio-assays that correlate with BBB permeability were identified using our automatic profiling tool. Conclusions: The BBB permeability models developed in this study can be useful for early evaluation of new compounds (e.g., new drug candidates). The combination of chemical and biological descriptors shows a promising direction to improve the current traditional QSAR models. PMID:25862462
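The five-fold cross-validation R² reported above is computed from predictions made for each compound by a model that never saw that compound during fitting. The sketch below illustrates the procedure with a one-descriptor least-squares line standing in for the QSAR models; the descriptor values, the activity values, and the interleaved fold assignment are all hypothetical assumptions, not the study's workflow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def cross_validated_r2(xs, ys, k=5):
    """R-squared of out-of-fold predictions pooled over k interleaved folds."""
    predictions = [0.0] * len(xs)
    for fold in range(k):
        held_out = set(range(fold, len(xs), k))
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        for i in held_out:
            predictions[i] = slope * xs[i] + intercept
    return r_squared(ys, predictions)


# Hypothetical descriptor and activity values with a small alternating error term.
descriptor = [float(i) for i in range(20)]
activity = [2 * x + 1 + (1 if i % 2 else -1) for i, x in enumerate(descriptor)]
cv_r2 = cross_validated_r2(descriptor, activity)
print(round(cv_r2, 3))
```

External validation follows the same scoring step, except that the model is fitted once on the whole modeling set and scored on compounds held back from the start, which typically gives a lower R² than cross-validation, as in the abstract.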
Veldhuijzen van Zanten, Sophie E M; Lane, Adam; Heymans, Martijn W; Baugh, Joshua; Chaney, Brooklyn; Hoffman, Lindsey M; Doughman, Renee; Jansen, Marc H A; Sanchez, Esther; Vandertop, William P; Kaspers, Gertjan J L; van Vuurden, Dannis G; Fouladi, Maryam; Jones, Blaise V; Leach, James
2017-08-01
We aimed to perform external validation of the recently developed survival prediction model for diffuse intrinsic pontine glioma (DIPG), and discuss its utility. The DIPG survival prediction model was developed in a cohort of patients from the Netherlands, United Kingdom and Germany, registered in the SIOPE DIPG Registry, and includes age <3 years, longer symptom duration and receipt of chemotherapy as favorable predictors, and presence of ring-enhancement on MRI as unfavorable predictor. Model performance was evaluated by analyzing the discrimination and calibration abilities. External validation was performed using an unselected cohort from the International DIPG Registry, including patients from United States, Canada, Australia and New Zealand. Basic comparison with the results of the original study was performed using descriptive statistics, and univariate- and multivariable regression analyses in the validation cohort. External validation was assessed following a variety of analyses described previously. Baseline patient characteristics and results from the regression analyses were largely comparable. Kaplan-Meier curves of the validation cohort reproduced separated groups of standard (n = 39), intermediate (n = 125), and high-risk (n = 78) patients. This discriminative ability was confirmed by similar values for the hazard ratios across these risk groups. The calibration curve in the validation cohort showed a symmetric underestimation of the predicted survival probabilities. In this external validation study, we demonstrate that the DIPG survival prediction model has acceptable cross-cohort calibration and is able to discriminate patients with short, average, and increased survival. We discuss how this clinico-radiological model may serve a useful role in current clinical practice.
Prediction models for successful external cephalic version: a systematic review.
Velzel, Joost; de Hundt, Marcella; Mulder, Frederique M; Molkenboer, Jan F M; Van der Post, Joris A M; Mol, Ben W; Kok, Marjolein
2015-12-01
To provide an overview of existing prediction models for successful external cephalic version (ECV), and to assess their quality, development and performance. We searched MEDLINE, EMBASE and the Cochrane Library to identify all articles reporting on prediction models for successful ECV published from inception to January 2015. We extracted information on study design, sample size, model-building strategies and validation. We evaluated the phases of model development and summarized their performance in terms of discrimination, calibration and clinical usefulness. We collected different predictor variables together with their defined significance, in order to identify important predictor variables for successful ECV. We identified eight articles reporting on seven prediction models. All models were subjected to internal validation. Only one model was also validated in an external cohort. Two prediction models had a low overall risk of bias, of which only one showed promising predictive performance at internal validation. This model also completed the phase of external validation. The impact on clinical practice was not evaluated for any of the models. The most important predictor variables for successful ECV described in the selected articles were parity, placental location, breech engagement and the fetal head being palpable. One model was assessed using discrimination and calibration using internal (AUC 0.71) and external validation (AUC 0.64), while two other models were assessed with discrimination and calibration, respectively. We found one prediction model for breech presentation that was validated in an external cohort and had acceptable predictive performance. This model should be used to counsel women considering ECV. Copyright © 2015. Published by Elsevier Ireland Ltd.
2014-01-01
Background. Evidence rankings do not consider equally internal (IV), external (EV), and model validity (MV) for clinical studies including complementary and alternative medicine/integrative medicine (CAM/IM) research. This paper describes this model and offers an EV assessment tool (EVAT©) for weighing studies according to EV and MV in addition to IV. Methods. An abbreviated systematic review methodology was employed to search, assemble, and evaluate the literature that has been published on EV/MV criteria. Standard databases were searched for keywords relating to EV, MV, and bias-scoring from inception to Jan 2013. Tools identified and concepts described were pooled to assemble a robust tool for evaluating these quality criteria. Results. This study assembled a streamlined, objective tool for evaluating the quality of EV/MV research that is more sensitive to CAM/IM research. Conclusion. Improved reporting on EV can help produce and provide information that will help guide policy makers, public health researchers, and other scientists in their selection, development, and improvement of research-tested interventions. Overall, clinical studies with high EV have the potential to provide the most useful information about “real-world” consequences of health interventions. It is hoped that this novel tool, which considers IV, EV, and MV on equal footing, will better guide clinical decision making. PMID:24734111
Scientific Reporting: Raising the Standards.
McLeroy, Kenneth R; Garney, Whitney; Mayo-Wilson, Evan; Grant, Sean
2016-10-01
This article is based on a presentation that was made at the 2014 annual meeting of the editorial board of Health Education & Behavior. The article addresses critical issues related to standards of scientific reporting in journals, including concerns about external and internal validity and reporting bias. It reviews current reporting guidelines, effects of adopting guidelines, and offers suggestions for improving reporting. The evidence about the effects of guideline adoption and implementation is briefly reviewed. Recommendations for adoption and implementation of appropriate guidelines, including considerations for journals, are provided. © 2016 Society for Public Health Education.
Digital pathology in clinical use: where are we now and what is holding us back?
Griffin, Jon; Treanor, Darren
2017-01-01
Whole slide imaging is being used increasingly in research applications and in frozen section, consultation and external quality assurance practice. Digital pathology, when integrated with other digital tools such as barcoding, specimen tracking and digital dictation, can be integrated into the histopathology workflow, from specimen accession to report sign-out. These elements can bring about improvements in the safety, quality and efficiency of a histopathology department. The present paper reviews the evidence for these benefits. We then discuss the challenges of implementing a fully digital pathology workflow, including the regulatory environment, validation of whole slide imaging and the evidence for the design of a digital pathology workstation. © 2016 John Wiley & Sons Ltd.
Consumer preferences for food product quality attributes from Swedish agriculture.
Carlsson, Fredrik; Frykblom, Peter; Lagerkvist, Carl Johan
2005-06-01
This paper employs a choice experiment to obtain consumer preferences and willingness to pay for food product quality attributes currently not available in Sweden. Data were obtained from a large mail survey and estimated with a random parameter logit model. We found evidence for intraproduct differences in consumer preferences for identical attributes, as well as interproduct discrepancies in ranking of attributes. Furthermore, we found evidence of a market failure relating to the potential use of genetically modified animal fodder. Finally, we found support for the idea that a cheap-talk script can alleviate problems of external validity of choice experiments. Our results are useful in forming product differentiation strategies within the food industry, as well as for the formation of food policy.
Impact of External Cue Validity on Driving Performance in Parkinson's Disease
Scally, Karen; Charlton, Judith L.; Iansek, Robert; Bradshaw, John L.; Moss, Simon; Georgiou-Karistianis, Nellie
2011-01-01
This study sought to investigate the impact of external cue validity on simulated driving performance in 19 Parkinson's disease (PD) patients and 19 healthy age-matched controls. Braking points and distance between deceleration point and braking point were analysed for red traffic signals preceded either by Valid Cues (correctly predicting signal), Invalid Cues (incorrectly predicting signal), and No Cues. Results showed that PD drivers braked significantly later and travelled significantly further between deceleration and braking points compared with controls for Invalid and No-Cue conditions. No significant group differences were observed for driving performance in response to Valid Cues. The benefit of Valid Cues relative to Invalid Cues and No Cues was significantly greater for PD drivers compared with controls. Trail Making Test (B-A) scores correlated with driving performance for PDs only. These results highlight the importance of external cues and higher cognitive functioning for driving performance in mild to moderate PD. PMID:21789275
Kastner, Rebecca M; Sellbom, Martin; Lilienfeld, Scott O
2012-03-01
The Psychopathic Personality Inventory (PPI) has shown promising construct validity as a measure of psychopathy. Because of its relative efficiency, a short-form version of the PPI (PPI-SF) was developed and has proven useful in many psychopathy studies. The validity of the PPI-SF, however, has not been thoroughly examined, and no studies have directly compared the validity of the short form with that of the full-length version. The current study was designed to compare the psychometric properties of both PPI versions, with an emphasis on convergent and discriminant validity in predicting external criteria conceptually relevant to psychopathy. We used both prison (n = 558) and college samples (n = 322) for this investigation. PPI scale scores were more reliable and more strongly correlated with the conceptually relevant criterion measures compared with the PPI-SF, particularly in the prison sample. There were no differences in relative discriminant validity. Thus, overall, the PPI full-length version showed more evidence of construct validity than did the short form, and the consequences of this psychometric difference should be considered when evaluating the clinical utility of each measure.
NASA Astrophysics Data System (ADS)
Campbell, Chad Edward
Over the past decade, hundreds of studies have introduced genomics and bioinformatics (GB) curricula and laboratory activities at the undergraduate level. While these publications have facilitated the teaching and learning of cutting-edge content, there has yet to be an evaluation of these assessment tools to determine if they are meeting the quality control benchmarks set forth by the educational research community. An analysis of these assessment tools indicated that <10% referenced any quality control criteria and that none of the assessments met more than one of the quality control benchmarks. In the absence of evidence that these benchmarks had been met, it is unclear whether these assessment tools are capable of generating valid and reliable inferences about student learning. To remedy this situation the development of a robust GB assessment aligned with the quality control benchmarks was undertaken in order to ensure evidence-based evaluation of student learning outcomes. Content validity is a central piece of construct validity, and it must be used to guide instrument and item development. This study reports on: (1) the correspondence of content validity evidence gathered from independent sources; (2) the process of item development using this evidence; (3) the results from a pilot administration of the assessment; (4) the subsequent modification of the assessment based on the pilot administration results and; (5) the results from the second administration of the assessment. Twenty-nine different subtopics within GB (Appendix B: Genomics and Bioinformatics Expert Survey) were developed based on preliminary GB textbook analyses. These subtopics were analyzed using two methods designed to gather content validity evidence: (1) a survey of GB experts (n=61) and (2) a detailed content analyses of GB textbooks (n=6). 
By including only the subtopics that were shown to have robust support across these sources, 22 GB subtopics were established for inclusion in the assessment. An expert panel subsequently developed, evaluated, and revised two multiple-choice items to align with each of the 22 subtopics, producing a final item pool of 44 items. These items were piloted with student samples of varying content exposure levels. Both Classical Test Theory (CTT) and Item Response Theory (IRT) methodologies were used to evaluate the assessment's validity, reliability and ability inferences, and its ability to differentiate students with different magnitudes of content exposure. A total of 18 items were subsequently modified and reevaluated by an expert panel. The 26 original and 18 modified items were once again piloted with student samples of varying content exposure levels. Both CTT and IRT methodologies were once again used to evaluate student responses in order to evaluate the assessment's validity and reliability inferences as well as its ability to differentiate students with different magnitudes of content exposure. Interviews with students from different content exposure levels were also performed in order to gather convergent validity evidence (external validity evidence) as well as substantive validity evidence. Also included are the limitations of the assessment and a set of guidelines on how the assessment can best be used.
Oliveira, Thaís D; Costa, Danielle de S; Albuquerque, Maicon R; Malloy-Diniz, Leandro F; Miranda, Débora M; de Paula, Jonas J
2018-06-11
The Parenting Styles and Dimensions Questionnaire (PSDQ) is used worldwide to assess three styles (authoritative, authoritarian, and permissive) and seven dimensions of parenting. In this study, we adapted the short version of the PSDQ for use in Brazil and investigated its validity and reliability. Participants were 451 mothers of children aged 3 to 18 years, though sample size varied with analyses. The translation and adaptation of the PSDQ followed a rigorous methodological approach. Then, we investigated the content, criterion, and construct validity of the adapted instrument. The scale content validity index (S-CVI) was considered adequate (0.97). There was evidence of internal validity, with the PSDQ dimensions showing strong correlations with their higher-order parenting styles. Confirmatory factor analysis endorsed the three-factor, second-order solution (i.e., three styles consisting of seven dimensions). The PSDQ showed convergent validity with the validated Brazilian version of the Parenting Styles Inventory (Inventário de Estilos Parentais - IEP), as well as external validity, as it was associated with several instruments measuring sociodemographic and behavioral/emotional-problem variables. The PSDQ is an effective and reliable psychometric instrument to assess childrearing strategies according to Baumrind's model of parenting styles.
Yahya, Noorazrul; Ebert, Martin A; Bulsara, Max; Kennedy, Angel; Joseph, David J; Denham, James W
2016-08-01
Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunctions following radiotherapy of the prostate. Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed. 14 models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. 4 models had ROC >0.6. Shrinkage was required for all predictive models' coefficients ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients. Predictive models vary in performance and transferability illustrating the need for improvements in model development and reporting. Several models showed reasonable potential but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
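The discrimination measure reported above, the area under the ROC curve, can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that calculation, not the authors' code; the function name `auc` and the toy scores are illustrative assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risks; labels: 1 = event observed, 0 = no event.
    Ties between a positive and a negative case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four patients (not study data).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(auc(scores, labels))                   # ranking mostly correct
print(auc([1 - s for s in scores], labels))  # inverted ranking falls below 0.5
print(auc([0.5] * 4, labels))                # uninformative model
```

A value below 0.5, such as the 0.473 at the low end of the reported range, indicates predictions that rank patients worse than chance in the validation cohort.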
2014-01-01
Background Cost-effectiveness analyses (CEAs) that use patient-specific data from a randomized controlled trial (RCT) are popular, yet such CEAs are criticized because they neglect to incorporate evidence external to the trial. A popular method for quantifying uncertainty in a RCT-based CEA is the bootstrap. The objective of the present study was to further expand the bootstrap method of RCT-based CEA for the incorporation of external evidence. Methods We utilize the Bayesian interpretation of the bootstrap and derive the distribution for the cost and effectiveness outcomes after observing the current RCT data and the external evidence. We propose simple modifications of the bootstrap for sampling from such posterior distributions. Results In a proof-of-concept case study, we use data from a clinical trial and incorporate external evidence on the effect size of treatments to illustrate the method in action. Compared to the parametric models of evidence synthesis, the proposed approach requires fewer distributional assumptions, does not require explicit modeling of the relation between external evidence and outcomes of interest, and is generally easier to implement. A drawback of this approach is potential computational inefficiency compared to the parametric Bayesian methods. Conclusions The bootstrap method of RCT-based CEA can be extended to incorporate external evidence, while preserving its appealing features such as no requirement for parametric modeling of cost and effectiveness outcomes. PMID:24888356
Sadatsafavi, Mohsen; Marra, Carlo; Aaron, Shawn; Bryan, Stirling
2014-06-03
Cost-effectiveness analyses (CEAs) that use patient-specific data from a randomized controlled trial (RCT) are popular, yet such CEAs are criticized because they neglect to incorporate evidence external to the trial. A popular method for quantifying uncertainty in a RCT-based CEA is the bootstrap. The objective of the present study was to further expand the bootstrap method of RCT-based CEA for the incorporation of external evidence. We utilize the Bayesian interpretation of the bootstrap and derive the distribution for the cost and effectiveness outcomes after observing the current RCT data and the external evidence. We propose simple modifications of the bootstrap for sampling from such posterior distributions. In a proof-of-concept case study, we use data from a clinical trial and incorporate external evidence on the effect size of treatments to illustrate the method in action. Compared to the parametric models of evidence synthesis, the proposed approach requires fewer distributional assumptions, does not require explicit modeling of the relation between external evidence and outcomes of interest, and is generally easier to implement. A drawback of this approach is potential computational inefficiency compared to the parametric Bayesian methods. The bootstrap method of RCT-based CEA can be extended to incorporate external evidence, while preserving its appealing features such as no requirement for parametric modeling of cost and effectiveness outcomes.
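One way to picture the approach is sketched below. It is a simplified illustration under stated assumptions, not the authors' implementation: each Bayesian-bootstrap replicate reweights trial patients with Dirichlet(1, ..., 1) weights, and each replicate is then weighted by how well its treatment effect agrees with the external evidence, here assumed to be summarised as a normal distribution. All function names, the toy outcomes and the external mean and standard deviation are hypothetical.

```python
import math
import random


def dirichlet_weights(n, rng):
    """One set of Bayesian-bootstrap weights: a Dirichlet(1, ..., 1) draw."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights))


def bayesian_bootstrap_effects(treat, ctrl, n_draws=2000, seed=0):
    """Replicates of the mean outcome difference (treatment minus control)."""
    rng = random.Random(seed)
    effects = []
    for _ in range(n_draws):
        w_t = dirichlet_weights(len(treat), rng)
        w_c = dirichlet_weights(len(ctrl), rng)
        effects.append(weighted_mean(treat, w_t) - weighted_mean(ctrl, w_c))
    return effects


def external_weights(effects, ext_mean, ext_sd):
    """Unnormalised normal likelihood of the external evidence per replicate."""
    return [math.exp(-0.5 * ((e - ext_mean) / ext_sd) ** 2) for e in effects]


def posterior_mean(effects, weights):
    return sum(e * w for e, w in zip(effects, weights)) / sum(weights)


# Hypothetical per-patient effectiveness outcomes (not trial data).
treat = [0.6, 0.7, 0.8, 0.9, 0.75]
ctrl = [0.4, 0.5, 0.6, 0.7, 0.55]
effects = bayesian_bootstrap_effects(treat, ctrl)
trial_only = sum(effects) / len(effects)
combined = posterior_mean(effects, external_weights(effects, 0.6, 0.2))
print(trial_only, combined)
```

The same replicate weights could be applied to cost outcomes computed from the same draws, which is what keeps the approach free of a parametric model for costs and effects. Weighting replicates rather than accepting or rejecting them is a choice made here for brevity.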
Matsumoto, David; Yoo, Seung Hee; Hirayama, Satoko; Petrova, Galina
2005-03-01
As one component of emotion regulation, display rules, which reflect the regulation of expressive behavior, have been the topic of many studies. Despite their theoretical and empirical importance, however, to date there is no measure of display rules that assesses a full range of behavioral responses that are theoretically possible when emotion is elicited. This article reports the development of a new measure of display rules that surveys 5 expressive modes: expression, deamplification, amplification, qualification, and masking. Two studies provide evidence for its internal and temporal reliability and for its content, convergent, discriminant, external, and concurrent predictive validity. Additionally, Study 1, involving American, Russian, and Japanese participants, demonstrated predictable cultural differences on each of the expressive modes. Copyright 2005 APA, all rights reserved.
Problem-solving style and multicultural personality dispositions: a study of construct validity.
Houtz, John C; Ponterotto, Joseph G; Burger, Claudia; Marino, Cherylynn
2010-06-01
This exploratory study examined the relationship between problem-solving styles and multicultural personality dispositions among 91 graduate students enrolled in an urban university located in the northeast United States. Problem-solving style was assessed with the three dimensions of the VIEW: an Assessment of Problem Solving Style. Multicultural personality was assessed with the five-factor Multicultural Personality Questionnaire (MPQ); its factors of Cultural Empathy, Open-mindedness, Social Initiative, and Flexibility correlated significantly with Explorer and External problem-solving styles, as predicted. The Emotional Stability subscale also correlated significantly with scores on Explorer style, suggesting that individuals who prefer "thinking in new directions" in problem solving are more likely to report remaining calm under stressful situations. Collectively, study results provided additional evidence of construct validity for the VIEW.
Simulation models in population breast cancer screening: A systematic review.
Koleva-Kolarova, Rositsa G; Zhan, Zhuozhao; Greuter, Marcel J W; Feenstra, Talitha L; De Bock, Geertruida H
2015-08-01
The aim of this review was to critically evaluate published simulation models for breast cancer screening of the general population and provide a direction for future modeling. A systematic literature search was performed to identify simulation models with more than one application. A framework for qualitative assessment was developed which incorporated model type; input parameters; modeling approach, transparency of input data sources/assumptions, sensitivity analyses and risk of bias; validation; and outcomes. Predicted mortality reduction (MR) and cost-effectiveness (CE) were compared to estimates from meta-analyses of randomized control trials (RCTs) and acceptability thresholds. Seven original simulation models were distinguished, all sharing common input parameters. The modeling approach was based on tumor progression (except one model) with internal and cross validation of the resulting models, but without any external validation. Differences in lead times for invasive or non-invasive tumors, and the option for cancers not to progress were not explicitly modeled. The models tended to overestimate the MR due to screening (11-24%) compared with the 10% MR (95% CI -2 to 21%) estimated from optimal RCTs. Only recently, potential harms due to regular breast cancer screening were reported. Most scenarios resulted in acceptable cost-effectiveness estimates given current thresholds. The selected models have been repeatedly applied in various settings to inform decision making and the critical analysis revealed high risk of bias in their outcomes. Given the importance of the models, there is a need for externally validated models which use systematic evidence for input data to allow for more critical evaluation of breast cancer screening. Copyright © 2015 Elsevier Ltd. All rights reserved.
A bias-adjusted evidence synthesis of RCT and observational data: the case of total hip replacement.
Schnell-Inderst, Petra; Iglesias, Cynthia P; Arvandi, Marjan; Ciani, Oriana; Matteucci Gothe, Raffaella; Peters, Jaime; Blom, Ashley W; Taylor, Rod S; Siebert, Uwe
2017-02-01
Evaluation of clinical effectiveness of medical devices differs in some aspects from the evaluation of pharmaceuticals. One of the main challenges identified is lack of robust evidence and a will to make use of experimental and observational studies (OSs) in quantitative evidence synthesis accounting for internal and external biases. Using a case study of total hip replacement to compare the risk of revision of cemented and uncemented implant fixation modalities, we pooled treatment effect estimates from OS and RCTs, and simplified existing methods for bias-adjusted evidence synthesis to enhance practical application. We performed an elicitation exercise using methodological and clinical experts to determine the strength of beliefs about the magnitude of internal and external bias affecting estimates of treatment effect. We incorporated the bias-adjusted treatment effects into a generalized evidence synthesis, calculating both frequentist and Bayesian statistical models. We estimated relative risks as summary effect estimates with 95% confidence/credibility intervals to capture uncertainty. When we compared alternative approaches to synthesizing evidence, we found that the pooled effect size strongly depended on the inclusion of observational data as well as on the use of bias-adjusted estimates. We demonstrated the feasibility of using observational studies in meta-analyses to complement RCTs and incorporate evidence from a wider spectrum of clinically relevant studies and healthcare settings. To ensure internal validity, OS data require sufficient correction for confounding and selection bias, either through study design and primary analysis, or by applying post-hoc bias adjustments to the results. © 2017 The Authors. Health Economics published by John Wiley & Sons, Ltd.
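The general idea of bias-adjusted pooling can be illustrated with a small sketch. It assumes, for illustration only, that each study's bias is summarised as an additive shift on the log relative-risk scale with its own variance, and that adjusted estimates are pooled with a fixed-effect inverse-variance model. The abstract does not specify the model in this detail, so the function names and numbers below are hypothetical.

```python
import math


def bias_adjust(log_rr, var, bias_mean=0.0, bias_var=0.0):
    """Shift a log relative risk by the expected bias and widen its variance."""
    return log_rr - bias_mean, var + bias_var


def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (log_rr, variance) pairs."""
    weights = [1.0 / v for _, v in estimates]
    pooled = sum(w * e for w, (e, _) in zip(weights, estimates)) / sum(weights)
    return pooled, 1.0 / sum(weights)


def rr_with_ci(log_rr, var, z=1.96):
    """Relative risk with an approximate 95% interval."""
    se = math.sqrt(var)
    return math.exp(log_rr), math.exp(log_rr - z * se), math.exp(log_rr + z * se)


# Hypothetical inputs: one RCT and one observational study (not study data).
rct = (0.0, 0.1)                                        # (log RR, variance)
obs_raw = (0.2, 0.1)
obs_adj = bias_adjust(*obs_raw, bias_mean=0.2, bias_var=0.1)

print(rr_with_ci(*pool_fixed_effect([rct, obs_raw])))   # unadjusted pooling
print(rr_with_ci(*pool_fixed_effect([rct, obs_adj])))   # bias-adjusted pooling
```

Adding the bias variance down-weights the observational study, which is one mechanism by which a pooled estimate can depend on both the inclusion of observational data and the use of bias adjustment.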
Jessen, Marie K; Skibsted, Simon; Shapiro, Nathan I
2017-06-01
The aim of this study was to validate the association between number of organ dysfunctions and mortality in emergency department (ED) patients with suspected infection. This study was conducted at two medical care center EDs. The internal validation set was a prospective cohort study conducted in Boston, USA. The external validation set was a retrospective case-control study conducted in Aarhus, Denmark. The study included adult patients (>18 years) with clinically suspected infection. Laboratory results and clinical data were used to assess organ dysfunctions. Inhospital mortality was the outcome measure. Multivariate logistic regression was used to determine the independent mortality odds for number and types of organ dysfunctions. We enrolled 4952 (internal) and 483 (external) patients. The mortality rate significantly increased with increasing number of organ dysfunctions: internal validation: 0 organ dysfunctions: 0.5% mortality, 1: 3.6%, 2: 9.5%, 3: 17%, and 4 or more: 37%; external validation: 2.2, 6.7, 17, 41, and 57% mortality (both P<0.001 for trend). Age-adjusted and comorbidity-adjusted number of organ dysfunctions remained an independent predictor. The effect of specific types of organ dysfunction on mortality was most pronounced for hematologic [odds ratio (OR) 3.3 (95% confidence interval (CI) 2.0-5.4)], metabolic [OR 3.3 (95% CI 2.4-4.6); internal validation], and cardiovascular dysfunctions [OR 14 (95% CI 3.7-50); external validation]. The number of organ dysfunctions predicts sepsis mortality.
Walenkamp, Monique M J; Bentohami, Abdelali; Slaar, Annelie; Beerekamp, M Suzan H; Maas, Mario; Jager, L Cara; Sosef, Nico L; van Velde, Romuald; Ultee, Jan M; Steyerberg, Ewout W; Goslings, J Carel; Schep, Niels W L
2015-12-18
Although only 39 % of patients with wrist trauma have sustained a fracture, the majority of patients are routinely referred for radiography. The purpose of this study was to derive and externally validate a clinical decision rule that selects patients with acute wrist trauma in the Emergency Department (ED) for radiography. This multicenter prospective study consisted of three components: (1) derivation of a clinical prediction model for detecting wrist fractures in patients following wrist trauma; (2) external validation of this model; and (3) design of a clinical decision rule. The study was conducted in the EDs of five Dutch hospitals: one academic hospital (derivation cohort) and four regional hospitals (external validation cohort). We included all adult patients with acute wrist trauma. The main outcome was fracture of the wrist (distal radius, distal ulna or carpal bones) diagnosed on conventional X-rays. A total of 882 patients were analyzed; 487 in the derivation cohort and 395 in the validation cohort. We derived a clinical prediction model with eight variables: age, sex, swelling of the wrist, swelling of the anatomical snuffbox, visible deformation, distal radius tender to palpation, pain on radial deviation and painful axial compression of the thumb. The Area Under the Curve at external validation of this model was 0.81 (95 % CI: 0.77-0.85). The sensitivity and specificity of the Amsterdam Wrist Rules (AWR) in the external validation cohort were 98 % (95 % CI: 95-99 %) and 21 % (95 % CI: 15-28 %). The negative predictive value was 90 % (95 % CI: 81-99 %). The Amsterdam Wrist Rules is a clinical prediction rule with a high sensitivity and negative predictive value for fractures of the wrist. Although external validation showed low specificity and 100 % sensitivity could not be achieved, the Amsterdam Wrist Rules can provide physicians in the Emergency Department with a useful screening tool to select patients with acute wrist trauma for radiography.
The upcoming implementation study will further reveal the impact of the Amsterdam Wrist Rules on the anticipated reduction of X-rays requested, missed fractures, Emergency Department waiting times and health care costs. This study was registered in the Dutch Trial Registry, reference number NTR2544 on October 1st, 2010.
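For readers less familiar with the accuracy measures quoted above, the sketch below shows how sensitivity, specificity and negative predictive value follow from a 2x2 table of decision-rule result against X-ray finding. The counts are invented for illustration and are not the study's data; the function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table.

    tp: rule positive, fracture present   fp: rule positive, no fracture
    fn: rule negative, fracture present   tn: rule negative, no fracture
    """
    return {
        "sensitivity": tp / (tp + fn),  # fractures correctly flagged
        "specificity": tn / (tn + fp),  # non-fractures correctly cleared
        "npv": tn / (tn + fn),          # cleared patients truly without fracture
    }


# Hypothetical counts (not from the study).
print(diagnostic_metrics(tp=90, fp=60, fn=10, tn=40))
```

A screening rule of this kind trades specificity for sensitivity: few fractures are missed, at the cost of still referring many patients without a fracture for radiography.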
DOE Office of Scientific and Technical Information (OSTI.GOV)
Louie, Alexander V., E-mail: Dr.alexlouie@gmail.com; Department of Radiation Oncology, London Regional Cancer Program, University of Western Ontario, London, Ontario; Department of Epidemiology, Harvard School of Public Health, Harvard University, Boston, Massachusetts
Purpose: A prognostic model for 5-year overall survival (OS), consisting of recursive partitioning analysis (RPA) and a nomogram, was developed for patients with early-stage non-small cell lung cancer (ES-NSCLC) treated with stereotactic ablative radiation therapy (SABR). Methods and Materials: A primary dataset of 703 ES-NSCLC SABR patients was randomly divided into a training (67%) and an internal validation (33%) dataset. In the former group, 21 unique parameters consisting of patient, treatment, and tumor factors were entered into an RPA model to predict OS. Univariate and multivariate models were constructed for RPA-selected factors to evaluate their relationship with OS. A nomogram for OS was constructed based on factors significant in multivariate modeling and validated with calibration plots. Both the RPA and the nomogram were externally validated in independent surgical (n=193) and SABR (n=543) datasets. Results: RPA identified 2 distinct risk classes based on tumor diameter, age, World Health Organization performance status (PS) and Charlson comorbidity index. This RPA had moderate discrimination in SABR datasets (c-index range: 0.52-0.60) but was of limited value in the surgical validation cohort. The nomogram predicting OS included smoking history in addition to RPA-identified factors. In contrast to RPA, validation of the nomogram performed well in internal validation (r²=0.97) and external SABR (r²=0.79) and surgical cohorts (r²=0.91). Conclusions: The Amsterdam prognostic model is the first externally validated prognostication tool for OS in ES-NSCLC treated with SABR available to individualize patient decision making. The nomogram retained strong performance across surgical and SABR external validation datasets. RPA performance was poor in surgical patients, suggesting that 2 different distinct patient populations are being treated with these 2 effective modalities.
Choudhry, Shahid A.; Li, Jing; Davis, Darcy; Erdmann, Cole; Sikka, Rishi; Sutariya, Bharat
2013-01-01
Introduction: Preventing the occurrence of hospital readmissions is needed to improve quality of care and foster population health across the care continuum. Hospitals are being held accountable for improving transitions of care to avert unnecessary readmissions. Advocate Health Care in Chicago and Cerner (ACC) collaborated to develop all-cause, 30-day hospital readmission risk prediction models to identify patients that need interventional resources. Ideally, prediction models should encompass several qualities: they should have high predictive ability; use reliable and clinically relevant data; use vigorous performance metrics to assess the models; be validated in populations where they are applied; and be scalable in heterogeneous populations. However, a systematic review of prediction models for hospital readmission risk determined that most performed poorly (average C-statistic of 0.66) and efforts to improve their performance are needed for widespread usage. Methods: The ACC team incorporated electronic health record data, utilized a mixed-method approach to evaluate risk factors, and externally validated their prediction models for generalizability. Inclusion and exclusion criteria were applied on the patient cohort and then split for derivation and internal validation. Stepwise logistic regression was performed to develop two predictive models: one for admission and one for discharge. The prediction models were assessed for discrimination ability, calibration, overall performance, and then externally validated. Results: The ACC Admission and Discharge Models demonstrated modest discrimination ability during derivation, internal and external validation post-recalibration (C-statistic of 0.76 and 0.78, respectively), and reasonable model fit during external validation for utility in heterogeneous populations. Conclusions: The ACC Admission and Discharge Models embody the design qualities of ideal prediction models. 
The ACC plans to continue its partnership to further improve and develop valuable clinical models. PMID:24224068
Does Rational Selection of Training and Test Sets Improve the Outcome of QSAR Modeling?
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external dataset, the best way to validate the predictive ability of a model is to perform its s...
Estimates of External Validity Bias When Impact Evaluations Select Sites Nonrandomly
ERIC Educational Resources Information Center
Bell, Stephen H.; Olsen, Robert B.; Orr, Larry L.; Stuart, Elizabeth A.
2016-01-01
Evaluations of educational programs or interventions are typically conducted in nonrandomly selected samples of schools or districts. Recent research has shown that nonrandom site selection can yield biased impact estimates. To estimate the external validity bias from nonrandom site selection, we combine lists of school districts that were…
What are the most effective risk-reduction strategies in sport concussion?
Benson, Brian W; McIntosh, Andrew S; Maddocks, David; Herring, Stanley A; Raftery, Martin; Dvorák, Jirí
2013-04-01
To critically review the evidence to determine the efficacy and effectiveness of protective equipment, rule changes, neck strength and legislation in reducing sport concussion risk. Electronic databases, grey literature and bibliographies were used to search the evidence using Medical Subject Headings and text words. Inclusion/exclusion criteria were used to select articles for the clinical equipment studies. The quality of evidence was assessed using epidemiological criteria regarding internal/external validity (eg, strength of design, sample size/power, bias and confounding). No new valid, conclusive evidence was provided to suggest the use of headgear in rugby, or mouth guards in American football, significantly reduced players' risk of concussion. No evidence was provided to suggest an association between neck strength increases and concussion risk reduction. There was evidence in ice hockey to suggest fair-play rules and eliminating body checking among 11- to 12-year-olds were effective injury prevention strategies. Evidence is lacking on the effects of legislation on concussion prevention. Equipment self-selection bias was a common limitation, as was the lack of measurement and control for potential confounding variables. Lastly, helmets need to be able to protect from impacts resulting in changes in head velocity of up to 10 and 7 m/s in professional American and Australian football, respectively, as well as reduce head resultant linear and angular acceleration to below 50 g and 1500 rad/s², respectively, to optimise their effectiveness. A multifactorial approach is needed for concussion prevention. Future well-designed and sport-specific prospective analytical studies of sufficient power are warranted.
Jonker, Simone J.; Menting, Theo P.; Warlé, Michiel C.; Ritskes-Hoitinga, Merel; Wever, Kimberley E.
2016-01-01
Background Renal ischemia-reperfusion injury (IRI) is a major cause of kidney damage after e.g. renal surgery and transplantation. Ischemic postconditioning (IPoC) is a promising treatment strategy for renal IRI, but early clinical trials have not yet replicated the promising results found in animal studies. Method We present a systematic review, quality assessment and meta-analysis of the preclinical evidence for renal IPoC, and identify factors which modify its efficacy. Results We identified 39 publications studying >250 control animals undergoing renal IRI only and >290 animals undergoing renal IRI and IPoC. Healthy, male rats undergoing warm ischemia were used in the vast majority of studies. Four studies applied remote IPoC, all others used local IPoC. Meta-analysis showed that both local and remote IPoC ameliorated renal damage after IRI for the outcome measures serum creatinine, blood urea nitrogen and renal histology. Subgroup analysis indicated that IPoC efficacy increased with the duration of index ischemia. Measures to reduce bias were insufficiently reported. Conclusion High efficacy of IPoC is observed in animal models, but factors pertaining to the internal and external validity of these studies may hamper the translation of IPoC to the clinical setting. The external validity of future animal studies should be increased by including females, comorbid animals, and transplantation models, in order to better inform clinical trial design. The severity of renal damage should be taken into account in the design and analysis of future clinical trials. PMID:26963819
Stevens, Andreas; Bahlo, Simone; Licha, Christina; Liske, Benjamin; Vossler-Thies, Elisabeth
2016-11-30
Subnormal performance in attention tasks may result from various sources including lack of effort. In this report, the derivation and validation of a performance validity parameter for reaction time is described, using a set of malingering-indices ("Slick-criteria"), and 3 independent samples of participants (total n =893). The Slick-criteria yield an estimate of the probability of malingering based on the presence of an external incentive, evidence from neuropsychological testing, from self-report and clinical data. In study (1) a validity parameter is derived using reaction time data of a sample, composed of inpatients with recent severe brain lesions not involved in litigation and of litigants with and without brain lesion. In study (2) the validity parameter is tested in an independent sample of litigants. In study (3) the parameter is applied to an independent sample comprising cooperative and non-cooperative testees. Logistic regression analysis led to a derived validity parameter based on median reaction time and standard deviation. It performed satisfactorily in studies (2) and (3) (study 2 sensitivity=0.94, specificity=1.00; study 3 sensitivity=0.79, specificity=0.87). The findings suggest that median reaction time and standard deviation may be used as indicators of negative response bias. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
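The sensitivity and specificity figures quoted in the abstract above come from cross-tabulating the validity parameter's classification against a reference classification. A minimal sketch of that arithmetic follows; the function name `sensitivity_specificity` and the example flags are hypothetical, not data from the study.

```python
def sensitivity_specificity(flagged, reference):
    """Return (sensitivity, specificity) for binary classifications.

    flagged   -- 1 if the index flags the case as invalid performance, else 0
    reference -- 1 if the reference criterion classifies the case as invalid
    """
    pairs = list(zip(flagged, reference))
    tp = sum(1 for f, r in pairs if f == 1 and r == 1)  # true positives
    fn = sum(1 for f, r in pairs if f == 0 and r == 1)  # false negatives
    tn = sum(1 for f, r in pairs if f == 0 and r == 0)  # true negatives
    fp = sum(1 for f, r in pairs if f == 1 and r == 0)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both positive and negative reference cases")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classifications for eight test takers.
flagged = [1, 1, 1, 0, 0, 0, 0, 1]
reference = [1, 1, 0, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(flagged, reference)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```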
Ortuño-Sierra, Javier; Aritio-Solana, Rebeca; Inchausti, Félix; Chocarro de Luis, Edurne; Lucas Molina, Beatriz; Pérez de Albéniz, Alicia; Fonseca-Pedrero, Eduardo
2017-01-01
The main purpose of the present study was to assess depressive symptomatology and to gather new validity evidence for the Reynolds Depression Scale-Short Form (RADS-SF) in a representative sample of youths. The sample consisted of 2914 adolescents with a mean age of 15.85 years (SD = 1.68). We calculated the descriptive statistics and internal consistency of the RADS-SF scores. Also, confirmatory factor analyses (CFAs) at the item level and successive multigroup CFAs to test measurement invariance were conducted. Latent mean differences across gender and educational level groups were estimated, and finally, we studied the sources of validity evidence with external variables. The internal consistency of the RADS-SF Total score, estimated with ordinal alpha, was .89. Results from CFAs showed that the one-dimensional model displayed appropriate goodness-of-fit indices, with a CFI value over .95 and an RMSEA value under .08. In addition, the results support the strong measurement invariance of the RADS-SF scores across gender and age. When latent means were compared, statistically significant differences were found by gender and age. Females scored 0.347 higher than males on the Depression latent variable, whereas older adolescents scored 0.111 higher than the younger group. In addition, the RADS-SF score was associated with the RADS scores. The results suggest that the RADS-SF could be used as an efficient screening test to assess self-reported depressive symptoms in adolescents from the general population.
Steppan, Martin; Kraus, Ludwig; Piontek, Daniela; Siciliano, Valeria
2013-01-01
Prevalence estimation of cannabis use is usually based on self-report data. Although there is evidence on the reliability of this data source, its cross-cultural validity is still a major concern. External objective criteria are needed for this purpose. In this study, cannabis-related search engine query data are used as an external criterion. Data on cannabis use were taken from the 2007 European School Survey Project on Alcohol and Other Drugs (ESPAD). Provincial data came from three Italian nation-wide studies using the same methodology (2006-2008; ESPAD-Italia). Information on cannabis-related search engine query data was based on Google search volume indices (GSI). (1) Reliability analysis was conducted for GSI. (2) Latent measurement models of "true" cannabis prevalence were tested using perceived availability, web-based cannabis searches and self-reported prevalence as indicators. (3) Structure models were set up to test the influences of response tendencies and geographical position (latitude, longitude). In order to test the stability of the models, analyses were conducted on country level (Europe, US) and on provincial level in Italy. Cannabis-related GSI were found to be highly reliable and constant over time. The overall measurement model was highly significant in both data sets. On country level, no significant effects of response bias indicators and geographical position on perceived availability, web-based cannabis searches and self-reported prevalence were found. On provincial level, latitude had a significant positive effect on availability indicating that perceived availability of cannabis in northern Italy was higher than expected from the other indicators. Although GSI showed weaker associations with cannabis use than perceived availability, the findings underline the external validity and usefulness of search engine query data as external criteria. 
The findings suggest an acceptable relative comparability of national (provincial) prevalence estimates of cannabis use that are based on a common survey methodology. Search engine query data are too weak an indicator to serve as the sole basis for prevalence estimates, but in combination with other sources (waste water analysis, sales of cigarette paper) they may provide satisfactory estimates. Copyright © 2012. Published by Elsevier B.V.
Evidence-Based School Behavior Assessment of Externalizing Behavior in Young Children
Bagner, Daniel M.; Boggs, Stephen R.; Eyberg, Sheila M.
2011-01-01
This study examined the psychometric properties of the Revised Edition of the School Observation Coding System (REDSOCS). Participants were 68 children ages 3 to 6 who completed parent-child interaction therapy for Oppositional Defiant Disorder as part of a larger efficacy trial. Interobserver reliability on REDSOCS categories was moderate to high, with percent agreement ranging from 47% to 90% (M = 67%) and Cohen’s kappa coefficients ranging from .69 to .95 (M = .82). Convergent validity of the REDSOCS categories was supported by significant correlations with the Intensity Scale of the Sutter-Eyberg Student Behavior Inventory-Revised and related subscales of the Conners’ Teacher Rating Scale-Revised: Long Version (CTRS-R: L). Divergent validity was indicated by nonsignificant correlations between REDSOCS categories and scales on the CTRS-R: L expected not to relate to disruptive classroom behavior. Treatment sensitivity was demonstrated for two of the three primary REDSOCS categories by significant pre to posttreatment changes. This study provides psychometric support for the designation of REDSOCS as an evidence-based assessment procedure for young children. PMID:21687781
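Percent agreement and Cohen's kappa, the two interobserver statistics named in the abstract above, differ in that kappa subtracts the agreement expected by chance from the observed agreement. A minimal sketch of both calculations follows, assuming two observers coding the same intervals; the function name `agreement_and_kappa` and the codes are hypothetical and not taken from REDSOCS data.

```python
from collections import Counter


def agreement_and_kappa(rater_a, rater_b):
    """Return (percent agreement as a proportion, Cohen's kappa)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must code the same non-empty set of intervals")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per code.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical interval codes from two observers ("on" = on-task).
obs_a = ["on", "on", "off", "off", "on", "off", "on", "on", "off", "on"]
obs_b = ["on", "off", "off", "off", "on", "off", "on", "on", "on", "on"]
agreement, kappa = agreement_and_kappa(obs_a, obs_b)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```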
ERIC Educational Resources Information Center
Pruett, Steven R.; Deiches, Jon; Pfaller, Joseph; Moser, Erin; Chan, Fong
2014-01-01
Objective: To determine the factorial validity of the Internal and External Motivation to Respond without Prejudice toward People with Disabilities Scale (D-IMS/EMS). Design: A quantitative descriptive design using factor analysis. Participants: 233 rehabilitation counseling and rehabilitation services students. Results: Both exploratory and…
ODD and ADHD Symptoms in Ukrainian Children: External Validators and Comorbidity
ERIC Educational Resources Information Center
Drabick, Deborah A. G.; Gadow, Kenneth D.; Carlson, Gabrielle A.; Bromet, Evelyn J.
2004-01-01
Objective: To examine potential external validators for oppositional defiant disorder (ODD) and attention-deficit/hyperactivity disorder (ADHD) symptoms in a Ukrainian community-based sample of 600 children age 10 to 12 years old and evaluate the nature of co-occurring ODD and ADHD symptoms using mother- and teacher-defined groups. Method: In…
NASA Astrophysics Data System (ADS)
Dutton, Gregory
Forensic science is a collection of applied disciplines that draws from all branches of science. A key question in forensic analysis is: to what degree do a piece of evidence and a known reference sample share characteristics? Quantification of similarity, estimation of uncertainty, and determination of relevant population statistics are of current concern. A 2016 PCAST report questioned the foundational validity and the validity in practice of several forensic disciplines, including latent fingerprints, firearms comparisons and DNA mixture interpretation. One recommendation was the advancement of objective, automated comparison methods based on image analysis and machine learning. These concerns parallel the National Institute of Justice's ongoing R&D investments in applied chemistry, biology and physics. NIJ maintains a funding program spanning fundamental research with potential for forensic application to the validation of novel instruments and methods. Since 2009, NIJ has funded over $179M in external research to support the advancement of accuracy, validity and efficiency in the forensic sciences. An overview of NIJ's programs will be presented, with examples of relevant projects from fluid dynamics, 3D imaging, acoustics, and materials science.
Thill, Azure Welborn; Bachanas, Pamela; Garber, Judy; Miller, Karen Bearman; Abad, Mona; Bruno, Elizabeth Franks; Carter, Jocelyn Smith; David-Ferdon, Corinne; Jandasek, Barbara; Mennuti-Washburn, Jean E.; O’Mahar, Kerry; Zukerman, Jill
2008-01-01
Objective To provide an evidence-based review of measures of psychosocial adjustment and psychopathology, with a specific focus on their use in the field of pediatric psychology. Methods As part of a larger survey of pediatric psychologists from the Society of Pediatric Psychology e-mail listserv (American Psychological Association, APA, Division 54), 37 measures were selected for this psychometric review. Measures that qualified for the review fell into one of the following three categories: (a) internalizing or externalizing rating scales, (b) broad-band rating scales, and (c) self-related rating scales. Results Psychometric characteristics (i.e., three types of reliability, two types of validity) were strong for the majority of measures reviewed, with 34 of the 37 measures meeting “well-established” evidence-based assessment (EBA) criteria. Strengths and weaknesses of existing measures were noted. Conclusions Recommendations for future work in this area of assessment are presented, including suggestions that more fine-grained EBA criteria be developed and that evidence-based “profiles” be devised for each measure. PMID:17728305
Schriver, Michael; Cubaka, Vincent Kalumire; Vedsted, Peter; Besigye, Innocent; Kallestrup, Per
2018-01-01
External supervision of primary health care facilities to monitor and improve services is common in low-income countries. Currently there are no tools to measure the quality of support in external supervision in these countries. To develop a provider-reported instrument to assess the support delivered through external supervision in Rwanda and other countries. "External supervision: Provider Evaluation of Supervisor Support" (ExPRESS) was developed in 18 steps, primarily in Rwanda. Content validity was optimised using systematic search for related instruments, interviews, translations, and relevance assessments by international supervision experts as well as local experts in Nigeria, Kenya, Uganda and Rwanda. Construct validity and reliability were examined in two separate field tests, the first using exploratory factor analysis and a test-retest design, the second for confirmatory factor analysis. We included 16 items in section A ('The most recent experience with an external supervisor'), and 13 items in section B ('The overall experience with external supervisors'). Item-content validity index was acceptable. In field test I, test-retest had acceptable kappa values and exploratory factor analysis suggested relevant factors in sections A and B used for model hypotheses. In field test II, models were tested by confirmatory factor analysis fitting a 4-factor model for section A, and a 3-factor model for section B. ExPRESS is a promising tool for evaluation of the quality of support of primary health care providers in external supervision of primary health care facilities in resource-constrained settings. ExPRESS may be used as specific feedback to external supervisors to help identify and address gaps in the supervision they provide. Further studies should determine optimal interpretation of scores and the number of respondents needed per supervisor to obtain precise results, as well as test the functionality of section B.
Study design and "evidence" in patient-oriented research.
Concato, John
2013-06-01
Individual studies in patient-oriented research, whether described as "comparative effectiveness" or using other terms, are based on underlying methodological designs. A simple taxonomy of study designs includes randomized controlled trials on the one hand, and observational studies (such as case series, cohort studies, and case-control studies) on the other. A rigid hierarchy of these design types is a fairly recent phenomenon, promoted as a tenet of "evidence-based medicine," with randomized controlled trials receiving gold-standard status in terms of producing valid results. Although randomized trials have many strengths, and contribute substantially to the evidence base in clinical care, making presumptions about the quality of a study based solely on category of research design is unscientific. Both the limitations of randomized trials as well as the strengths of observational studies tend to be overlooked when a priori assumptions are made. This essay presents an argument in support of a more balanced approach to evaluating evidence, and discusses representative examples from the general medical as well as pulmonary and critical care literature. The simultaneous consideration of validity (whether results are correct "internally") and generalizability (how well results apply to "external" populations) is warranted in assessing whether a study's results are accurate for patients likely to receive the intervention-examining the intersection of clinical and methodological issues in what can be called a medicine-based evidence approach. Examination of cause-effect associations in patient-oriented research should recognize both the strengths and limitations of randomized trials as well as observational studies.
Assessing Jail Inmates’ Proneness to Shame and Guilt: Feeling Bad About the Behavior or the Self?
Tangney, June P.; Stuewig, Jeffrey; Mashek, Debra; Hastings, Mark
2011-01-01
This study of 550 jail inmates (379 male and 171 female) held on felony charges examines the reliability and validity of the Test of Self Conscious Affect – Socially Deviant Version (TOSCA-SD; Hanson & Tangney, 1996) as a measure of offenders’ proneness to shame and proneness to guilt. Discriminant validity (e.g., vis-à-vis self-esteem, negative affect, social desirability/impression management) and convergent validity (e.g., vis-à-vis correlations with empathy, externalization of blame, anger, psychological symptoms, and substance use problems) were supported, paralleling results from community samples. Further, proneness to shame and guilt were differentially related to widely used risk measures from the field of criminal justice (e.g., criminal history, psychopathy, violence risk, antisocial personality). Guilt-proneness appears to be a protective factor, whereas there was no evidence that shame-proneness serves an inhibitory function. Subsequent analyses indicate these findings generalize quite well across gender and race. Implications for intervention and sentencing practices are discussed. PMID:21743757
Evaluating the spoken English proficiency of graduates of foreign medical schools.
Boulet, J R; van Zanten, M; McKinley, D W; Gary, N E
2001-08-01
The purpose of this study was to gather additional evidence for the validity and reliability of spoken English proficiency ratings provided by trained standardized patients (SPs) in high-stakes clinical skills examination. Over 2500 candidates who took the Educational Commission for Foreign Medical Graduates' (ECFMG) Clinical Skills Assessment (CSA) were studied. The CSA consists of 10 or 11 timed clinical encounters. Standardized patients evaluate spoken English proficiency and interpersonal skills in every encounter. Generalizability theory was used to estimate the consistency of spoken English ratings. Validity coefficients were calculated by correlating summary English ratings with CSA scores and other external criterion measures. Mean spoken English ratings were also compared by various candidate background variables. The reliability of the spoken English ratings, based on 10 independent evaluations, was high. The magnitudes of the associated variance components indicated that the evaluation of a candidate's spoken English proficiency is unlikely to be affected by the choice of cases or SPs used in a given assessment. Proficiency in spoken English was related to native language (English versus other) and scores from the Test of English as a Foreign Language (TOEFL). The pattern of the relationships, both within assessment components and with external criterion measures, suggests that valid measures of spoken English proficiency are obtained. This result, combined with the high reproducibility of the ratings over encounters and SPs, supports the use of trained SPs to measure spoken English skills in a simulated medical environment.
Chen, Po-Yi; Yang, Chien-Ming; Morin, Charles M
2015-05-01
The purpose of this study is to examine the factor structure of the Insomnia Severity Index (ISI) across samples recruited from different countries. We tried to identify the most appropriate factor model for the ISI and further examined the measurement invariance property of the ISI across samples from different countries. Our analyses included one data set collected from a Taiwanese sample and two data sets obtained from samples in Hong Kong and Canada. The data set collected in Taiwan was analyzed with ordinal exploratory factor analysis (EFA) to obtain the appropriate factor model for the ISI. After that, we conducted a series of confirmatory factor analyses (CFAs), which is a special case of the structural equation model (SEM) that concerns the parameters in the measurement model, to the statistics collected in Canada and Hong Kong. The purposes of these CFA were to cross-validate the result obtained from EFA and further examine the cross-cultural measurement invariance of the ISI. The three-factor model outperforms other models in terms of global fit indices in Taiwan's population. Its external validity is also supported by confirmatory factor analyses. Furthermore, the measurement invariance analyses show that the strong invariance property between the samples from different cultures holds, providing evidence that the ISI results obtained in different cultures are comparable. The factorial validity of the ISI is stable in different populations. More importantly, its invariance property across cultures suggests that the ISI is a valid measure of the insomnia severity construct across countries. Copyright © 2014 Elsevier B.V. All rights reserved.
Vuong, Kylie; Armstrong, Bruce K; Weiderpass, Elisabete; Lund, Eiliv; Adami, Hans-Olov; Veierod, Marit B; Barrett, Jennifer H; Davies, John R; Bishop, D Timothy; Whiteman, David C; Olsen, Catherine M; Hopper, John L; Mann, Graham J; Cust, Anne E; McGeechan, Kevin
2016-08-01
Identifying individuals at high risk of melanoma can optimize primary and secondary prevention strategies. To develop and externally validate a risk prediction model for incident first-primary cutaneous melanoma using self-assessed risk factors. We used unconditional logistic regression to develop a multivariable risk prediction model. Relative risk estimates from the model were combined with Australian melanoma incidence and competing mortality rates to obtain absolute risk estimates. A risk prediction model was developed using the Australian Melanoma Family Study (629 cases and 535 controls) and externally validated using 4 independent population-based studies: the Western Australia Melanoma Study (511 case-control pairs), Leeds Melanoma Case-Control Study (960 cases and 513 controls), Epigene-QSkin Study (44 544, of which 766 with melanoma), and Swedish Women's Lifestyle and Health Cohort Study (49 259 women, of which 273 had melanoma). We validated model performance internally and externally by assessing discrimination using the area under the receiver operating curve (AUC). Additionally, using the Swedish Women's Lifestyle and Health Cohort Study, we assessed model calibration and clinical usefulness. The risk prediction model included hair color, nevus density, first-degree family history of melanoma, previous nonmelanoma skin cancer, and lifetime sunbed use. On internal validation, the AUC was 0.70 (95% CI, 0.67-0.73). On external validation, the AUC was 0.66 (95% CI, 0.63-0.69) in the Western Australia Melanoma Study, 0.67 (95% CI, 0.65-0.70) in the Leeds Melanoma Case-Control Study, 0.64 (95% CI, 0.62-0.66) in the Epigene-QSkin Study, and 0.63 (95% CI, 0.60-0.67) in the Swedish Women's Lifestyle and Health Cohort Study. Model calibration showed close agreement between predicted and observed numbers of incident melanomas across all deciles of predicted risk. 
In the external validation setting, there was higher net benefit when using the risk prediction model to classify individuals as high risk compared with classifying all individuals as high risk. The melanoma risk prediction model performs well and may be useful in prevention interventions reliant on a risk assessment using self-assessed risk factors.
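The net benefit comparison described in the abstract above is commonly computed with the decision-curve formula: true positives per person minus false positives per person, the latter weighted by the odds of the chosen risk threshold. The abstract does not state the exact procedure used, so the following is only a sketch of that standard formula; the function names, the threshold and the data are hypothetical.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of classifying as high risk everyone with risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # odds of the threshold probability
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of classifying every individual as high risk."""
    return net_benefit([1.0] * len(outcomes), outcomes, threshold)


# Hypothetical predicted risks and observed outcomes (1 = melanoma).
risks = [0.05, 0.10, 0.15, 0.30, 0.45, 0.60, 0.08, 0.70]
outcomes = [0, 0, 0, 1, 0, 1, 0, 1]
nb_model = net_benefit(risks, outcomes, threshold=0.25)
nb_all = net_benefit_treat_all(outcomes, threshold=0.25)
print(f"model={nb_model:.3f}, treat-all={nb_all:.3f}")
```

In this toy example the model-based strategy has the higher net benefit because it avoids most of the false positives that the classify-everyone strategy incurs.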
Richard's, María M; Introzzi, Isabel; Zamora, Eliana; Vernucci, Santiago
2017-01-01
Inhibition is one of the main executive functions because of its fundamental role in cognitive and social development. Given the importance of reliable, computerized measurements for assessing inhibitory performance, this research analyzes the internal and external validity criteria of a computerized conjunction search task designed to evaluate the role of perceptual inhibition. A sample of 41 children (21 females and 20 males), aged between 6 and 11 years old (M = 8.49, SD = 1.47), purposively selected from a privately managed school in Mar del Plata (Argentina) and of middle socio-economic level, was assessed. The Conjunction Search Task from the TAC Battery and the Coding and Symbol Search tasks from the Wechsler Intelligence Scale for Children were used. Overall, the results allow us to confirm that the perceptual inhibition task from the TAC presents solid indices of internal and external validity that make it a valid instrument for measuring this process.
[Clinical and empirical findings with the OPD-CA].
Winter, Sibylle; Jelen, Anna; Pressel, Christine; Lenz, Klaus; Lehmkuhl, Ulrike
2011-01-01
Sixty clinical patients (5-17 years) were diagnosed using the OPD-CA interview manual (Winter, 2004). To assess clinical validity, patients with internal (N=17) and external disorders (N=19) were compared. The group comparison yielded indications of clinical validity, especially for the axes "conflict" and "prerequisites for treatment". Patients with internal disorders showed the conflict desire for care versus autarchy significantly more often than patients with external disorders. On the other hand, patients with external disorders displayed the conflict submission versus control significantly more often. Significant differences were also found for the axis "prerequisites for treatment". Patients with internal disorders had better "prerequisites for treatment" in the domains experience of illness and prerequisites for therapy. For the axes "interpersonal relation", "structure" and "prerequisites for treatment", satisfactory data for validity and reliability were found. The clinical validity points to the usefulness of the OPD-CA manual for psychodynamic diagnostics in childhood and adolescence.
ERIC Educational Resources Information Center
Kong, Anthony Pak-Hin
2011-01-01
Purpose: The 1st aim of this study was to further establish the external validity of the main concept (MC) analysis by examining its relationship with the Cantonese Linguistic Communication Measure (CLCM; Kong, 2006; Kong & Law, 2004)--an established quantitative system for narrative production--and the Cantonese version of the Western Aphasia…
External Validity of Childhood Disintegrative Disorder in Comparison with Autistic Disorder
ERIC Educational Resources Information Center
Kurita, Hiroshi; Osada, Hirokazu; Miyake, Yuko
2004-01-01
To examine the external validity of DSM-IV childhood disintegrative disorder (CDD), 10 children (M = 8.2 yrs) with CDD and 152 gender- and age-matched children with autistic disorder (AD) were compared on 24 variables. The CDD children had a significantly higher rate of epilepsy, significantly less uneven intellectual functioning, and a tendency…
Shiovitz-Ezra, Sharon; Leitsch, Sara; Graber, Jessica; Karraker, Amelia
2009-11-01
The National Social Life, Health, and Aging Project (NSHAP) measures seven indicators of quality of life (QoL) and psychological health. The measures used for happiness, self-esteem, depression, and loneliness are well established in the literature. Conversely, measures of anxiety, stress, and self-reported emotional health were modified for their use in this unique project. The purpose of this paper is to provide (a) an overview of NSHAP's QoL assessment and (b) evidence for the adequacy of the modified measures. First, we examined the psychometric properties of the modified measures. Second, the established QoL measures were used to examine the concurrent validity of the modified measures. Finally, gender- and age-group differences were examined for each modified measure. The anxiety index exhibited good internal reliability and concurrent validity. Consistent with the literature, a single-factor structure best fit the data. Stress was satisfactory in terms of concurrent validity but with only fair internal consistency. Self-reported emotional health exhibited good concurrent validity and moderate external validity. The modified indices used in NSHAP tended to exhibit good internal reliability and concurrent validity. These measures can confidently be used in the exploration of QoL and psychological health in later life and its many correlates.
Lupano Perugini, María Laura; de la Iglesia, Guadalupe; Castro Solano, Alejandro; Keyes, Corey Lee M.
2017-01-01
The present research aimed at studying the psychometric properties of the Mental Health Continuum–Short Form (MHC–SF; Keyes, 2005) in a sample of 1,300 Argentinean adults (50% males; 50% females). Their mean age was 40.28 years old (SD = 13.59). The MHC–SF is a 14-item test that assesses three components (i.e., emotional, social, and psychological) of well-being. Convergent and divergent evidence of construct validity was assessed by conducting confirmatory factor analysis, cross-validation, factorial invariance, and correlations with external criteria. Internal consistency was studied using Cronbach’s alphas. Results indicated an adequate fit of a three-dimensional model. This structure was also confirmed, and was invariant across sex and age. The emotional well-being scores converged with life satisfaction and positive affect measures; the psychological well-being scale had a positive association with the presence of meaning in life; and the social well-being scores showed a positive and strong correlation with an external measure of well-being. Also, all scores were negatively associated with negative affect, search for meaning in life, and presence of depression symptoms. Internal consistency was .89 for the MHC–SF. Furthermore, the findings supported the two-continua model of mental health. PMID:28344677
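Cronbach's alpha, the internal-consistency coefficient named in the abstract above, compares the sum of the individual item variances with the variance of the total score. A minimal sketch of the standard formula follows, using only the Python standard library; the function name `cronbach_alpha` and the response matrix are hypothetical and not MHC–SF data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same k >= 2 item scores")
    items = list(zip(*responses))  # one tuple of scores per item
    sum_item_var = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


# Hypothetical scores of five respondents on a three-item scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 5],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```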
Ogurtsova, Katherine; Heise, Thomas L; Linnenkamp, Ute; Dintsios, Charalabos-Markos; Lhachimi, Stefan K; Icks, Andrea
2017-12-29
Type 2 diabetes mellitus (T2DM), a highly prevalent chronic disease, puts a large burden on individual health and health care systems. Computer simulation models, used to evaluate the clinical and economic effectiveness of various interventions to handle T2DM, have become a well-established tool in diabetes research. Despite the broad consensus about the general importance of validation, especially external validation, as a crucial instrument of assessing and controlling for the quality of these models, there are no systematic reviews comparing such validation of diabetes models. As a result, the main objectives of this systematic review are to identify and appraise the different approaches used for the external validation of existing models covering the development and progression of T2DM. We will perform adapted searches by applying respective search strategies to identify suitable studies from 14 electronic databases. Retrieved study records will be included or excluded based on predefined eligibility criteria as defined in this protocol. Among others, a publication filter will exclude studies published before 1995. We will run abstract and full text screenings and then extract data from all selected studies by filling in a predefined data extraction spreadsheet. We will undertake a descriptive, narrative synthesis of findings to address the study objectives. We will pay special attention to aspects of quality of these models in regard to the external validation based upon ISPOR and ADA recommendations as well as Mount Hood Challenge reports. All critical stages within the screening, data extraction and synthesis processes will be conducted by at least two authors. This protocol adheres to PRISMA and PRISMA-P standards. The proposed systematic review will provide a broad overview of the current practice in the external validation of models with respect to T2DM incidence and progression in humans built on simulation techniques. PROSPERO CRD42017069983.
Mullen, Patricia Dolan; Savas, Lara S; Bundy, Łucja T; Haardörfer, Regine; Hovell, Mel; Fernández, Maria E; Monroy, Jo Ann A; Williams, Rebecca S; Kreuter, Matthew W; Jobe, David; Kegler, Michelle C
2016-10-01
Replication of intervention research is reported infrequently, limiting what we know about external validity and generalisability. The Smoke Free Homes Program, a minimal intervention, increased home smoking bans by United Way 2-1-1 callers in randomised controlled trials in Atlanta, Georgia and North Carolina. Test the programme's generalisability-external validity in a different context. A randomised controlled trial (n=508) of English-speaking callers from smoking-discordant households (≥1 smoker and ≥1 non-smoker). 2-1-1 Texas/United Way HELPLINE call specialists serving the Texas Gulf Coast recruited callers and delivered three mailings and one coaching call, supported by an online tracking system. Data collectors, blind to study assignment, conducted telephone interviews 3 and 6 months postbaseline. At 3 months, more intervention households reported a smoke-free home (46.6% vs 25.4%, p<0.0001; growth model intent-to-treat OR=1.48, 95% CI 1.241 to 1.772, p<0.0001). At 6 months, self-reported full bans were 62.9% for intervention participants and 38.4% for controls (OR=2.19). Texas trial participants were predominantly women (83%), single-smoker households (76%) and African-American (65%); half had incomes ≤US$10 000/year (50%). Texas recruitment was <50% of the other sites. Fewer callers reported having a smoker in the household. Almost twice the callers with a household smoker declined interest in the programme/study. Our findings in a region with lower smoking rates and more diverse callers, including English-speaking Latinos, support programme generalisability and convey evidence of external validity. Our recruitment experience indicates that site-specific adjustments might improve recruitment efficiency and reach. NCT02097914, Results. Published by the BMJ Publishing Group Limited.
Mullen, Patricia Dolan; Savas, Lara S; Bundy, Łucja T; Haardörfer, Regine; Hovell, Mel; Fernández, Maria E; Monroy, Jo Ann A; Williams, Rebecca S; Kreuter, Matthew W; Jobe, David; Kegler, Michelle C
2016-01-01
Background Replication of intervention research is reported infrequently, limiting what we know about external validity and generalisability. The Smoke Free Homes Program, a minimal intervention, increased home smoking bans by United Way 2-1-1 callers in randomised controlled trials in Atlanta, Georgia and North Carolina. Objective Test the programme's generalisability-external validity in a different context. Methods A randomised controlled trial (n=508) of English-speaking callers from smoking-discordant households (≥1 smoker and ≥1 non-smoker). 2-1-1 Texas/United Way HELPLINE call specialists serving the Texas Gulf Coast recruited callers and delivered three mailings and one coaching call, supported by an online tracking system. Data collectors, blind to study assignment, conducted telephone interviews 3 and 6 months postbaseline. Results At 3 months, more intervention households reported a smoke-free home (46.6% vs 25.4%, p<0.0001; growth model intent-to-treat OR=1.48, 95% CI 1.241 to 1.772, p<0.0001). At 6 months, self-reported full bans were 62.9% for intervention participants and 38.4% for controls (OR=2.19). Texas trial participants were predominantly women (83%), single-smoker households (76%) and African-American (65%); half had incomes ≤US$10 000/year (50%). Texas recruitment was <50% of the other sites. Fewer callers reported having a smoker in the household. Almost twice the callers with a household smoker declined interest in the programme/study. Conclusions Our findings in a region with lower smoking rates and more diverse callers, including English-speaking Latinos, support programme generalisability and convey evidence of external validity. Our recruitment experience indicates that site-specific adjustments might improve recruitment efficiency and reach. Trial registration number NCT02097914, Results. PMID:27697943
Hazell, Lorna; Raschi, Emanuel; De Ponti, Fabrizio; Thomas, Simon H L; Salvo, Francesco; Ahlberg Helgee, Ernst; Boyer, Scott; Sturkenboom, Miriam; Shakir, Saad
2017-05-01
A systematic review was performed to categorize the hERG (human ether-a-go-go-related gene) liability of antihistamines, antipsychotics, and anti-infectives and to compare it with current clinical risk of torsade de pointes (TdP). Eligible studies were hERG assays reporting half-minimal inhibitory concentrations (IC50). A "hERG safety margin" was calculated from the IC50 divided by the peak human plasma concentration (free Cmax). A margin below 30 defined hERG liability. Each drug was assigned an "uncertainty score" based on volume, consistency, precision, and internal and external validity of evidence. The hERG liability was compared to existing knowledge on TdP risk (www.credibledrugs.org). Of 1828 studies, 82 were eligible, allowing calculation of safety margins for 61 drugs. Thirty-one drugs (51%) had evidence of hERG liability including 6 with no previous mention of TdP risk (eg, desloratadine, lopinavir). Conversely, 16 drugs (26%) had no evidence of hERG liability including 6 with known, or at least conditional or possible, TdP risk (eg, chlorpromazine, sulpiride). The main sources of uncertainty were the validity of the experimental conditions used (antihistamines and antipsychotics) and nonuse of reference compounds (anti-infectives). In summary, hERG liability was categorized for 3 widely used drug classes, incorporating a qualitative assessment of the strength of available evidence. Some concordance with TdP risk was observed, although several drugs had hERG liability without evidence of clinical risk and vice versa. This may be due to gaps in clinical evidence, limitations of hERG/Cmax data, or other patient/drug-specific factors that contribute to real-life TdP risk. © 2016, The American College of Clinical Pharmacology.
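The safety-margin rule described in this abstract (IC50 divided by free Cmax, with a margin below 30 taken as hERG liability) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the example numbers are invented rather than taken from the review, and both concentrations are assumed to be expressed in the same unit.

```python
def herg_safety_margin(ic50: float, free_cmax: float) -> float:
    """Return IC50 divided by the free peak plasma concentration.

    Assumption: both arguments use the same concentration unit.
    """
    if free_cmax <= 0:
        raise ValueError("free_cmax must be positive")
    return ic50 / free_cmax


def has_herg_liability(margin: float, threshold: float = 30.0) -> bool:
    """A margin below the threshold (30 in the abstract) flags hERG liability."""
    return margin < threshold


# Hypothetical values for illustration only, not data from the review.
example_margin = herg_safety_margin(ic50=3.0, free_cmax=0.25)
print(example_margin, has_herg_liability(example_margin))
```

The strict "below 30" comparison follows the wording of the abstract; a margin of exactly 30 is therefore not flagged in this sketch.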
ERIC Educational Resources Information Center
Lanyon, Richard I.; Carle, Adam C.
2007-01-01
The internal and external validity of scores on the two-scale Balanced Inventory of Desirable Responding (BIDR) and its recent revision, the Paulhus Deception Scales (PDS), developed to measure two facets of social desirability, were studied with three groups of forensic clients and two groups of college undergraduates (total N = 519). The two…
Translation and validation of the German version of the Bournemouth Questionnaire for Neck Pain.
Soklic, Marina; Peterson, Cynthia; Humphreys, B Kim
2012-01-25
Clinical outcome measures are important tools to monitor patient improvement during treatment as well as to document changes for research purposes. The short-form Bournemouth questionnaire for neck pain patients (BQN) was developed from the biopsychosocial model and measures pain, disability, cognitive and affective domains. It has been shown to be a valid and reliable outcome measure in English, French and Dutch and more sensitive to change compared to other questionnaires. The purpose of this study was to translate and validate a German version of the Bournemouth questionnaire for neck pain patients. German translation and back translation into English of the BQN was done independently by four persons and overseen by an expert committee. Face validity of the German BQN was tested on 30 neck pain patients in a single chiropractic practice. Test-retest reliability was evaluated on 31 medical students and chiropractors before and after a lecture. The German BQN was then assessed on 102 first time neck pain patients at two chiropractic practices for internal consistency, external construct validity, external longitudinal construct validity and sensitivity to change compared to the German versions of the Neck Disability Index (NDI) and the Neck Pain and Disability Scale (NPAD). Face validity testing led to minor changes to the German BQN. The Intraclass Correlation Coefficient for the test-retest reliability was 0.99. The internal consistency was strong for all 7 items of the BQN with Cronbach α's of .79 and .80 for the pre and post-treatment total scores. External construct validity and external longitudinal construct validity using Pearson's correlation coefficient showed statistically significant correlations for all 7 scales of the BQN with the other questionnaires. The German BQN showed greater responsiveness compared to the other questionnaires for all scales.
The German BQN is a valid and reliable outcome measure that has been successfully translated and culturally adapted. It is shorter, easier to use, and more responsive to change than the NDI and NPAD.
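Internal consistency in this abstract is reported as Cronbach's α. As a rough sketch of how that coefficient is commonly computed, the function below uses the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name and the toy scores are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item questionnaire answered by three respondents.
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy_scores), 3))
```

Values near 0.8, as reported for the German BQN, are conventionally read as good internal consistency, though the cut-offs are rules of thumb.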
[Spanish version of the Multidimensional health locus of control scale in nursing students].
Tomás-Sábado, Joaquín; Montes-Hidalgo, Javier
2016-01-01
To determine the preliminary psychometric properties of the Spanish form of the Multidimensional Health Locus of Control Scale (MHLC), which consists of three subscales: (1) Internality, (2) Powerful other externality, and (3) Chance externality. It also aims to study the relationship that internal/external health control beliefs have with self-esteem, self-efficacy and perceived competence in a sample of nursing undergraduates. An observational and cross-sectional study including 109 nursing students who completed an anonymous questionnaire containing the demographic variables and the Spanish versions of the MHLC, the Rosenberg Self-Esteem Scale, the General Self-Efficacy Scale, and the Perceived Personal Competence Scale. Cronbach's alpha coefficients of 0.713 for Internality, 0.665 for Chance and 0.728 for Powerful other were obtained. The test-retest correlation for the 18 items of the MHLC was 0.866. The Internality subscale was positively and significantly correlated with self-efficacy and competence. By contrast, chance externality had negative and significant correlations with self-esteem and competence. There were no significant gender differences in any of the subscales. Younger subjects showed a greater tendency to external attribution. Factor analysis confirmed the three-factor hypothesis. The results suggest that the Spanish form of the MHLC has adequate construct validity and acceptable metric properties. They also provide evidence of the relationship between the attribution of health-related internal control and perceived well-being and confidence in one's own skills and abilities. Copyright © 2016 Elsevier España, S.L.U. All rights reserved.
3D Simulation of External Flooding Events for the RISMC Pathway
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prescott, Steven; Mandelli, Diego; Sampath, Ramprasad
2015-09-01
Incorporating 3D simulations as part of the Risk-Informed Safety Margins Characterization (RISMC) Toolkit allows analysts to obtain a more complete picture of complex system behavior for events including external plant hazards. External events such as flooding have become more important recently; however, these can be analyzed with existing and validated simulated physics toolkits. In this report, we describe these approaches specific to flooding-based analysis using an approach called Smoothed Particle Hydrodynamics. The theory, validation, and example applications of the 3D flooding simulation are described. Integrating these 3D simulation methods into computational risk analysis provides a spatial/visual aspect to the design, improves the realism of results, and can provide visual understanding to validate the analysis of flooding.
German Translation and Validation of the Cognitive Style Questionnaire Short Form (CSQ-SF-D)
Huys, Quentin J. M.; Renz, Daniel; Petzschner, Frederike; Berwian, Isabel; Stoppel, Christian; Haker, Helene
2016-01-01
Background The Cognitive Style Questionnaire is a valuable tool for the assessment of hopeless cognitive styles in depression research, with predictive power in longitudinal studies. However, it is very burdensome to administer. Even the short form is still long, and neither this nor the original version exist in validated German translations. Methods The questionnaire was translated from English to German, back-translated and commented on by clinicians. The reliability, factor structure and external validity of an online form of the questionnaire were examined on 214 participants. External validity was measured on a subset of 90 subjects. Results The resulting CSQ-SF-D had good to excellent reliability, both across items and subscales, and similar external validity to the original English version. The internality subscale appeared less robust than other subscales. A detailed analysis of individual item performance suggests that stable results could be achieved with a very short form (CSQ-VSF-D) including only 27 of the 72 items. Conclusions The CSQ-SF-D is a validated and freely distributed translation of the CSQ-SF into German. This should make efficient assessment of cognitive style in German samples more accessible to researchers. PMID:26934499
German Translation and Validation of the Cognitive Style Questionnaire Short Form (CSQ-SF-D).
Huys, Quentin J M; Renz, Daniel; Petzschner, Frederike; Berwian, Isabel; Stoppel, Christian; Haker, Helene
2016-01-01
The Cognitive Style Questionnaire is a valuable tool for the assessment of hopeless cognitive styles in depression research, with predictive power in longitudinal studies. However, it is very burdensome to administer. Even the short form is still long, and neither this nor the original version exist in validated German translations. The questionnaire was translated from English to German, back-translated and commented on by clinicians. The reliability, factor structure and external validity of an online form of the questionnaire were examined on 214 participants. External validity was measured on a subset of 90 subjects. The resulting CSQ-SF-D had good to excellent reliability, both across items and subscales, and similar external validity to the original English version. The internality subscale appeared less robust than other subscales. A detailed analysis of individual item performance suggests that stable results could be achieved with a very short form (CSQ-VSF-D) including only 27 of the 72 items. The CSQ-SF-D is a validated and freely distributed translation of the CSQ-SF into German. This should make efficient assessment of cognitive style in German samples more accessible to researchers.
Risk score to predict gastrointestinal bleeding after acute ischemic stroke.
Ji, Ruijun; Shen, Haipeng; Pan, Yuesong; Wang, Penglian; Liu, Gaifen; Wang, Yilong; Li, Hao; Singhal, Aneesh B; Wang, Yongjun
2014-07-25
Gastrointestinal bleeding (GIB) is a common and often serious complication after stroke. Although several risk factors for post-stroke GIB have been identified, no reliable or validated scoring system is currently available to predict GIB after acute stroke in routine clinical practice or clinical trials. In the present study, we aimed to develop and validate a risk model (acute ischemic stroke associated gastrointestinal bleeding score, the AIS-GIB score) to predict in-hospital GIB after acute ischemic stroke. The AIS-GIB score was developed from data in the China National Stroke Registry (CNSR). Eligible patients in the CNSR were randomly divided into derivation (60%) and internal validation (40%) cohorts. External validation was performed using data from the prospective Chinese Intracranial Atherosclerosis Study (CICAS). Independent predictors of in-hospital GIB were obtained using multivariable logistic regression in the derivation cohort, and β-coefficients were used to generate point scoring system for the AIS-GIB. The area under the receiver operating characteristic curve (AUROC) and the Hosmer-Lemeshow goodness-of-fit test were used to assess model discrimination and calibration, respectively. A total of 8,820, 5,882, and 2,938 patients were enrolled in the derivation, internal validation and external validation cohorts. The overall in-hospital GIB after AIS was 2.6%, 2.3%, and 1.5% in the derivation, internal, and external validation cohort, respectively. An 18-point AIS-GIB score was developed from the set of independent predictors of GIB including age, gender, history of hypertension, hepatic cirrhosis, peptic ulcer or previous GIB, pre-stroke dependence, admission National Institutes of Health stroke scale score, Glasgow Coma Scale score and stroke subtype (Oxfordshire). The AIS-GIB score showed good discrimination in the derivation (0.79; 95% CI, 0.764-0.825), internal (0.78; 95% CI, 0.74-0.82) and external (0.76; 95% CI, 0.71-0.82) validation cohorts. 
The AIS-GIB score was well calibrated in the derivation (P = 0.42), internal (P = 0.45) and external (P = 0.86) validation cohorts. The AIS-GIB score is a valid clinical grading scale to predict in-hospital GIB after AIS. Further studies on the effect of the AIS-GIB score on reducing GIB and improving outcome after AIS are warranted.
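Discrimination in this abstract is summarized by the AUROC. One way to read that number: it is the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the toy data are hypothetical, not values from the registry.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    labels: 1 for patients with the outcome, 0 otherwise.
    scores: risk scores, higher meaning higher predicted risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above non-case
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four patients, two with the outcome.
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in cohort size, which is fine for illustration; cohorts of the size reported here would normally use a rank-based implementation from a statistics library.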
Integrated Medical Model (IMM) Project Verification, Validation, and Credibility (VVandC)
NASA Technical Reports Server (NTRS)
Walton, M.; Boley, L.; Keenan, L.; Kerstman, E.; Shah, R.; Young, M.; Saile, L.; Garcia, Y.; Meyers, J.; Reyes, D.
2015-01-01
The Integrated Medical Model (IMM) Project supports end user requests by employing the Integrated Medical Evidence Database (iMED) and IMM tools as well as subject matter expertise within the Project. The iMED houses data used by the IMM. The IMM is designed to forecast relative changes for a specified set of crew health and mission success risk metrics by using a probabilistic model based on historical data, cohort data, and subject matter expert opinion. A stochastic approach is taken because deterministic results would not appropriately reflect the uncertainty in the IMM inputs. Once the IMM was conceptualized, a plan was needed to rigorously assess input information, framework and code, and output results of the IMM, and ensure that end user requests and requirements were considered during all stages of model development and implementation, as well as lay the foundation for external review and application. METHODS: In 2008, the Project team developed a comprehensive verification and validation (VV) plan, which specified internal and external review criteria encompassing 1) verification of data and IMM structure to ensure proper implementation of the IMM, 2) several validation techniques to confirm that the simulation capability of the IMM appropriately represents occurrences and consequences of medical conditions during space missions, and 3) credibility processes to develop user confidence in the information derived from the IMM. When the NASA-STD-7009 (7009) [1] was published, the Project team updated their verification, validation, and credibility (VVC) project plan to meet 7009 requirements and include 7009 tools in reporting VVC status of the IMM. Construction of these tools included meeting documentation and evidence requirements sufficient to meet external review success criteria. RESULTS: IMM Project VVC updates are compiled recurrently and include updates to the 7009 Compliance and Credibility matrices. 
Reporting tools have evolved over the lifetime of the IMM Project to better communicate VVC status. This has included refining original 7009 methodology with augmentation from the HRP NASA-STD-7009 Guidance Document working group and the NASA-HDBK-7009 [2]. End user requests and requirements are being satisfied as evidenced by ISS Program acceptance of IMM risk forecasts, transition to an operational model and simulation tool, and completion of service requests from a broad end user consortium including operations, science and technology planning, and exploration planning. IMM v4.0 is slated for operational release in the FY015 and current VVC assessments illustrate the expected VVC status prior to the completion of customer-led external review efforts. CONCLUSIONS: The VVC approach established by the IMM Project of incorporating Project-specific recommended practices and guidelines for implementing the 7009 requirements is comprehensive and includes the involvement of end users at every stage in IMM evolution. Methods and techniques used to quantify the VVC status of the IMM Project represented a critical communication tool in providing clear and concise suitability assessments to IMM customers. These processes have not only received approval from the local NASA community but have also garnered recognition by other federal agencies seeking to develop similar guidelines in the medical modeling community.
Pan, Xiaoping; Chen, Haobo; Bickerton, Wai-Ling; Lau, Johnny King Lam; Kong, Anthony Pak Hin; Rotshtein, Pia; Guo, Aihua; Hu, Jianxi; Humphreys, Glyn W
2015-01-01
Background There are no currently effective cognitive assessment tools for patients who have suffered stroke in the People’s Republic of China. The Birmingham Cognitive Screen (BCoS) has been shown to be a promising tool for revealing patients’ poststroke cognitive deficits in specific domains, which facilitates more individually designed rehabilitation in the long run. Hence we examined the reliability and validity of a Cantonese version BCoS in patients with acute ischemic stroke, in Guangzhou. Method A total of 98 patients with acute ischemic stroke were assessed with the Cantonese version of the BCoS, and an additional 133 healthy individuals were recruited as controls. Apart from the BCoS, the patients also completed a number of external cognitive tests, including the Montreal Cognitive Assessment Test (MoCA), Mini Mental State Examination (MMSE), Albert’s cancellation test, the Rey–Osterrieth Complex Figure Test, and six gesture matching tasks. Cutoff scores for failing each subtest, ie, deficits, were computed based on the performance of the controls. The validity and reliability of the Cantonese BCoS were examined, as well as interrater and test–retest reliability. We also compared the proportions of cases being classified as deficits in controlled attention, memory, character writing, and praxis, between patients with and without spoken language impairment. Results Analyses showed high test–retest reliability and agreement across independent raters on the qualitative aspects of measurement. Significant correlations were observed between the subtests of the Cantonese BCoS and the other external cognitive tests, providing evidence for convergent validity of the Cantonese BCoS. The screen was also able to generate measures of cognitive functions that were relatively uncontaminated by the presence of aphasia. Conclusion This study suggests good reliability and validity of the Cantonese version of the BCoS. 
The Cantonese BCoS is a very promising tool for the detection of cognitive problems in Cantonese speakers. PMID:26396522
Best Practices: How to Evaluate Psychological Science for Use by Organizations.
Fiske, Susan T; Borgida, Eugene
2011-01-01
We discuss how organizations can evaluate psychological science for its potential usefulness to their own purposes. Common sense is often the default but inadequate alternative, and bench-marking supplies only collective hunches instead of validated principles. External validity is an empirical process of identifying moderator variables, not a simple yes-no judgment about whether lab results replicate in the field. Hence, convincing criteria must specify what constitutes high-quality empirical evidence for organizational use. First, we illustrate some theories and science that have potential use. Then we describe generally accepted criteria for scientific quality and consensus, starting with peer review for quality, and scientific agreement in forms ranging from surveys of experts to meta-analyses to National Research Council consensus reports. Linkages of basic science to organizations entail communicating expert scientific consensus, motivating managerial interest, and translating broad principles to specific contexts. We close with parting advice to both sides of the researcher-practitioner divide.
Validity, Responsibility, and Aporia
ERIC Educational Resources Information Center
Koro-Ljungberg, Mirka
2010-01-01
In this article, the author problematizes external, objectified, oversimplified, and mechanical approaches to validity in qualitative research, which endorse simplistic and reductionist views of knowledge and data. Instead of promoting one generalizable definition or operational criteria for validity, the author's "deconstructive validity work"…
Rahman, M Shafiqur; Ambler, Gareth; Choodari-Oskooei, Babak; Omar, Rumana Z
2017-04-18
When developing a prediction model for survival data it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the context of model validation. This paper reviewed and evaluated a wide range of performance measures to provide some guidelines for their use in practice. An extensive simulation study based on two clinical datasets was conducted to investigate the performance of the measures in external validation settings. Measures were selected from categories that assess the overall performance, discrimination and calibration of a survival prediction model. Some of these have been modified to allow their use with validation data, and a case study is provided to describe how these measures can be estimated in practice. The measures were evaluated with respect to their robustness to censoring and ease of interpretation. All measures are implemented, or are straightforward to implement, in statistical software. Most of the performance measures were reasonably robust to moderate levels of censoring. One exception was Harrell's concordance measure, which tended to increase as censoring increased. We recommend that Uno's concordance measure is used to quantify concordance when there are moderate levels of censoring. Alternatively, Gönen and Heller's measure could be considered, especially if censoring is very high, but we suggest that the prediction model is re-calibrated first. We also recommend that Royston's D is routinely reported to assess discrimination since it has an appealing interpretation. The calibration slope is useful for both internal and external validation settings, and we recommend reporting it routinely. Our recommendation would be to use any of the predictive accuracy measures and provide the corresponding predictive accuracy curves.
In addition, we recommend investigating the characteristics of the validation data, such as the level of censoring and the distribution of the prognostic index derived in the validation setting, before choosing the performance measures.
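For readers unfamiliar with Harrell's concordance measure, the sketch below shows the usual pairwise definition: a pair is usable only when the subject with the shorter observed time had an event, and it counts as concordant when that subject also has the higher predicted risk. This is a simplified illustration, not the paper's code; the function name and the toy data are hypothetical, and pairs with tied times are simply skipped. Uno's measure, which the abstract prefers under moderate censoring, additionally reweights pairs by the censoring distribution and is not shown here.

```python
def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times:  observed follow-up times.
    events: 1 if the event was observed, 0 if censored.
    risks:  predicted risk scores, higher meaning earlier expected event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: three subjects, all with observed events.
print(harrell_c([1, 2, 3], [1, 1, 1], [3, 1, 2]))
```

Because censored subjects never anchor a pair, heavier censoring changes which pairs are counted, which is one intuition for the sensitivity to censoring noted in the abstract.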
Maarsingh, O R; Heymans, M W; Verhaak, P F; Penninx, B W J H; Comijs, H C
2018-08-01
Given the poor prognosis of late-life depression, it is crucial to identify those at risk. Our objective was to construct and validate a prediction rule for an unfavourable course of late-life depression. For development and internal validation of the model, we used The Netherlands Study of Depression in Older Persons (NESDO) data. We included participants with a major depressive disorder (MDD) at baseline (n = 270; 60-90 years), assessed with the Composite International Diagnostic Interview (CIDI). For external validation of the model, we used The Netherlands Study of Depression and Anxiety (NESDA) data (n = 197; 50-66 years). The outcome was MDD after 2 years of follow-up, assessed with the CIDI. Candidate predictors concerned sociodemographics, psychopathology, physical symptoms, medication, psychological determinants, and healthcare setting. Model performance was assessed by calculating calibration and discrimination. 111 subjects (41.1%) had MDD after 2 years of follow-up. Independent predictors of MDD after 2 years were (older) age, (early) onset of depression, severity of depression, anxiety symptoms, comorbid anxiety disorder, fatigue, and loneliness. The final model showed good calibration and reasonable discrimination (AUC of 0.75; 0.70 after external validation). The strongest individual predictor was severity of depression (AUC of 0.69; 0.68 after external validation). The model was developed and validated in The Netherlands, which could affect the cross-country generalizability. Based on rather simple clinical indicators, it is possible to predict the 2-year course of MDD. The prediction rule can be used for monitoring MDD patients and identifying those at risk of an unfavourable outcome. Copyright © 2018 Elsevier B.V. All rights reserved.
Atri, Alireza; Rountree, Susan D.; Lopez, Oscar L.; Doody, Rachelle S.
2012-01-01
Background Randomized controlled efficacy trials (RCTs), the scientific gold standard, are required for regulatory approval of Alzheimer's disease (AD) interventions, yet provide limited information regarding real-world therapeutic effectiveness. Objective: To compare the nature of evidence regarding the combination of approved AD treatments from RCTs versus long-term observational controlled studies (LTOCs). Methods Comparisons of strengths, limitations, and evidence level for monotherapy [cholinesterase inhibitor (ChEI) or memantine] and combination therapy (ChEI + memantine) in RCTs versus LTOCs. Results RCTs examined highly selected populations over months. LTOCs collected data across multiple AD stages in large populations over many years. RCTs and LTOCs show similar patterns favoring combination over monotherapy over placebo/no treatment. Long-term combination therapy compared to monotherapy reduced cognitive and functional decline and delayed time to nursing home admission. Persistent treatment was associated with slower decline. While LTOCs used control groups, adjusted for multiple covariates, had higher external validity, and favorable ethical, practical and cost considerations, their limitations included potential selection bias due to lack of placebo comparisons and randomization. Conclusions Naturalistic LTOCs provide complementary long-term level II evidence to complement level I evidence from short-term RCTs regarding therapeutic effectiveness in AD that may otherwise be unobtainable. A coordinated strategy/consortium to pool LTOC data from multiple centers to estimate long-term comparative effectiveness, risks/benefits, and costs of AD treatments is needed. PMID:22327239
Geographic Information Systems to Assess External Validity in Randomized Trials.
Savoca, Margaret R; Ludwig, David A; Jones, Stedman T; Jason Clodfelter, K; Sloop, Joseph B; Bollhalter, Linda Y; Bertoni, Alain G
2017-08-01
To support claims that RCTs can reduce health disparities (i.e., are translational), it is imperative that methodologies exist to evaluate the tenability of external validity in RCTs when probabilistic sampling of participants is not employed. Typically, attempts at establishing post hoc external validity are limited to a few comparisons across convenience variables, which must be available in both sample and population. A Type 2 diabetes RCT was used as an example of a method that uses a geographic information system to assess external validity in the absence of a priori probabilistic community-wide diabetes risk sampling strategy. A geographic information system, 2009-2013 county death certificate records, and 2013-2014 electronic medical records were used to identify community-wide diabetes prevalence. Color-coded diabetes density maps provided visual representation of these densities. Chi-square goodness of fit statistic/analysis tested the degree to which distribution of RCT participants varied across density classes compared to what would be expected, given simple random sampling of the county population. Analyses were conducted in 2016. Diabetes prevalence areas as represented by death certificate and electronic medical records were distributed similarly. The simple random sample model was not a good fit for death certificate record (chi-square, 17.63; p=0.0001) and electronic medical record data (chi-square, 28.92; p<0.0001). Generally, RCT participants were oversampled in high-diabetes density areas. Location is a highly reliable "principal variable" associated with health disparities. It serves as a directly measurable proxy for high-risk underserved communities, thus offering an effective and practical approach for examining external validity of RCTs. Copyright © 2017 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
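The goodness-of-fit comparison described here asks whether the observed counts of trial participants per diabetes-density class match what simple random sampling of the county population would give. A minimal sketch of the test statistic follows; the function name, the three density classes, and all counts and proportions are hypothetical and are not the study's data. Obtaining a p-value would additionally require the chi-square distribution with (number of classes - 1) degrees of freedom, which is left to a statistics library.

```python
def chi_square_gof(observed, population_props):
    """Chi-square goodness-of-fit statistic and degrees of freedom.

    observed:         participant counts per density class.
    population_props: share of the population in each class (sums to 1).
    """
    if len(observed) != len(population_props):
        raise ValueError("observed and population_props must align")
    if abs(sum(population_props) - 1.0) > 1e-9:
        raise ValueError("population proportions must sum to 1")
    n = sum(observed)
    expected = [p * n for p in population_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Hypothetical: 100 participants across low/medium/high density classes.
stat, dof = chi_square_gof([30, 20, 50], [0.2, 0.3, 0.5])
print(round(stat, 2), dof)
```

A large statistic relative to its degrees of freedom indicates that participants are not distributed across classes as random sampling would predict, which is the pattern the abstract reports.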
Predicting survival of men with recurrent prostate cancer after radical prostatectomy.
Dell'Oglio, Paolo; Suardi, Nazareno; Boorjian, Stephen A; Fossati, Nicola; Gandaglia, Giorgio; Tian, Zhe; Moschini, Marco; Capitanio, Umberto; Karakiewicz, Pierre I; Montorsi, Francesco; Karnes, R Jeffrey; Briganti, Alberto
2016-02-01
To develop and externally validate a novel nomogram aimed at predicting cancer-specific mortality (CSM) after biochemical recurrence (BCR) among prostate cancer (PCa) patients treated with radical prostatectomy (RP) with or without adjuvant external beam radiotherapy (aRT) and/or hormonal therapy (aHT). The development cohort included 689 consecutive PCa patients treated with RP between 1987 and 2011 with subsequent BCR, defined as two subsequent prostate-specific antigen values >0.2 ng/ml. Multivariable competing-risks regression analyses tested the predictors of CSM after BCR for the purpose of 5-year CSM nomogram development. Internal validation used 2000 bootstrap resamples. External validation was performed in a population of 6734 PCa patients with BCR after treatment with RP at the Mayo Clinic from 1987 to 2011. The predictive accuracy (PA) was quantified using the receiver operating characteristic-derived area under the curve and the calibration plot method. The 5-year CSM-free survival rate was 83.6% (confidence interval [CI]: 79.6-87.2). In multivariable analyses, pathologic stage T3b or more (hazard ratio [HR]: 7.42; p = 0.008), pathologic Gleason score 8-10 (HR: 2.19; p = 0.003), lymph node invasion (HR: 3.57; p = 0.001), time to BCR (HR: 0.99; p = 0.03) and age at BCR (HR: 1.04; p = 0.04) were each significantly associated with the risk of CSM after BCR. The bootstrap-corrected PA was 87.4% (bootstrap 95% CI: 82.0-91.7%). External validation of our nomogram showed a good PA at 83.2%. We developed and externally validated the first nomogram predicting 5-year CSM applicable to contemporary patients with BCR after RP with or without adjuvant treatment. Copyright © 2015 Elsevier Ltd. All rights reserved.
Forzley, Brian; Er, Lee; Chiu, Helen Hl; Djurdjev, Ognjenka; Martinusen, Dan; Carson, Rachel C; Hargrove, Gaylene; Levin, Adeera; Karim, Mohamud
2018-02-01
End-stage kidney disease is associated with poor prognosis. Health care professionals must be prepared to address end-of-life issues and identify those at high risk for dying. A 6-month mortality prediction model for patients on dialysis derived in the United States is used but has not been externally validated. We aimed to assess the external validity and clinical utility in an independent cohort in Canada. We examined the performance of the published 6-month mortality prediction model, using discrimination, calibration, and decision curve analyses. Data were derived from a cohort of 374 prevalent dialysis patients in two regions of British Columbia, Canada, which included serum albumin, age, peripheral vascular disease, dementia, and answers to the "surprise question" ("Would I be surprised if this patient died within the next year?"). The observed mortality in the validation cohort was 11.5% at 6 months. The prediction model had reasonable discrimination (c-stat = 0.70) but poor calibration (calibration-in-the-large = -0.53 (95% confidence interval: -0.88, -0.18); calibration slope = 0.57 (95% confidence interval: 0.31, 0.83)) in our data. Decision curve analysis showed the model only has added value in guiding clinical decisions in a small range of threshold probabilities: 8%-20%. Despite reasonable discrimination, the prediction model has poor calibration in this external study cohort; thus, it may have limited clinical utility in settings outside of where it was derived. Decision curve analysis clarifies limitations in clinical utility not apparent by receiver operating characteristic curve analysis. This study highlights the importance of external validation of prediction models prior to routine use in clinical practice.
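The decision curve analysis mentioned in this abstract rests on the net benefit of a model at a given threshold probability. The following is a minimal sketch of that quantity using the standard net benefit formula; the outcomes and predicted risks are hypothetical, and the function names are assumptions introduced here rather than anything taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return true_pos / n - weight * false_pos / n


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical cohort of ten patients: 1 = died within 6 months, 0 = survived.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.30, 0.25, 0.05, 0.15, 0.10, 0.02, 0.03, 0.04, 0.06, 0.08]

for threshold in (0.08, 0.10, 0.20):
    print(threshold, net_benefit(y_true, y_prob, threshold),
          net_benefit_treat_all(y_true, threshold))
```

A model adds value at a threshold only when its net benefit exceeds both the treat-all value and zero (the treat-none strategy), which is how a narrow useful range such as 8%-20% would show up on a decision curve.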
Hiligsmann, Mickaël; Ethgen, Olivier; Bruyère, Olivier; Richy, Florent; Gathon, Henry-Jean; Reginster, Jean-Yves
2009-01-01
Markov models are increasingly used in economic evaluations of treatments for osteoporosis. Most of the existing evaluations are cohort-based Markov models missing comprehensive memory management and versatility. In this article, we describe and validate an original Markov microsimulation model to accurately assess the cost-effectiveness of prevention and treatment of osteoporosis. We developed a Markov microsimulation model with a lifetime horizon and a direct health-care cost perspective. The patient history was recorded and was used in calculations of transition probabilities, utilities, and costs. To test the internal consistency of the model, we carried out an example calculation for alendronate therapy. Then, external consistency was investigated by comparing absolute lifetime risk of fracture estimates with epidemiologic data. For women at age 70 years, with a twofold increase in the fracture risk of the average population, the costs per quality-adjusted life-year gained for alendronate therapy versus no treatment were estimated at €9105 and €15,325, respectively, under full and realistic adherence assumptions. All the sensitivity analyses in terms of model parameters and modeling assumptions were coherent with expected conclusions and absolute lifetime risk of fracture estimates were within the range of previous estimates, which confirmed both internal and external consistency of the model. Microsimulation models present some major advantages over cohort-based models, increasing the reliability of the results and being largely compatible with the existing state of the art, evidence-based literature. The developed model appears to be a valid model for use in economic evaluations in osteoporosis.
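To illustrate what distinguishes a microsimulation from a cohort-based Markov model, here is a minimal sketch in which each simulated patient carries an individual history that modifies later transition probabilities. Every parameter value and name below is a hypothetical assumption for illustration; none is taken from the published model.

```python
import random


def simulate_patient(rng, years, base_fracture_prob,
                     prior_fracture_multiplier, death_prob):
    """Simulate one patient; return the list of years in which a fracture occurred."""
    history = []  # per-patient memory, which a cohort model cannot track
    for year in range(years):
        if rng.random() < death_prob:
            break  # patient dies; simulation for this patient ends
        # The fracture probability depends on the recorded history.
        p = base_fracture_prob * (prior_fracture_multiplier if history else 1.0)
        if rng.random() < min(p, 1.0):
            history.append(year)
    return history


def lifetime_fracture_risk(n_patients, seed=0, **params):
    """Share of simulated patients with at least one fracture."""
    rng = random.Random(seed)
    with_fracture = sum(
        1 for _ in range(n_patients) if simulate_patient(rng, **params)
    )
    return with_fracture / n_patients


# Hypothetical parameters, chosen only to make the sketch run.
risk = lifetime_fracture_risk(
    10_000, years=30, base_fracture_prob=0.02,
    prior_fracture_multiplier=2.0, death_prob=0.05,
)
print(risk)
```

An external-consistency check of the kind the abstract describes would compare such a simulated lifetime risk against published epidemiologic estimates.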
Johnson, Catherine; Burke, Christine; Brinkman, Sally; Wade, Tracey
2017-03-01
Mindfulness-based interventions show consistent benefits in adults for a range of pathologies, but exploration of these approaches in youth is an emergent field, with limited measures of mindfulness for this population. This study aimed to investigate whether multifactor scales of mindfulness can be used in adolescents. A series of studies are presented assessing the performance of a recently developed adult measure, the Comprehensive Inventory of Mindfulness Experiences (CHIME) in 4 early adolescent samples. Study 1 was an investigation of how well the full adult measure (37 items) was understood by youth (N = 292). Study 2 piloted a revision of items in child friendly language with a small group (N = 48). The refined questionnaire for adolescents (CHIME-A) was then tested in Study 3 in a larger sample (N = 461) and subjected to exploratory factor analysis and a range of external validity measures. Study 4 was a confirmatory factor analysis in a new sample (N = 498) with additional external validity measures. Study 5 tested temporal stability (N = 120). Results supported an 8-factor 25-item measure of mindfulness in adolescents, with excellent model fit indices and sound internal consistency for the 8 subscales. Although the CFA supported an overarching factor, internal reliability of a combined total score was poor. The development of a multifactor measure represents a first step toward testing developmental models of mindfulness in young people. This in turn will aid construction of evidence based interventions that are not simply downward derivations of adult mindfulness programs. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
In silico modeling to predict drug-induced phospholipidosis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choi, Sydney S.; Kim, Jae S.; Valerio, Luis G., E-mail: luis.valerio@fda.hhs.gov
2013-06-01
Drug-induced phospholipidosis (DIPL) is a preclinical finding during pharmaceutical drug development that has implications for the course of drug development and regulatory safety review. A principal characteristic of drugs inducing DIPL is known to be a cationic amphiphilic structure. This provides evidence for a structure-based explanation and opportunity to analyze properties and structures of drugs with the histopathologic findings for DIPL. In previous work from the FDA, in silico quantitative structure–activity relationship (QSAR) modeling using machine learning approaches has shown promise with a large dataset of drugs but included unconfirmed data as well. In this study, we report the construction and validation of a battery of complementary in silico QSAR models using the FDA's updated database on phospholipidosis, new algorithms and predictive technologies, and in particular, we address high performance with a high-confidence dataset. The results of our modeling for DIPL include rigorous external validation tests showing 80–81% concordance. Furthermore, the predictive performance characteristics include models with high sensitivity and specificity, in most cases ≥ 80%, leading to desired high negative and positive predictivity. These models are intended to be utilized for regulatory toxicology applied science needs in screening new drugs for DIPL. - Highlights: • New in silico models for predicting drug-induced phospholipidosis (DIPL) are described. • The training set data in the models is derived from the FDA's phospholipidosis database. • We find excellent predictivity values of the models based on external validation. • The models can support drug screening and regulatory decision-making on DIPL.
Rubin, Katrine Hass; Friis-Holmberg, Teresa; Hermann, Anne Pernille; Abrahamsen, Bo; Brixen, Kim
2013-08-01
A huge number of risk assessment tools have been developed. Far from all have been validated in external studies, many lack methodological and transparent evidence, and few are integrated in national guidelines. Therefore, we performed a systematic review to provide an overview of existing valid and reliable risk assessment tools for prediction of osteoporotic fractures. Additionally, we aimed to determine if the performance of each tool was sufficient for practical use, and last, to examine whether the complexity of the tools influenced their discriminative power. We searched PubMed, Embase, and Cochrane databases for papers and evaluated these with respect to methodological quality using the Quality Assessment Tool for Diagnostic Accuracy Studies (QUADAS) checklist. A total of 48 tools were identified; 20 had been externally validated; however, only six tools had been tested more than once in a population-based setting with acceptable methodological quality. None of the tools performed consistently better than the others, and simple tools (i.e., the Osteoporosis Self-assessment Tool [OST], Osteoporosis Risk Assessment Instrument [ORAI], and Garvan Fracture Risk Calculator [Garvan]) often did as well as or better than more complex tools (i.e., Simple Calculated Risk Estimation Score [SCORE], WHO Fracture Risk Assessment Tool [FRAX], and Qfracture). No studies determined the effectiveness of tools in selecting patients for therapy and thus improving fracture outcomes. High-quality studies in randomized design with population-based cohorts with different case mixes are needed. Copyright © 2013 American Society for Bone and Mineral Research.
Spanish validation of the Domain-Specific Risk-Taking (DOSPERT-30) Scale.
Lozano, Luis M; Megías, Alberto; Catena, Andrés; Perales, José C; Baltruschat, Sabina; Cándido, Antonio
2017-02-01
The aim of the present study was to develop and validate a Spanish version of the short Domain-Specific Risk-Taking (DOSPERT-30) scale, measuring risk-taking behavior, risk perception, and expected beneficial consequences (from taking risks) in five life domains: ethics, finance, health/security, recreational, and social decisions. The scale was back-translated, and administered online to 826 participants. Validity evidence was tested using correlations with construct-related instruments (UPPS-P and SSS-V), as well as using factor analysis. Internal consistency reliability was calculated with the ordinal Alpha coefficient, and gender differences were considered. Internal consistency was good, and factor analysis confirmed the five factors proposed by the authors. With respect to the external validity, high correlations with the positive urgency and the sensation seeking subscales of the UPPS-P, as well as with the thrill and adventure seeking and disinhibition subscales of the SSS-V were found. Finally, gender differences were found in all subscales and domains, with men tending to take more risks, perceive less risk and expect more beneficial consequences, except for the social domain where an inverse pattern was found. As these findings are in line with the original version, they indicate the scale was successfully adapted.
Development and validation of the Spanish-English Language Proficiency Scale (SELPS).
Smyk, Ekaterina; Restrepo, M Adelaida; Gorin, Joanna S; Gray, Shelley
2013-07-01
This study examined the development and validation of a criterion-referenced Spanish-English Language Proficiency Scale (SELPS) that was designed to assess the oral language skills of sequential bilingual children ages 4-8. This article reports results for the English proficiency portion of the scale. The SELPS assesses syntactic complexity, grammatical accuracy, verbal fluency, and lexical diversity based on 2 story retell tasks. In Study 1, 40 children were given 2 story retell tasks to evaluate the reliability of parallel forms. In Study 2, 76 children participated in the validation of the scale against language sample measures and teacher ratings of language proficiency. Study 1 indicated no significant differences between the SELPS scores on the 2 stories. Study 2 indicated that the SELPS scores correlated significantly with their counterpart language sample measures. Correlations between the SELPS and teacher ratings were moderate. The 2 story retells elicited comparable SELPS scores, providing a valuable tool for test-retest conditions in the assessment of language proficiency. Correlations between the SELPS scores and external variables indicated that these measures assessed the same language skills. Results provided empirical evidence regarding the validity of inferences about language proficiency based on the SELPS score.
Soroa, Goretti; Aritzeta, Aitor; Balluerka, Nekane; Gorostiaga, Arantxa
2016-06-03
Emotional creativity is defined as the ability to feel and express emotions in a new, effective and authentic way. There are currently no Basque-language self-report instruments to provide valid and reliable measures of this construct. Thus, this paper describes the process of adapting and validating the Emotional Creativity Inventory (ECI) for the Basque-speaking population. The sample was comprised of 594 higher education students (388 women and 206 men) aged between 18 and 32 years old (Mage = 20.47; SD = 2.48). The Basque version of the ECI was administered along with the TMMS-23, NEO PI-R, and PANAS. The results of exploratory and confirmatory factor analyses on the Basque ECI corroborated the original scale's three-factor structure (preparedness, novelty, and effectiveness/authenticity). Those dimensions showed acceptable indexes of internal consistency (α = .80, .83, and .83) and temporal stability (r = .70, .69, and .74). The study also provided some evidence of external validity (p < .05) based on the relationships found between emotional creativity and emotional intelligence, personality, affect, and sex. The Basque ECI can be regarded as a useful tool to evaluate perceived emotional creativity during the preparation and verification phases of the creative process.
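The internal consistency coefficients (α) reported in this abstract are Cronbach's alpha values. As a minimal sketch of how such a coefficient is computed from item-level responses, the code below applies the standard formula to a small hypothetical response matrix; the data and the function name `cronbach_alpha` are assumptions for illustration only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a matrix with one row per respondent, one column per item."""
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to a four-item subscale (1-5 Likert).
responses = [
    [4, 5, 4, 3],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(responses), 2))
```

Values near .80, like those reported for the three ECI dimensions, are conventionally read as acceptable to good internal consistency.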
A RE-AIM evaluation of theory-based physical activity interventions.
Antikainen, Iina; Ellis, Rebecca
2011-04-01
Although physical activity interventions have been shown to effectively modify behavior, little research has examined the potential of these interventions for adoption in real-world settings. The purpose of this literature review was to evaluate the external validity of 57 theory-based physical activity interventions using the RE-AIM framework. The physical activity interventions included were more likely to report on issues of internal, rather than external validity and on individual, rather than organizational components of the RE-AIM framework, making the translation of many interventions into practice difficult. Furthermore, most studies included motivated, healthy participants, thus reducing the generalizability of the interventions to real-world settings that provide services to more diverse populations. To determine if a given intervention is feasible and effective in translational research, more information should be reported about the factors that affect external validity.
Psychometric properties of the Spanish version of the Passion Scale.
Chamarro, Andrés; Penelo, Eva; Fornieles, Albert; Oberst, Ursula; Vallerand, Robert J; Fernández-Castro, Jordi
2015-01-01
Passion has been shown to be involved in psychological processes that emerge in diverse human activities like physical activity and sports, work, leisure, videogaming, pathological gambling, and interpersonal relationships. We aimed to present evidence of validity and internal consistency of the Passion Scale in Spanish based on the Dualistic Model of Passion, comprising harmonious and obsessive dimensions. The sample comprised 1,007 participants (350 females and 657 males), aged 16-65 (Md= 30.0 years). Exploratory Structural Equation Modeling (ESEM), measurement invariance and Multiple-Cause-Multiple-Indicator models (MIMIC) were used. Fit for the ESEM 2-factor solution was acceptable. Near full or partial measurement invariance across sex, type of activity, and age was supported. Relationships between both harmonious and obsessive dimensions and the external variables considered (age, sex, and criterion items) reasonably replicated those found in previous studies. Both scale scores showed adequate internal consistency (α = .81). Empirical evidence for the validity and internal consistency of the Spanish version of the Passion Scale is satisfactory and reveals that the scale is comparable to the English and French versions. Therefore, the Passion Scale can be used in research conducted in Spanish.
Comment on ``Ratchet universality in the presence of thermal noise''
NASA Astrophysics Data System (ADS)
Quintero, Niurka R.; Alvarez-Nodarse, Renato; Cuesta, José A.
2013-12-01
A recent paper [P. J. Martínez and R. Chacón, Phys. Rev. E 87, 062114 (2013)] presents numerical simulations on a system exhibiting directed ratchet transport of a driven overdamped Brownian particle subjected to a spatially periodic, symmetric potential. The authors claim that their simulations prove the existence of a universal waveform of the external force that optimally enhances directed transport, hence confirming the validity of a previous conjecture put forth by one of them in the limit of vanishing noise intensity. With minor corrections due to noise, the conjecture holds even in the presence of noise, according to the authors. On the basis of their results the authors claim that all previous theories, which predict a different optimal force waveform, are incorrect. In this Comment we provide sufficient numerical evidence showing that there is no such universal force waveform and that the evidence obtained by the authors otherwise is due to their particular choice of parameters. Our simulations also suggest that previous theories correctly predict the shape of the optimal waveform within their validity regime, namely, when the forcing is weak. On the contrary, the aforementioned conjecture does not hold.
Mihura, Joni L; Meyer, Gregory J; Dumitrascu, Nicolae; Bombel, George
2016-01-01
We respond to Tibon Czopp and Zeligman's (2016) critique of our systematic reviews and meta-analyses of 65 Rorschach Comprehensive System (CS) variables published in Psychological Bulletin (2013). The authors endorsed our supportive findings but critiqued the same methodology when used for the 13 unsupported variables. Unfortunately, their commentary was based on significant misunderstandings of our meta-analytic method and results, such as thinking we used introspectively assessed criteria in classifying levels of support and reporting only a subset of our externally assessed criteria. We systematically address their arguments that our construct label and criterion variable choices were inaccurate and, therefore, meta-analytic validity for these 13 CS variables was artificially low. For example, the authors created new construct labels for these variables that they called "the customary CS interpretation," but did not describe their methodology nor provide evidence that their labels would result in better validity than ours. They cite studies they believe we should have included; we explain how these studies did not fit our inclusion criteria and that including them would have actually reduced the relevant CS variables' meta-analytic validity. Ultimately, criticisms alone cannot change meta-analytic support from negative to positive; Tibon Czopp and Zeligman would need to conduct their own construct validity meta-analyses.
Cook, Karon F; Jensen, Sally E; Schalet, Benjamin D; Beaumont, Jennifer L; Amtmann, Dagmar; Czajkowski, Susan; Dewalt, Darren A; Fries, James F; Pilkonis, Paul A; Reeve, Bryce B; Stone, Arthur A; Weinfurt, Kevin P; Cella, David
2016-05-01
To present an overview of a series of studies in which the clinical validity of the National Institutes of Health's Patient Reported Outcome Measurement Information System (NIH; PROMIS) measures was evaluated, by domain, across six clinical populations. Approximately 1,500 individuals at baseline and 1,300 at follow-up completed PROMIS measures. The analyses reported in this issue were conducted post hoc, pooling data across six previous studies, and accommodating the different designs of the six, within-condition, parent studies. Changes in T-scores, standardized response means, and effect sizes were calculated in each study. When a parent study design allowed, known groups validity was calculated using a linear mixed model. The results provide substantial support for the clinical validity of nine PROMIS measures in a range of chronic conditions. The cross-condition focus of the analyses provided a unique and multifaceted perspective on how PROMIS measures function in "real-world" clinical settings and provides external anchors that can support comparative effectiveness research. The current body of clinical validity evidence for the nine PROMIS measures indicates the success of NIH PROMIS in developing measures that are effective across a range of chronic conditions. Copyright © 2016 Elsevier Inc. All rights reserved.
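The standardized response mean (SRM) named in this abstract is the mean change score divided by the standard deviation of the change scores. The sketch below shows that calculation on hypothetical baseline and follow-up T-scores; the values and the function name are assumptions for illustration, not data from the PROMIS studies.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean of paired change scores divided by their sample standard deviation."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical paired T-scores for four patients.
baseline = [50, 52, 48, 55]
follow_up = [53, 54, 50, 60]

srm = standardized_response_mean(baseline, follow_up)
print(srm)
```

Because the denominator is the variability of change rather than of baseline scores, the SRM differs from a baseline-standardized effect size, which is why the abstract reports both.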
Self-medication by orang-utans (Pongo pygmaeus) using bioactive properties of Dracaena cantleyi.
Morrogh-Bernard, H C; Foitová, I; Yeen, Z; Wilkin, P; de Martin, R; Rárová, L; Doležal, K; Nurcahyo, W; Olšanský, M
2017-11-30
Animals self-medicate using a variety of plant and arthropod secondary metabolites by either ingesting them or anointing them to their fur or skin, apparently to repel ectoparasites and treat skin diseases. In this respect, much attention has been focused on primates. Direct evidence for self-medication among the great apes has been limited to Africa. Here we document self-medication in the only Asian great ape, orang-utans (Pongo pygmaeus), and for the first time, to our knowledge, the external application of an anti-inflammatory agent in animals. The use of leaf extracts from Dracaena cantleyi by orang-utans has been observed on several occasions: the animals rub a foamy mixture of saliva and leaf onto specific parts of the body. Interestingly, the local indigenous human population also uses a poultice of these leaves for the relief of body pains. We present pharmacological analyses of the leaf extracts from this species, showing that they inhibit TNFα-induced inflammatory cytokine production (E-selectin, ICAM-1, VCAM-1 and IL-6). This validates the topical anti-inflammatory properties of this plant and provides a possible function for its use by orang-utans. This is the first evidence for the deliberate external application of substances with demonstrated bioactive potential for self-medication in great apes.
Towards personalized therapy for multiple sclerosis: prediction of individual treatment response.
Kalincik, Tomas; Manouchehrinia, Ali; Sobisek, Lukas; Jokubaitis, Vilija; Spelman, Tim; Horakova, Dana; Havrdova, Eva; Trojano, Maria; Izquierdo, Guillermo; Lugaresi, Alessandra; Girard, Marc; Prat, Alexandre; Duquette, Pierre; Grammond, Pierre; Sola, Patrizia; Hupperts, Raymond; Grand'Maison, Francois; Pucci, Eugenio; Boz, Cavit; Alroughani, Raed; Van Pesch, Vincent; Lechner-Scott, Jeannette; Terzi, Murat; Bergamaschi, Roberto; Iuliano, Gerardo; Granella, Franco; Spitaleri, Daniele; Shaygannejad, Vahid; Oreja-Guevara, Celia; Slee, Mark; Ampapa, Radek; Verheul, Freek; McCombe, Pamela; Olascoaga, Javier; Amato, Maria Pia; Vucic, Steve; Hodgkinson, Suzanne; Ramo-Tello, Cristina; Flechter, Shlomo; Cristiano, Edgardo; Rozsa, Csilla; Moore, Fraser; Luis Sanchez-Menoyo, Jose; Laura Saladino, Maria; Barnett, Michael; Hillert, Jan; Butzkueven, Helmut
2017-09-01
Timely initiation of effective therapy is crucial for preventing disability in multiple sclerosis; however, treatment response varies greatly among patients. Comprehensive predictive models of individual treatment response are lacking. Our aims were: (i) to develop predictive algorithms for individual treatment response using demographic, clinical and paraclinical predictors in patients with multiple sclerosis; and (ii) to evaluate accuracy, and internal and external validity of these algorithms. This study evaluated 27 demographic, clinical and paraclinical predictors of individual response to seven disease-modifying therapies in MSBase, a large global cohort study. Treatment response was analysed separately for disability progression, disability regression, relapse frequency, conversion to secondary progressive disease, change in the cumulative disease burden, and the probability of treatment discontinuation. Multivariable survival and generalized linear models were used, together with the principal component analysis to reduce model dimensionality and prevent overparameterization. Accuracy of the individual prediction was tested and its internal validity was evaluated in a separate, non-overlapping cohort. External validity was evaluated in a geographically distinct cohort, the Swedish Multiple Sclerosis Registry. In the training cohort (n = 8513), the most prominent modifiers of treatment response comprised age, disease duration, disease course, previous relapse activity, disability, predominant relapse phenotype and previous therapy. Importantly, the magnitude and direction of the associations varied among therapies and disease outcomes. Higher probability of disability progression during treatment with injectable therapies was predominantly associated with a greater disability at treatment start and the previous therapy. For fingolimod, natalizumab or mitoxantrone, it was mainly associated with lower pretreatment relapse activity. 
The probability of disability regression was predominantly associated with pre-baseline disability, therapy and relapse activity. Relapse incidence was associated with pretreatment relapse activity, age and relapsing disease course, with the strength of these associations varying among therapies. Accuracy and internal validity (n = 1196) of the resulting predictive models was high (>80%) for relapse incidence during the first year and for disability outcomes, moderate for relapse incidence in Years 2-4 and for the change in the cumulative disease burden, and low for conversion to secondary progressive disease and treatment discontinuation. External validation showed similar results, demonstrating high external validity for disability and relapse outcomes, moderate external validity for cumulative disease burden and low external validity for conversion to secondary progressive disease and treatment discontinuation. We conclude that demographic, clinical and paraclinical information helps predict individual response to disease-modifying therapies at the time of their commencement. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Cogswell, Alex; Alloy, Lauren B; Karpinski, Andrew; Grant, David A
2010-07-01
The present study addressed convergence between self-report and indirect approaches to assessing dependency. We were moderately successful in validating an implicit measure, which was found to be reliable, orthogonal to 2 self-report instruments, and predictive of external criteria. This study also examined discrepancies between scores on self-report and implicit measures, and has implications for their significance. The possibility that discrepancies themselves are pathological was not supported, although discrepancies were associated with particular personality profiles. Finally, this study offered additional evidence for the relation between dependency and depressive symptomatology and identified implicit dependency as contributing unique variance in predicting past major depression.
Trianes Torres, María Victoria; Blanca Mena, María José; Fernández Baena, Francisco J; Escobar Espejo, Milagros; Maldonado Montero, Enrique F; Muñoz Sánchez, Angela María
2009-11-01
The present study introduces the Children's Daily Stress Inventory (Inventario Infantil de Estresores Cotidianos, IIEC) as a measure that assesses daily stress in primary school children. The inventory was applied to a sample of 1094 primary school students. The final version includes 25 dichotomous items covering the areas of health, school/peers, and family. The score is obtained by summing the positive answers. Item analyses, reliability estimates, and several sources of external validity evidence based on relations with other variables are presented. The results show adequate psychometric properties for the assessment of daily stress in children.
Practice-Based Evidence in Community Guide Systematic Reviews.
Vaidya, Namita; Thota, Anilkrishna B; Proia, Krista K; Jamieson, Sara; Mercer, Shawna L; Elder, Randy W; Yoon, Paula; Kaufmann, Rachel; Zaza, Stephanie
2017-03-01
To assess the relative contributions and quality of practice-based evidence (PBE) and research-based evidence (RBE) in The Guide to Community Preventive Services (The Community Guide). We developed operational definitions for PBE and RBE in which the main distinguishing feature was whether allocation of participants to intervention and comparison conditions was under the control of researchers (RBE) or not (PBE). We conceptualized a continuum between RBE and PBE. We then categorized 3656 studies in 202 reviews completed since The Community Guide began in 1996. Fifty-four percent of studies were PBE and 46% RBE. Community-based and policy reviews had more PBE. Health care system and programmatic reviews had more RBE. The majority of both PBE and RBE studies were of high quality according to Community Guide scoring methods. The inclusion of substantial PBE in Community Guide reviews suggests that evidence of adequate rigor to inform practice is being produced. This should increase stakeholders' confidence that The Community Guide provides recommendations with real-world relevance. Limitations in some PBE studies suggest a need for strengthening practice-relevant designs and external validity reporting standards.
Experimental and numerical investigation of one and two phase natural convection in storage tanks
NASA Astrophysics Data System (ADS)
Aszodi, A.; Krepper, E.; Prasser, H.-M.
Experiments were performed to investigate heating up processes of fluids in storage tanks under the influence of an external heat source. As a consequence of an external fire, the heat-up of the inventory may lead to the evaporation of the liquid and to release of significant quantities of dangerous gases into the environment. Several tests were performed both with heating from the bottom and with heating from the side walls. In recent tests, in addition to thermocouples, the tank was equipped with needle probes for measuring the local void fraction. The paper presents experimental and numerical investigations of single and two phase heating up processes of tanks with side wall heating. The measurement of the temperature and of the void fraction reveals interesting phenomena, which could be explained by the authors' own 2D model. The experimental results may be used for the validation of boiling models in 3D CFD codes.
Menton, William H; Crighton, Adam H; Tarescavage, Anthony M; Marek, Ryan J; Hicks, Adam D; Ben-Porath, Yossef S
2017-06-01
The present study investigated the comparability of laptop computer- and tablet-based administration modes for the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF). Employing a counterbalanced within-subjects design, the MMPI-2-RF was administered via both modes to a sample of college undergraduates ( N = 133). Administration modes were compared in terms of mean scale scores, internal consistency, test-retest consistency, external validity, and administration time. Mean scores were generally similar, and scores produced via both methods appeared approximately equal in terms of internal consistency and test-retest consistency. Scores from the two modalities also evidenced highly similar patterns of associations with external criteria. Notably, tablet administration of the MMPI-2-RF was substantially longer than laptop administration in the present study (mean difference 7.2 minutes, Cohen's d = .95). Overall, results suggest that varying administration mode between laptop and tablet has a negligible influence on MMPI-2-RF scores, providing evidence that these modes of administration can be considered psychometrically equivalent.
Bozzolan, M; Simoni, G; Balboni, M; Fiorini, F; Bombardi, S; Bertin, N; Da Roit, M
2014-11-01
This mixed methods study aimed to explore perceptions/attitudes, to evaluate knowledge/ skills, to investigate clinical behaviours of undergraduate physiotherapy students exposed to a composite education curriculum on evidence-based practice (EBP). Students' knowledge and skills were assessed before and after integrated learning activities, using the Adapted Fresno test, whereas their behaviour in EBP was evaluated by examining their internship documentation. Students' perceptions and attitudes were explored through four focus groups. Sixty-two students agreed to participate in the study. The within group mean differences (A-Fresno test) were 34.2 (95% CI 24.4 to 43.9) in the first year and 35.1 (95% CI 23.2 to 47.1) in the second year; no statistically significant change was observed in the third year. Seventy-six percent of the second year and 88% of the third year students reached the pass score. Internship documentation gave evidence of PICOs and database searches (95-100%), critical appraisal of internal validity (25-75%) but not of external validity (5-15%). The correct application of these items ranged from 30 to 100%. Qualitative analysis of the focus groups indicated students valued EBP, but perceived many barriers, with clinicians being both an obstacle and a model. Key elements for changing students' behaviours seem to be internship environment and possibility of continuous practice and feedback.
Street, Brian D; Gage, William
2013-04-01
The external knee adduction moment is an accurate estimation of the load distribution of the knee and is a valid predictor for the presence, severity and progression rate of medial compartment knee osteoarthritis. Gait modification strategies have been shown to be an effective means of reducing the external adduction moment. The purpose of this study was to test narrow gait as a mechanism to reduce the external adduction moment and investigate if limb dominance affects this pattern. Fifteen healthy male participants (mean age: 23.8 (SD=3.1) years, mean height: 1.8 (SD=0.1) m, and mean body mass: 82.9 (SD=16.1) kg) took part in this study. Five walking trials were performed for each of the three different gait conditions: normal gait, toe-out gait, and narrow gait. Adoption of the narrow gait strategy significantly reduced the early stance phase external knee adduction moment compared to normal and toe-out gait (p<.002). However, it was observed that this reduction only occurred in the non-dominant limb. Gait modification can reduce the external knee adduction moment. However, asymmetrical patterns between the dominant and non-dominant limbs, specifically during gait modification, may attenuate the effectiveness of this intervention. The mechanism of limb dominance and the specific roles of each limb during gait may account for an asymmetrical pattern in the moment arm and center of mass displacement during stance. This new insight into how limb dominance affects gait modification strategies will be useful in the clinical setting when identifying appropriate patients, when indicating a gait modification strategy and in future research methodology. Copyright © 2013 Elsevier B.V. All rights reserved.
Wouters, Edwin; Rau, Asta; Engelbrecht, Michelle; Uebel, Kerry; Siegel, Jacob; Masquillier, Caroline; Kigozi, Gladys; Sommerland, Nina; Yassi, Annalee
2016-05-15
The dual burden of tuberculosis and human immunodeficiency virus (HIV) is severely impacting the South African healthcare workforce. However, the use of on-site occupational health services is hampered by stigma among the healthcare workforce. The success of stigma-reduction interventions is difficult to evaluate because of a dearth of appropriate scientific tools to measure stigma in this specific professional setting. The current pilot study aimed to develop and test a range of scales measuring different aspects of stigma (internal and external stigma toward tuberculosis as well as HIV) in a South African healthcare setting. The study employed data of a sample of 200 staff members of a large hospital in Bloemfontein, South Africa. Confirmatory factor analysis produced 7 scales, displaying internal construct validity: (1) colleagues' external HIV stigma, (2) colleagues' actions against external HIV stigma, (3) respondent's external HIV stigma, (4) respondent's internal HIV stigma, (5) colleagues' external tuberculosis stigma, (6) respondent's external tuberculosis stigma, and (7) respondent's internal tuberculosis stigma. Subsequent analyses (reliability analysis, structural equation modeling) demonstrated that the scales displayed good psychometric properties in terms of reliability and external construct validity. The study outcomes support the use of the developed scales as a valid and reliable means to measure levels of tuberculosis- and HIV-related stigma among the healthcare workforce in a resource-limited context. Future studies should build on these findings to fine-tune the instruments and apply them to larger study populations across a range of different resource-limited healthcare settings with high HIV and tuberculosis prevalence. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.
Wouters, Edwin; Rau, Asta; Engelbrecht, Michelle; Uebel, Kerry; Siegel, Jacob; Masquillier, Caroline; Kigozi, Gladys; Sommerland, Nina; Yassi, Annalee
2016-01-01
Background The dual burden of tuberculosis and human immunodeficiency virus (HIV) is severely impacting the South African healthcare workforce. However, the use of on-site occupational health services is hampered by stigma among the healthcare workforce. The success of stigma-reduction interventions is difficult to evaluate because of a dearth of appropriate scientific tools to measure stigma in this specific professional setting. Methods The current pilot study aimed to develop and test a range of scales measuring different aspects of stigma—internal and external stigma toward tuberculosis as well as HIV—in a South African healthcare setting. The study employed data of a sample of 200 staff members of a large hospital in Bloemfontein, South Africa. Results Confirmatory factor analysis produced 7 scales, displaying internal construct validity: (1) colleagues’ external HIV stigma, (2) colleagues’ actions against external HIV stigma, (3) respondent’s external HIV stigma, (4) respondent’s internal HIV stigma, (5) colleagues’ external tuberculosis stigma, (6) respondent’s external tuberculosis stigma, and (7) respondent’s internal tuberculosis stigma. Subsequent analyses (reliability analysis, structural equation modeling) demonstrated that the scales displayed good psychometric properties in terms of reliability and external construct validity. Conclusions The study outcomes support the use of the developed scales as a valid and reliable means to measure levels of tuberculosis- and HIV-related stigma among the healthcare workforce in a resource-limited context. Future studies should build on these findings to fine-tune the instruments and apply them to larger study populations across a range of different resource-limited healthcare settings with high HIV and tuberculosis prevalence. PMID:27118854
Barsties, Ben; Maryn, Youri
2016-07-01
The Acoustic Voice Quality Index (AVQI) is an objective method to quantify the severity of overall voice quality in concatenated continuous speech and sustained phonation segments. Recently, AVQI was successfully modified to be more representative and ecologically valid because the internal consistency of AVQI was balanced out through equal proportion of the 2 speech types. The present investigation aims to explore its external validation in a large data set. An expert panel of 12 speech-language therapists rated the voice quality of 1058 concatenated voice samples varying from normophonia to severe dysphonia. The Spearman rank-order correlation coefficients (r) were used to measure concurrent validity. The AVQI's diagnostic accuracy was evaluated with several estimates of its receiver operating characteristics (ROC). Finally, 8 of the 12 experts were chosen because of reliability criteria. A strong correlation was identified between AVQI and auditory-perceptual rating (r = 0.815, P = .000). It indicated that 66.4% of the auditory-perceptual rating's variation was explained by AVQI. Additionally, the ROC results showed again the best diagnostic outcome at a threshold of AVQI = 2.43. This study highlights external validation and diagnostic precision of the AVQI version 03.01 as a robust and ecologically valid measurement to objectify voice quality. © The Author(s) 2016.
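The "66.4% of variation explained" figure in the abstract above is consistent with squaring the reported correlation (r = 0.815). A minimal sketch of that arithmetic follows; the function name variance_explained is hypothetical, and it is an assumption that the authors derived the percentage this way (for a Spearman coefficient, the squared value refers to shared variance in ranks).

```python
def variance_explained(r: float) -> float:
    """Return the squared correlation coefficient as a percentage."""
    return 100.0 * r * r


# Reported correlation between AVQI and the auditory-perceptual rating.
print(round(variance_explained(0.815), 1))  # 66.4
```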
Piper, Brian J.; Gray, Hilary M.; Raber, Jacob; Birkett, Melissa A.
2014-01-01
Aim The parent form of the 113 item Child Behavior Checklist (CBCL) is widely utilized by child psychiatrists and psychologists. This report examines the reliability and validity of a recently developed abbreviated version of the CBCL, the Brief Problem Monitor (BPM). Methods Caregivers (N=567) completed the CBCL online and the 19 BPM items were examined separately. Results Internal consistency of the BPM was high (Cronbach’s alpha=0.91) and satisfactory for the Internalizing (0.78), Externalizing (0.86), and Attention (0.87) scales. High correlations between the CBCL and BPM were identified for the total score (r=0.95) as well as the Internalizing (0.86), Externalizing (0.93), and Attention (0.97) scales. The BPM and scales were sensitive and identified significantly higher behavioral and emotional problems among children whose caregiver reported a psychiatric diagnosis of Attention Deficit Hyperactivity Disorder, bipolar, depression, anxiety, developmental disabilities, or Autism Spectrum Disorders relative to a comparison group that had not been diagnosed with these disorders. BPM ratings also differed by the socioeconomic status and education of the caregiver. Mothers with higher annual incomes rated their children as having 38.8% fewer total problems (Cohen’s d=0.62) as well as 42.8% lower Internalizing (d=0.53), 44.1% less Externalizing (d=0.62), and 30.9% decreased Attention (d=0.39). A similar pattern was evident for maternal education (d=0.30 to 0.65). Conclusion Overall, these findings provide strong psychometric support for the BPM although the differences based on the characteristics of the parent indicates that additional information from other sources (e.g., teachers) should be obtained to complement parental reports. PMID:24735087
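Internal consistency in the abstract above is reported as Cronbach's alpha. The sketch below shows how that coefficient is commonly computed from item-level scores; the function name cronbach_alpha and the toy data are illustrative assumptions, not the BPM data or the authors' code.

```python
from statistics import variance


def cronbach_alpha(items):
    """Compute Cronbach's alpha.

    items: list of per-item score lists, each with one entry per respondent.
    """
    k = len(items)
    # Total score per respondent, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Toy example (hypothetical): two items answered identically by four respondents,
# so the items are perfectly consistent and alpha equals 1.
toy_items = [[1, 2, 3, 4], [1, 2, 3, 4]]
print(cronbach_alpha(toy_items))
```

Values close to 1 indicate that the items vary together; the 0.91 reported for the BPM total score would fall in the range usually described as high.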
Kane, Greg
2013-11-04
A Drug Influence Evaluation (DIE) is a formal assessment of an impaired driving suspect, performed by a trained law enforcement officer who uses circumstantial facts, questioning, searching, and a physical exam to form an unstandardized opinion as to whether a suspect's driving was impaired by drugs. This paper first identifies the scientific studies commonly cited in American criminal trials as evidence of DIE accuracy, and second, uses the QUADAS tool to investigate whether the methodologies used by these studies allow them to correctly quantify the diagnostic accuracy of the DIEs currently administered by US law enforcement. Three studies were selected for analysis. For each study, the QUADAS tool identified biases that distorted reported accuracies. The studies were subject to spectrum bias, selection bias, misclassification bias, verification bias, differential verification bias, incorporation bias, and review bias. The studies quantified DIE performance with prevalence-dependent accuracy statistics that are internally but not externally valid. The accuracies reported by these studies do not quantify the accuracy of the DIE process now used by US law enforcement. These studies do not validate current DIE practice.
Prognostic models for complete recovery in ischemic stroke: a systematic review and meta-analysis.
Jampathong, Nampet; Laopaiboon, Malinee; Rattanakanokchai, Siwanon; Pattanittum, Porjai
2018-03-09
Prognostic models have been increasingly developed to predict complete recovery in ischemic stroke. However, questions arise about the performance characteristics of these models. The aim of this study was to systematically review and synthesize performance of existing prognostic models for complete recovery in ischemic stroke. We searched journal publications indexed in PUBMED, SCOPUS, CENTRAL, ISI Web of Science and OVID MEDLINE from inception until 4 December, 2017, for studies designed to develop and/or validate prognostic models for predicting complete recovery in ischemic stroke patients. Two reviewers independently examined titles and abstracts, and assessed whether each study met the pre-defined inclusion criteria and also independently extracted information about model development and performance. We evaluated validation of the models by medians of the area under the receiver operating characteristic curve (AUC) or c-statistic and calibration performance. We used a random-effects meta-analysis to pool AUC values. We included 10 studies with 23 models developed from elderly patients with a moderately severe ischemic stroke, mainly in three high income countries. Sample sizes for each study ranged from 75 to 4441. Logistic regression was the only analytical strategy used to develop the models. The number of various predictors varied from one to 11. Internal validation was performed in 12 models with a median AUC of 0.80 (95% CI 0.73 to 0.84). One model reported good calibration. Nine models reported external validation with a median AUC of 0.80 (95% CI 0.76 to 0.82). Four models showed good discrimination and calibration on external validation. The pooled AUC of the two validation models of the same developed model was 0.78 (95% CI 0.71 to 0.85). The performance of the 23 models found in the systematic review varied from fair to good in terms of internal and external validation. Further models should be developed with internal and external validation in low and middle income countries.
Glenn, Beth A.; Bastani, Roshan; Maxwell, Annette E.
2013-01-01
Objective Threats to external validity including pretest sensitization and the interaction of selection and an intervention are frequently overlooked by researchers despite their potential to significantly influence study outcomes. The purpose of this investigation was to conduct secondary data analyses to assess the presence of external validity threats in the setting of a randomized trial designed to promote mammography use in a high risk sample of women. Design During the trial, recruitment and intervention implementation took place in three cohorts (with different ethnic composition), utilizing two different designs (pretest-posttest control group design; posttest only control group design). Results Results reveal that the intervention produced different outcomes across cohorts, dependent upon the research design used and the characteristics of the sample. Conclusion These results illustrate the importance of weighing the pros and cons of potential research designs before making a selection and attending more closely to issues of external validity. PMID:23289517
Fauth, Elizabeth B; Jackson, Mark A; Walberg, Donna K; Lee, Nancy E; Easom, Leisa R; Alston, Gayle; Ramos, Angel; Felten, Kristen; LaRue, Asenath; Mittelman, Mary
2017-06-01
The Administration on Aging funded six New York University Caregiver Intervention (NYUCI) demonstration projects, a counseling/support intervention targeting dementia caregivers and families. Three sites (Georgia, Utah, Wisconsin) pooled data to inform external validity in nonresearch settings. This study (a) assesses collective changes over time, and (b) compares outcomes across sites on caregiver burden, depressive symptoms, satisfaction with social support, family conflict, and quality of life. Data included baseline/preintervention ( N = 294) and follow-up visits (approximately 4, 8, 12 months). Linear mixed models showed that social support satisfaction increased ( p < .05) and family conflict decreased ( p < .05; Cohen's d = 0.49 and 0.35, respectively). Marginally significant findings emerged for quality of life increases ( p = .05) and burden decreases ( p < .10). Depressive symptoms remained stable. Slopes did not differ much by site. NYUCI demonstrated external validity in nonresearch settings across diverse caregiver samples.
Glenn, Beth A; Bastani, Roshan; Maxwell, Annette E
2013-01-01
Threats to external validity, including pretest sensitisation and the interaction of selection and an intervention, are frequently overlooked by researchers despite their potential to significantly influence study outcomes. The purpose of this investigation was to conduct secondary data analyses to assess the presence of external validity threats in the setting of a randomised trial designed to promote mammography use in a high-risk sample of women. During the trial, recruitment and intervention implementation took place in three cohorts (with different ethnic composition), utilising two different designs (pretest-posttest control group design and posttest only control group design). Results reveal that the intervention produced different outcomes across cohorts, dependent upon the research design used and the characteristics of the sample. These results illustrate the importance of weighing the pros and cons of potential research designs before making a selection and attending more closely to issues of external validity.
Prediction of pelvic organ prolapse using an artificial neural network.
Robinson, Christopher J; Swift, Steven; Johnson, Donna D; Almeida, Jonas S
2008-08-01
The objective of this investigation was to test the ability of a feedforward artificial neural network (ANN) to differentiate patients who have pelvic organ prolapse (POP) from those who retain good pelvic organ support. Following institutional review board approval, patients with POP (n = 87) and controls with good pelvic organ support (n = 368) were identified from the urogynecology research database. Historical and clinical information was extracted from the database. Data analysis included the training of a feedforward ANN, variable selection, and external validation of the model with an independent data set. Twenty variables were used. The median-performing ANN model used a median of 3 (quartile 1:3 to quartile 3:5) variables and achieved an area under the receiver operator curve of 0.90 (external, independent validation set). Ninety percent sensitivity and 83% specificity were obtained in the external validation by ANN classification. Feedforward ANN modeling is applicable to the identification and prediction of POP.
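The sensitivity and specificity quoted in the abstract above are proportions taken from a confusion matrix. The sketch below illustrates the calculation; the counts are hypothetical values chosen only to reproduce 90% and 83%, since the abstract does not report the size of the external validation set, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of actual cases that the classifier flags as cases."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of actual non-cases that the classifier flags as non-cases."""
    return true_neg / (true_neg + false_pos)


# Hypothetical validation counts (not from the study).
print(sensitivity(true_pos=18, false_neg=2))   # 0.9
print(specificity(true_neg=83, false_pos=17))  # 0.83
```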
Martel, Michelle M.; Roberts, Bethan; Gremillion, Monica; von Eye, Alexander; Nigg, Joel T.
2011-01-01
The current paper provides external validation of the bifactor model of ADHD by examining associations between ADHD latent factor/profile scores and external validation indices. 548 children (321 boys; 302 with ADHD), 6 to 18 years old, recruited from the community participated in a comprehensive diagnostic procedure. Mothers completed the Child Behavior Checklist, Early Adolescent Temperament Questionnaire, and California Q-Sort. Children completed the Stop and Trail-Making Task. Specific inattention was associated with depression/withdrawal, slower cognitive task performance, introversion, agreeableness, and high reactive control; specific hyperactivity-impulsivity was associated with rule-breaking/aggressive behavior, social problems, errors during set-shifting, extraversion, disagreeableness, and low reactive control. It is concluded that the bifactor model provides better explanation of heterogeneity within ADHD than DSM-IV ADHD symptom counts or subtypes. PMID:21735050
Measuring epistemic curiosity and its diversive and specific components.
Litman, Jordan A; Spielberger, Charles D
2003-02-01
A questionnaire constructed to assess epistemic curiosity (EC) and perceptual curiosity (PC) was administered to 739 undergraduates (546 women, 193 men) ranging in age from 18 to 65. The study participants also responded to the trait anxiety, anger, depression, and curiosity scales of the State-Trait Personality Inventory (STPI; Spielberger et al., 1979) and selected subscales of the Sensation Seeking (SSS; Zuckerman, Kolin, Price, & Zoob, 1964) and Novelty Experiencing (NES; Pearson, 1970) scales. Factor analyses of the curiosity items with oblique rotation identified EC and PC factors with clear simple structure. Subsequent analyses of the EC items provided the basis for developing an EC scale, with Diversive and Specific Curiosity subscales. Moderately high correlations of the EC scale and subscales with other measures of curiosity provided strong evidence of convergent validity. Divergent validity was demonstrated by minimal correlations with trait anxiety and the sensation-seeking measures, and essentially zero correlations with the STPI trait anger and depression scales. Male participants had significantly higher scores on the EC scale and the NES External Cognition subscale (effect sizes of r = .16 and .21, respectively), indicating that they were more interested than female participants in solving problems and discovering how things work. Male participants also scored significantly higher than female participants on the SSS Thrill-and-Adventure and NES External Sensation subscales (r = .14 and .22, respectively), suggesting that they were more likely to engage in sensation-seeking activities.
Wiklander, Maria; Rydström, Lise-Lott; Ygge, Britt-Marie; Navér, Lars; Wettergren, Lena; Eriksson, Lars E
2013-11-14
HIV is a stigmatizing medical condition. The concept of HIV stigma is multifaceted, with personalized stigma (perceived stigmatizing consequences of others knowing of their HIV status), disclosure concerns, negative self-image, and concerns with public attitudes described as core aspects of stigma for individuals with HIV infection. There is limited research on HIV stigma in children. The aim of this study was to test a short version of the 40-item HIV Stigma Scale (HSS-40), adapted for 8-18-year-old children with HIV infection living in Sweden. A Swedish version of the HSS-40 was adapted for children by an expert panel and evaluated by think aloud interviews. A preliminary short version with twelve items covering the four dimensions of stigma in the HSS-40 was tested. The psychometric evaluation included inspection of missing values, principal component analysis (PCA), internal consistency, and correlations with measures of health-related quality of life (HRQoL). Fifty-eight children, representing 71% of all children with HIV infection in Sweden meeting the inclusion criteria, completed the 12-item questionnaire. Four items concerning participants' experiences of others' reactions to their HIV had unacceptable rates of missing values and were therefore excluded. The remaining items constituted an 8-item scale, the HIV Stigma Scale for Children (HSSC-8), measuring HIV-related disclosure concerns, negative self-image, and concerns with public attitudes. Evidence for internal validity was supported by a PCA, suggesting a three factor solution with all items loading on the same subscales as in the original HSS-40. The scale demonstrated acceptable internal consistency, with the exception of the disclosure concerns subscale. Evidence for external validity was supported in correlational analyses with measures of HRQoL, where higher levels of stigma correlated with poorer HRQoL. The results suggest feasibility, reliability, as well as internal and external validity of the HSSC-8, an HIV stigma scale for children with HIV infection, measuring disclosure concerns, negative self-image, and concerns with public attitudes. The present study shows that different aspects of HIV stigma can be assessed among children with HIV in the age group 8-18.
External Correlates of the MMPI-2 Content Component Scales in Mental Health Inpatients
ERIC Educational Resources Information Center
Green, Bradley A.; Handel, Richard W.; Archer, Robert P.
2006-01-01
External correlates of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) Content Component Scales were identified using an inpatient sample of 544 adults. The Brief Psychiatric Rating Scale (BPRS) and Symptom Checklist 90-Revised (SCL-90-R) produced correlates of the Content Component Scales, demonstrating external validity with…
Mungroop, Timothy H; van Rijssen, L Bengt; van Klaveren, David; Smits, F Jasmijn; van Woerden, Victor; Linnemann, Ralph J; de Pastena, Matteo; Klompmaker, Sjors; Marchegiani, Giovanni; Ecker, Brett L; van Dieren, Susan; Bonsing, Bert; Busch, Olivier R; van Dam, Ronald M; Erdmann, Joris; van Eijck, Casper H; Gerhards, Michael F; van Goor, Harry; van der Harst, Erwin; de Hingh, Ignace H; de Jong, Koert P; Kazemier, Geert; Luyer, Misha; Shamali, Awad; Barbaro, Salvatore; Armstrong, Thomas; Takhar, Arjun; Hamady, Zaed; Klaase, Joost; Lips, Daan J; Molenaar, I Quintus; Nieuwenhuijs, Vincent B; Rupert, Coen; van Santvoort, Hjalmar C; Scheepers, Joris J; van der Schelling, George P; Bassi, Claudio; Vollmer, Charles M; Steyerberg, Ewout W; Abu Hilal, Mohammed; Groot Koerkamp, Bas; Besselink, Marc G
2017-12-12
The aim of this study was to develop an alternative fistula risk score (a-FRS) for postoperative pancreatic fistula (POPF) after pancreatoduodenectomy, without blood loss as a predictor. Blood loss, one of the predictors of the original-FRS, was not a significant factor during 2 recent external validations. The a-FRS was developed in 2 databases: the Dutch Pancreatic Cancer Audit (18 centers) and the University Hospital Southampton NHS. Primary outcome was grade B/C POPF according to the 2005 International Study Group on Pancreatic Surgery (ISGPS) definition. The score was externally validated in 2 independent databases (University Hospital of Verona and University Hospital of Pennsylvania), using both 2005 and 2016 ISGPS definitions. The a-FRS was also compared with the original-FRS. For model design, 1924 patients were included of whom 12% developed POPF. Three predictors were strongly associated with POPF: soft pancreatic texture [odds ratio (OR) 2.58, 95% confidence interval (95% CI) 1.80-3.69], small pancreatic duct diameter (per mm increase, OR: 0.68, 95% CI: 0.61-0.76), and high body mass index (BMI) (per kg/m² increase, OR: 1.07, 95% CI: 1.04-1.11). Discrimination was adequate with an area under curve (AUC) of 0.75 (95% CI: 0.71-0.78) after internal validation, and 0.78 (0.74-0.82) after external validation. The predictive capacity of a-FRS was comparable with the original-FRS, both for the 2005 definition (AUC 0.78 vs 0.75, P = 0.03), and 2016 definition (AUC 0.72 vs 0.70, P = 0.05). The a-FRS predicts POPF after pancreatoduodenectomy based on 3 easily available variables (pancreatic texture, duct diameter, BMI) without blood loss and pathology, and was successfully validated for both the 2005 and 2016 POPF definition.
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Callaghan, Michael E., E-mail: elspeth.raymond@health.sa.gov.au; Freemasons Foundation Centre for Men's Health, University of Adelaide; Urology Unit, Repatriation General Hospital, SA Health, Flinders Centre for Innovation in Cancer
Purpose: To identify, through a systematic review, all validated tools used for the prediction of patient-reported outcome measures (PROMs) in patients being treated with radiation therapy for prostate cancer, and provide a comparative summary of accuracy and generalizability. Methods and Materials: PubMed and EMBASE were searched from July 2007. Title/abstract screening, full text review, and critical appraisal were undertaken by 2 reviewers, whereas data extraction was performed by a single reviewer. Eligible articles had to provide a summary measure of accuracy and undertake internal or external validation. Tools were recommended for clinical implementation if they had been externally validated and found to have accuracy ≥70%. Results: The search strategy identified 3839 potential studies, of which 236 progressed to full text review and 22 were included. From these studies, 50 tools predicted gastrointestinal/rectal symptoms, 29 tools predicted genitourinary symptoms, 4 tools predicted erectile dysfunction, and no tools predicted quality of life. For patients treated with external beam radiation therapy, 3 tools could be recommended for the prediction of rectal toxicity, gastrointestinal toxicity, and erectile dysfunction. For patients treated with brachytherapy, 2 tools could be recommended for the prediction of urinary retention and erectile dysfunction. Conclusions: A large number of tools for the prediction of PROMs in prostate cancer patients treated with radiation therapy have been developed. Only a small minority are accurate and have been shown to be generalizable through external validation. This review provides an accessible catalogue of tools that are ready for clinical implementation as well as which should be prioritized for validation.
Carrión, Ricardo E.; Cornblatt, Barbara A.; Burton, Cynthia Z.; Tso, Ivy F; Auther, Andrea; Adelsheim, Steven; Calkins, Roderick; Carter, Cameron S.; Niendam, Tara; Taylor, Stephan F.; McFarlane, William R.
2016-01-01
Objective In the current issue, Cannon and colleagues, as part of the second phase of the North American Prodrome Longitudinal Study (NAPLS2), report on a risk calculator for the individualized prediction of developing a psychotic disorder in a 2-year period. The present study represents an external validation of the NAPLS2 psychosis risk calculator using an independent sample of subjects at clinical high risk for psychosis collected as part of the Early Detection, Intervention, and Prevention of Psychosis Program (EDIPPP). Methods 176 subjects with follow-up (from the total EDIPPP sample of 210) rated as clinical high-risk (CHR) based on the Structured Interview for Prodromal Syndromes were used to construct a new prediction model with the 6 significant predictor variables in the NAPLS2 psychosis risk calculator (unusual thoughts, suspiciousness, Symbol Coding, verbal learning, social functioning decline, baseline age, and family history). Discrimination performance was assessed with the area under the receiver operating curve (AUC). The NAPLS2 risk calculator was then used to generate a psychosis risk estimate for each case in the external validation sample. Results The external validation model showed good discrimination, with an AUC of 79% (95% CI 0.644–0.937). In addition, the personalized risk generated by the NAPLS calculator provided a solid estimation of the actual conversion outcome in the validation sample. Conclusions In the companion papers in this issue, two independent samples of CHR subjects converge to validate the NAPLS2 psychosis risk calculator. This prediction calculator represents a meaningful step towards early intervention and personalized treatment of psychotic disorders. PMID:27363511
Turusheva, Anna; Frolova, Elena; Bert, Vaes; Hegendoerfer, Eralda; Degryse, Jean-Marie
2017-07-01
Prediction models help to make decisions about further management in clinical practice. This study aims to develop a mortality risk score based on previously identified risk predictors and to perform internal and external validations. In a population-based prospective cohort study of 611 community-dwelling individuals aged 65+ in St. Petersburg (Russia), all-cause mortality risks over 2.5 years follow-up were determined based on the results obtained from anthropometry, medical history, physical performance tests, spirometry and laboratory tests. C-statistic, risk reclassification analysis, integrated discrimination improvement analysis, decision curves analysis, internal validation and external validation were performed. Older adults were at higher risk for mortality [HR (95%CI)=4.54 (3.73-5.52)] when two or more of the following components were present: poor physical performance, low muscle mass, poor lung function, and anemia. If anemia was combined with high C-reactive protein (CRP) and high B-type natriuretic peptide (BNP) was added the HR (95%CI) was slightly higher (5.81 (4.73-7.14)) even after adjusting for age, sex and comorbidities. Our models were validated in an external population of adults 80+. The extended model had a better predictive capacity for cardiovascular mortality [HR (95%CI)=5.05 (2.23-11.44)] compared to the baseline model [HR (95%CI)=2.17 (1.18-4.00)] in the external population. We developed and validated a new risk prediction score that may be used to identify older adults at higher risk for mortality in Russia. Additional studies need to determine which targeted interventions improve the outcomes of these at-risk individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
Building research capacity for evidence-informed tobacco control in Canada: a case description
McDonald, Paul W; Viehbeck, Sarah; Robinson, Sarah J; Leatherdale, Scott T; Nykiforuk, Candace IJ; Jolin, Mari Alice
2009-01-01
Tobacco use remains the leading cause of death and disability in Canada. Insufficient research capacity can inhibit evidence-informed decision making for tobacco control. This paper outlines a Canadian project to build research capacity, defined as a community's ability to produce research that adequately informs practice, policy, and future research in a timely, practical manner. A key component is that individuals and teams within the community must mutually engage around common, collectively negotiated goals to address specific practices, policies or programs of research. An organizing framework, a set of activities to build strategic recruitment, productivity tools, and procedures for enhancing social capital are described. Actions are intended to facilitate better alignment between research and the priorities of policy developers and service providers, enhance the external validity of the work performed, and reduce the time required to inform policy and practice. PMID:19664224
Lazcano-Ponce, Eduardo; Katz, Gregorio; Rodríguez-Valentín, Rocío; Castro, Filipa de; Allen-Leigh, Betania; Márquez-Caraveo, María Elena; Ramírez-García, Miguel Ángel; Arroyo-García, Eduardo; Medina-Mora, María Elena; Ángeles, Gustavo; Urquieta-Salomón, José Edmundo; Salvador-Carulla, Luis
2016-01-01
This study aims to generate evidence on intellectual development disorders (IDD) in Mexico. IDD disease burden will be estimated with a probabilistic model, using population-based surveys. Direct and indirect costs of catastrophic expenses of families with a member with an IDD will be evaluated. Genomic characterization of IDD will include: sequencing participant exomes and performing bioinformatics analyses to identify de novo or inherited variants through trio analysis; identifying genetic variants associated with IDD, and validating randomly selected variants by polymerase chain reaction (PCR) and sequencing or real-time quantitative PCR (qPCR). Delphi surveys will be done on best practices for IDD diagnosis and management. An external evaluation will employ qualitative case studies of two social and labor inclusion programs for people with IDD. The results will constitute scientific evidence for the design, promotion and evaluation of public policies, which are currently absent on IDD.
Effective Recruitment of Schools for Randomized Clinical Trials: Role of School Nurses.
Petosa, R L; Smith, L
2017-01-01
In school settings, nurses lead efforts to improve student health and well-being to support academic success. Nurses are guided by evidence-based practice and data to inform care decisions. The randomized controlled trial (RCT) is considered the gold standard of scientific rigor for clinical trials. RCTs are critical to the development of evidence-based health promotion programs in schools. The purpose of this article is to present practical solutions to implementing principles of randomization in RCTs conducted in school settings. Randomization is a powerful sampling method used to build internal and external validity. The school's daily organization and educational mission present several barriers to randomization. Based on the authors' experience in conducting school-based RCTs, they offer a host of practical solutions to working with schools to successfully implement randomization procedures. Nurses play a critical role in implementing RCTs in schools to promote rigorous science in support of evidence-based practice.
Validity of self-assessment in a quality improvement collaborative in Ecuador.
Hermida, Jorge; Broughton, Edward I; Miller Franco, Lynne
2011-12-01
Health care quality improvement (QI) efforts commonly use self-assessment to measure compliance with quality standards. This study investigates the validity of self-assessment of quality indicators. Cross sectional. A maternal and newborn care improvement collaborative intervention conducted in health facilities in Ecuador in 2005. Four external evaluators were trained in abstracting medical records to calculate six indicators reflecting compliance with treatment standards. About 30 medical records per month were examined at 12 participating health facilities for a total of 1875 records. The same records had already been reviewed by QI teams at these facilities (self-assessment). Overall compliance, agreement (using the Kappa statistic), sensitivity and specificity were analyzed. We also examined patterns of disagreement and the effect of facility characteristics on levels of agreement. External evaluators reported compliance of 69-90%, while self-assessors reported 71-92%, with raw agreement of 71-95% and Kappa statistics ranging from fair to almost perfect agreement. Considering external evaluators as the gold standard, sensitivity of self-assessment ranged from 90 to 99% and specificity from 48 to 86%. Simpler indicators had fewer disagreements. When disagreements occurred between self-assessment and external evaluators, the former tended to report more positive findings in five of six indicators, but this tendency was not of a magnitude to change program actions. Team leadership, understanding of the tools and facility size had no overall impact on the level of agreement. When compared with external evaluation (gold standard), self-assessment was found to be sufficiently valid for tracking QI team performance. Sensitivity was generally higher than specificity. Simplifying indicators may improve validity.
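The agreement measures named in this abstract (raw agreement, the Kappa statistic, sensitivity and specificity) can be illustrated with a minimal sketch. It assumes each record is reduced to one binary compliant/non-compliant rating per assessor and that the external rating serves as the reference standard. The function name and the example data are hypothetical and are not taken from the study.

```python
def agreement_stats(external, self_assessed):
    """Compare paired binary compliance ratings (True = compliant).

    `external` is treated as the reference standard. Both arguments are
    equal-length lists with one rating per medical record.
    """
    if len(external) != len(self_assessed) or not external:
        raise ValueError("need two non-empty lists of equal length")
    n = len(external)
    pairs = list(zip(external, self_assessed))
    tp = sum(1 for e, s in pairs if e and s)          # both say compliant
    tn = sum(1 for e, s in pairs if not e and not s)  # both say non-compliant
    fp = sum(1 for e, s in pairs if not e and s)      # self-assessment too positive
    fn = sum(1 for e, s in pairs if e and not s)      # self-assessment too negative

    observed = (tp + tn) / n
    # Agreement expected by chance, from each assessor's marginal rates.
    p_ext = (tp + fn) / n
    p_self = (tp + fp) / n
    expected = p_ext * p_self + (1 - p_ext) * (1 - p_self)
    nan = float("nan")
    return {
        "raw_agreement": observed,
        "kappa": (observed - expected) / (1 - expected) if expected < 1 else nan,
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
    }


# Hypothetical ratings for ten records, not data from the study.
external = [True] * 8 + [False] * 2
self_assessed = [True] * 7 + [False] + [True] + [False]
print(agreement_stats(external, self_assessed))
```

Because Kappa corrects for chance agreement, it can be modest even when raw agreement is high, which is consistent with the wide range of Kappa values the abstract reports alongside raw agreement of 71-95%.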
Psychometric Validation of the Academic Motivation Scale in a Dental Student Sample.
Orsini, Cesar; Binnie, Vivian; Evans, Phillip; Ledezma, Priscilla; Fuentes, Fernando; Villegas, Maria J
2015-08-01
The Academic Motivation Scale is one of the most frequently used instruments to assess academic motivation. It relies on the self-determination theory of human motivation. However, motivation has been understudied in dental education. Therefore, to address the lack of valid instruments to assess academic motivation in dental education and contribute to future research in the field, the aim of this study was to analyze the psychometric properties of this instrument in a sample of dental students. Participants were 989 Chilean undergraduate dental students (86% response rate) who completed a survey containing a Chilean face-valid version of the Spanish Academic Motivation Scale and three other motivation-related instruments to assess the survey's construct and criterion validity. Later, 76 of the students (out of 100 invited) took the survey again to assess its test-retest stability. The instrument's construct validity was supported by the superior goodness of fit of the seven-subscale Academic Motivation Scale over competing models through confirmatory factor analysis and by the expected correlations among its subscales. The concurrent criterion validity was supported by the confirmation of correlations between its subscales and external criteria. Adequate internal consistency and test-retest correlations were also found. The evidence from this study suggests that the Academic Motivation Scale is a preliminarily valid and reliable instrument to assess motivation in the predoctoral dental context. Future research in this area is needed to confirm or refute these results.
Fun and Games: The Validity of Games for the Study of Conflict
ERIC Educational Resources Information Center
Schlenker, Barry R.; Bonoma, Thomas V.
1978-01-01
Examines claimed advantages and criticisms of the use of games in the study of social conflict, differentiating the advantages and criticisms into questions of internal validity, external validity, and ecological validity. Available from: Sage Publications, Inc., 275 South Beverly Drive, Beverly Hills, California 90212. (JG)
Homework Stress: Construct Validation of a Measure
ERIC Educational Resources Information Center
Katz, Idit; Buzukashvili, Tamara; Feingold, Liat
2012-01-01
This article presents 2 studies aimed at validating a measure of stress experienced by children and parents around the issue of homework, applying Benson's program of validation (Benson, 1998). Study 1 provides external validity of the measure by supporting hypothesized relations between stress around homework and students' and parents' positive…
Kundu, Suman; Mazumdar, Madhu; Ferket, Bart
2017-04-19
The area under the ROC curve (AUC) of risk models is known to be influenced by differences in case-mix and effect size of predictors. The impact of heterogeneity in correlation among predictors has however been under investigated. We sought to evaluate how correlation among predictors affects the AUC in development and external populations. We simulated hypothetical populations using two different methods based on means, standard deviations, and correlation of two continuous predictors. In the first approach, the distribution and correlation of predictors were assumed for the total population. In the second approach, these parameters were modeled conditional on disease status. In both approaches, multivariable logistic regression models were fitted to predict disease risk in individuals. Each risk model developed in a population was validated in the remaining populations to investigate external validity. For both approaches, we observed that the magnitude of the AUC in the development and external populations depends on the correlation among predictors. Lower AUCs were estimated in scenarios of both strong positive and negative correlation, depending on the direction of predictor effects and the simulation method. However, when adjusted effect sizes of predictors were specified in the opposite directions, increasingly negative correlation consistently improved the AUC. AUCs in external validation populations were higher or lower than in the derivation cohort, even in the presence of similar predictor effects. Discrimination of risk prediction models should be assessed in various external populations with different correlation structures to make better inferences about model generalizability.
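The dependence of the AUC on predictor correlation can be explored with a small simulation. The sketch below is only loosely modeled on the first approach described above (predictor distribution and correlation specified for the total population). It is not the authors' code: the effect sizes, intercept, sample size and function names are all assumptions, and for brevity it scores individuals with the true linear predictor instead of fitting a logistic regression.

```python
import math
import random


def auc(scores, labels):
    """Rank-based (Mann-Whitney) AUC; tied scores receive average ranks."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate_auc(rho, beta1=1.0, beta2=1.0, intercept=-2.0, n=4000, seed=0):
    """AUC of a two-predictor logistic risk model when corr(x1, x2) = rho.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        # Build x2 so that it is standard normal with correlation rho to x1.
        x2 = rho * x1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        linear_predictor = intercept + beta1 * x1 + beta2 * x2
        risk = 1 / (1 + math.exp(-linear_predictor))
        labels.append(1 if rng.random() < risk else 0)
        scores.append(linear_predictor)
    return auc(scores, labels)


for rho in (-0.8, 0.0, 0.8):
    print(rho, round(simulate_auc(rho), 3))
```

With both effects in the same direction, the variance of the linear predictor in this setup is proportional to 2 + 2·rho, so stronger positive correlation spreads the risks further apart. Setting `beta2` negative reverses that pattern, which is one way to see why the direction of predictor effects matters.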
Meertens, Linda Jacqueline Elisabeth; Scheepers, Hubertina Cj; De Vries, Raymond G; Dirksen, Carmen D; Korstjens, Irene; Mulder, Antonius Lm; Nieuwenhuijze, Marianne J; Nijhuis, Jan G; Spaanderman, Marc Ea; Smits, Luc Jm
2017-10-26
A number of first-trimester prediction models addressing important obstetric outcomes have been published. However, most models have not been externally validated. External validation is essential before implementing a prediction model in clinical practice. The objective of this paper is to describe the design of a study to externally validate existing first trimester obstetric prediction models, based upon maternal characteristics and standard measurements (eg, blood pressure), for the risk of pre-eclampsia (PE), gestational diabetes mellitus (GDM), spontaneous preterm birth (PTB), small-for-gestational-age (SGA) infants, and large-for-gestational-age (LGA) infants among Dutch pregnant women (Expect Study I). The results of a pilot study on the feasibility and acceptability of the recruitment process and the comprehensibility of the Pregnancy Questionnaire 1 are also reported. A multicenter prospective cohort study was performed in The Netherlands between July 1, 2013 and December 31, 2015. First trimester obstetric prediction models were systematically selected from the literature. Predictor variables were measured by the Web-based Pregnancy Questionnaire 1 and pregnancy outcomes were established using the Postpartum Questionnaire 1 and medical records. Information about maternal health-related quality of life, costs, and satisfaction with Dutch obstetric care was collected from a subsample of women. A pilot study was carried out before the official start of inclusion. External validity of the models will be evaluated by assessing discrimination and calibration. Based on the pilot study, minor improvements were made to the recruitment process and online Pregnancy Questionnaire 1. The validation cohort consists of 2614 women. Data analysis of the external validation study is in progress. This study will offer insight into the generalizability of existing, non-invasive first trimester prediction models for various obstetric outcomes in a Dutch obstetric population. 
An impact study for the evaluation of the best obstetric prediction models in the Dutch setting with respect to their effect on clinical outcomes, costs, and quality of life-Expect Study II-is being planned. Netherlands Trial Registry (NTR): NTR4143; http://www.trialregister.nl/trialreg/admin/rctview.asp?TC=4143 (Archived by WebCite at http://www.webcitation.org/6t8ijtpd9). ©Linda Jacqueline Elisabeth Meertens, Hubertina CJ Scheepers, Raymond G De Vries, Carmen D Dirksen, Irene Korstjens, Antonius LM Mulder, Marianne J Nieuwenhuijze, Jan G Nijhuis, Marc EA Spaanderman, Luc JM Smits. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 26.10.2017.
Generalizing disease management program results: how to get from here to there.
Linden, Ariel; Adams, John L; Roberts, Nancy
2004-07-01
For a disease management (DM) program, the ability to generalize results from the intervention group to the population, to other populations, or to other diseases is as important as demonstrating internal validity. This article provides an overview of the threats to external validity of DM programs, and offers methods to improve the capability for generalizing results obtained through the program. The external validity of DM programs must be evaluated even before program selection and implementation are begun with a prospective new client. Any fundamental differences in characteristics between individuals in an established DM program and in a new population/environment may limit the ability to generalize.
NASA Astrophysics Data System (ADS)
Dalvi, Tejaswini; Wendell, Kristen
2017-10-01
Our study addresses the need for new approaches to prepare novice elementary teachers to teach both science and engineering, and for new tools to measure how well those approaches are working. This in particular would inform the teacher educators of the extent to which novice teachers are developing expertise in facilitating their students' engineering design work. One important dimension to measure is novice teachers' abilities to notice the substance of student thinking and to respond in productive ways. This teacher noticing is particularly important in science and engineering education, where students' initial, idiosyncratic ideas and practices influence the likelihood that particular instructional strategies will help them learn. This paper describes evidence of validity and reliability for the Video Case Diagnosis (VCD) task, a new instrument for measuring pre-service elementary teachers' engineering teaching responsiveness. To complete the VCD, participants view a 6-min video episode of children solving an engineering design problem, describe in writing what they notice about the students' science ideas and engineering practices, and propose how a teacher could productively respond to the students. The rubric for scoring VCD responses allowed two independent scorers to achieve inter-rater reliability. Content analysis of the video episode, systematic review of literature on science and engineering practices, and solicitation of external expert educator responses establish content validity for VCD. Field test results with three different participant groups who have different levels of engineering education experience offer evidence of construct validity.
Chirico, Nicola; Gramatica, Paola
2011-09-26
The main utility of QSAR models is their ability to predict activities/properties for new chemicals, and this external prediction ability is evaluated by means of various validation criteria. As a measure for such evaluation the OECD guidelines have proposed the predictive squared correlation coefficient Q(2)(F1) (Shi et al.). However, other validation criteria have been proposed by other authors: the Golbraikh-Tropsha method, r(2)(m) (Roy), Q(2)(F2) (Schüürmann et al.), Q(2)(F3) (Consonni et al.). In QSAR studies these measures are usually in accordance, though this is not always the case, thus doubts can arise when contradictory results are obtained. It is likely that none of the aforementioned criteria is the best in every situation, so a comparative study using simulated data sets is proposed here, using threshold values suggested by the proponents or those widely used in QSAR modeling. In addition, a different and simple external validation measure, the concordance correlation coefficient (CCC), is proposed and compared with other criteria. Huge data sets were used to study the general behavior of validation measures, and the concordance correlation coefficient was shown to be the most restrictive. On using simulated data sets of a more realistic size, it was found that CCC was broadly in agreement, about 96% of the time, with other validation measures in accepting models as predictive, and in almost all the examples it was the most precautionary. The proposed concordance correlation coefficient also works well on real data sets, where it seems to be more stable, and helps in making decisions when the validation measures are in conflict. Since it is conceptually simple, and given its stability and restrictiveness, we propose the concordance correlation coefficient as a complementary, or alternative, more prudent measure of a QSAR model to be externally predictive.
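For reference, the concordance correlation coefficient can be computed in a few lines. The sketch assumes the standard (Lin) definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), applied to observed and predicted values for the external set. The function name and example values are hypothetical.

```python
def concordance_correlation(observed, predicted):
    """Concordance correlation coefficient between two equal-length series.

    Uses population (1/n) variances and covariance. Unlike Pearson's r, the
    denominator also penalizes a shift between the two means.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    denom = var_o + var_p + (mean_o - mean_p) ** 2
    if denom == 0:
        raise ValueError("both series are constant and identical")
    return 2 * cov / denom


# Hypothetical values: predictions perfectly correlated with the
# observations but shifted upward by one unit.
observed = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
print(concordance_correlation(observed, observed))
print(concordance_correlation(observed, shifted))
```

The shifted predictions have a Pearson correlation of 1 with the observations, yet their concordance is lower because of the systematic offset, which illustrates why a concordance-based criterion can be more restrictive than correlation-based ones.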
Esbenshade, Adam J; Zhao, Zhiguo; Aftandilian, Catherine; Saab, Raya; Wattier, Rachel L; Beauchemin, Melissa; Miller, Tamara P; Wilkes, Jennifer J; Kelly, Michael J; Fernbach, Alison; Jeng, Michael; Schwartz, Cindy L; Dvorak, Christopher C; Shyr, Yu; Moons, Karl G M; Sulis, Maria-Luisa; Friedman, Debra L
2017-10-01
Pediatric oncology patients are at an increased risk of invasive bacterial infection due to immunosuppression. The risk of such infection in the absence of severe neutropenia (absolute neutrophil count ≥ 500/μL) is not well established and a validated prediction model for blood stream infection (BSI) risk offers clinical usefulness. A 6-site retrospective external validation was conducted using a previously published risk prediction model for BSI in febrile pediatric oncology patients without severe neutropenia: the Esbenshade/Vanderbilt (EsVan) model. A reduced model (EsVan2) excluding 2 less clinically reliable variables also was created using the initial EsVan model derivative cohort, and was validated using all 5 external validation cohorts. One data set was used only in sensitivity analyses due to missing some variables. From the 5 primary data sets, there were a total of 1197 febrile episodes and 76 episodes of bacteremia. The overall C statistic for predicting bacteremia was 0.695, with a calibration slope of 0.50 for the original model and a calibration slope of 1.0 when recalibration was applied to the model. The model performed better in predicting high-risk bacteremia (gram-negative or Staphylococcus aureus infection) versus BSI alone, with a C statistic of 0.801 and a calibration slope of 0.65. The EsVan2 model outperformed the EsVan model across data sets with a C statistic of 0.733 for predicting BSI and a C statistic of 0.841 for high-risk BSI. The results of this external validation demonstrated that the EsVan and EsVan2 models are able to predict BSI across multiple performance sites and, once validated and implemented prospectively, could assist in decision making in clinical practice. Cancer 2017;123:3781-3790. © 2017 American Cancer Society.
Schmidt-Hansen, Mia; Berendse, Sabine; Hamilton, Willie; Baldwin, David R
2017-01-01
Background Lung cancer is the leading cause of cancer deaths. Around 70% of patients first presenting to specialist care have advanced disease, at which point current treatments have little effect on survival. The issue for primary care is how to recognise patients earlier and investigate appropriately. This requires an assessment of the risk of lung cancer. Aim The aim of this study was to systematically review the existing risk prediction tools for patients presenting in primary care with symptoms that may indicate lung cancer. Design and setting Systematic review of primary care data. Method Medline, PreMedline, Embase, the Cochrane Library, Web of Science, and ISI Proceedings (1980 to March 2016) were searched. The final list of included studies was agreed between two of the authors, who also appraised and summarised them. Results Seven studies with between 1482 and 2 406 127 patients were included. The tools were all based on UK primary care data, but differed in complexity of development, number/type of variables examined/included, and outcome time frame. There were four multivariable tools with internal validation area under the curves between 0.88 and 0.92. The tools all had a number of limitations, and none have been externally validated, or had their clinical and cost impact examined. Conclusion There is insufficient evidence for the recommendation of any one of the available risk prediction tools. However, some multivariable tools showed promising discrimination. What is needed to guide clinical practice is both external validation of the existing tools and a comparative study, so that the best tools can be incorporated into clinical decision tools used in primary care. PMID:28483820
Faxén, Jonas; Hall, Marlous; Gale, Chris P; Sundström, Johan; Lindahl, Bertil; Jernberg, Tomas; Szummer, Karolina
2017-12-01
To develop a simple risk-score model for predicting in-hospital cardiac arrest (CA) among patients hospitalized with suspected non-ST elevation acute coronary syndrome (NSTE-ACS). Using the Swedish Web-system for Enhancement and Development of Evidence-based care in Heart disease Evaluated According to Recommended Therapies (SWEDEHEART), we identified patients (n=242 303) admitted with suspected NSTE-ACS between 2008 and 2014. Logistic regression was used to assess the association between 26 candidate variables and in-hospital CA. A risk-score model was developed and validated using a temporal cohort (n=126 073) comprising patients from SWEDEHEART between 2005 and 2007 and an external cohort (n=276 109) comprising patients from the Myocardial Ischaemia National Audit Project (MINAP) between 2008 and 2013. The incidence of in-hospital CA for NSTE-ACS and non-ACS was lower in the SWEDEHEART-derivation cohort than in MINAP (1.3% and 0.5% vs. 2.3% and 2.3%). A seven point, five variable risk score (age ≥60 years (1 point), ST-T abnormalities (2 points), Killip Class >1 (1 point), heart rate <50 or ≥100bpm (1 point), and systolic blood pressure <100mmHg (2 points)) was developed. Model discrimination was good in the derivation cohort (c-statistic 0.72) and temporal validation cohort (c-statistic 0.74), and calibration was reasonable with a tendency towards overestimation of risk with a higher sum of score points. External validation showed moderate discrimination (c-statistic 0.65) and calibration showed a general underestimation of predicted risk. A simple points score containing five variables readily available on admission predicts in-hospital CA for patients with suspected NSTE-ACS. Copyright © 2017 Elsevier B.V. All rights reserved.
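The point assignment described in this abstract can be written out as a short function. This is a sketch of the scoring rule only, using the thresholds and weights exactly as stated above. The function and parameter names are hypothetical, and the abstract does not give the mapping from total points to predicted risk, so none is attempted here.

```python
def cardiac_arrest_risk_points(age_years, st_t_abnormalities, killip_class,
                               heart_rate_bpm, systolic_bp_mmhg):
    """Sum the points of the five-variable score (range 0 to 7)."""
    points = 0
    if age_years >= 60:
        points += 1  # age >= 60 years
    if st_t_abnormalities:
        points += 2  # ST-T abnormalities on the ECG
    if killip_class > 1:
        points += 1  # Killip class above 1
    if heart_rate_bpm < 50 or heart_rate_bpm >= 100:
        points += 1  # heart rate < 50 or >= 100 bpm
    if systolic_bp_mmhg < 100:
        points += 2  # systolic blood pressure < 100 mmHg
    return points


# Hypothetical patients, for illustration only.
print(cardiac_arrest_risk_points(72, True, 2, 110, 90))    # every criterion met
print(cardiac_arrest_risk_points(45, False, 1, 70, 120))   # no criterion met
```

The weights sum to 1 + 2 + 1 + 1 + 2 = 7, matching the "seven point, five variable" description.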
Predicting neutropenia risk in patients with cancer using electronic data.
Pawloski, Pamala A; Thomas, Avis J; Kane, Sheryl; Vazquez-Benitez, Gabriela; Shapiro, Gary R; Lyman, Gary H
2017-04-01
Clinical guidelines recommending the use of myeloid growth factors are largely based on the prescribed chemotherapy regimen. The guidelines suggest that oncologists consider patient-specific characteristics when prescribing granulocyte-colony stimulating factor (G-CSF) prophylaxis; however, a mechanism to quantify individual patient risk is lacking. Readily available electronic health record (EHR) data can provide patient-specific information needed for individualized neutropenia risk estimation. An evidence-based, individualized neutropenia risk estimation algorithm has been developed. This study evaluated the automated extraction of EHR chemotherapy treatment data and externally validated the neutropenia risk prediction model. A retrospective cohort of adult patients with newly diagnosed breast, colorectal, lung, lymphoid, or ovarian cancer who received the first cycle of a cytotoxic chemotherapy regimen from 2008 to 2013 were recruited from a single cancer clinic. Electronically extracted EHR chemotherapy treatment data were validated by chart review. Neutropenia risk stratification was conducted and risk model performance was assessed using calibration and discrimination. Chemotherapy treatment data electronically extracted from the EHR were verified by chart review. The neutropenia risk prediction tool classified 126 patients (57%) as being low risk for febrile neutropenia, 44 (20%) as intermediate risk, and 51 (23%) as high risk. The model was well calibrated (Hosmer-Lemeshow goodness-of-fit test = 0.24). Discrimination was adequate and slightly less than in the original internal validation (c-statistic 0.75 vs 0.81). Chemotherapy treatment data were electronically extracted from the EHR successfully. The individualized neutropenia risk prediction model performed well in our retrospective external cohort. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. 
Zohrabian, Armineh; Philipson, Tomas J
2010-06-01
This paper reviews the evidence on external costs of risky behaviors in the U.S. and provides a framework for estimating them. External costs arise when a person does not bear all the costs of his or her behavior. They provide one of the strongest rationales for government interventions. Although the earlier estimates of external costs no longer have policy relevance, they demonstrated that the existence of external costs was an empirical question. We recommend that the estimates of external costs be updated as insurance structures, environments, and knowledge about these behaviors change. The general aspects of external costs may apply to countries other than the U.S. after taking into account differences in institutional, policy and epidemiological characteristics.
Considerations Underlying the Use of Mixed Group Validation
ERIC Educational Resources Information Center
Jewsbury, Paul A.; Bowden, Stephen C.
2013-01-01
Mixed Group Validation (MGV) is an approach for estimating the diagnostic accuracy of tests. MGV is a promising alternative to the more commonly used Known Groups Validation (KGV) approach for estimating diagnostic accuracy. The advantage of MGV lies in the fact that the approach does not require a perfect external validity criterion or gold…
Al-Namnam, N M N; Hariri, F; Rahman, Z A A
2018-04-13
Our aim was to summarise current published evidence about the prognosis of various techniques of craniofacial distraction osteogenesis, particularly its indications, protocols, and complications. Published papers were acquired from online sources using the keywords "distraction osteogenesis", "Le Fort III", "monobloc", and "syndromic craniosynostosis" in combination with other keywords, such as "craniofacial deformity" and "midface". The search was confined to publications in English, and we followed the guidelines of the PRISMA statement. We found that deformity of the skull resulted mainly from Crouzon syndrome. Recently craniofacial distraction has been achieved by monobloc distraction osteogenesis using an external distraction device during childhood, while Le Fort III distraction osteogenesis was used in maturity. Craniofacial distraction was indicated primarily to correct increased intracranial pressure, exorbitism, and obstructive sleep apnoea in childhood, while midface hypoplasia was the main indication in maturity. Overall the most commonly reported complications were minor inflammatory reactions around the pins, and anticlockwise rotation when using external distraction systems. The mean amount of bony advancement was 12.3mm for an external device, 18.6mm for an internal device and 18.7mm when both external and internal devices were used. Treatment by craniofacial distraction must be validated by long-term studies as adequate data are lacking, particularly about structural relapse and the assessment of function. Copyright © 2018 The British Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
Leontjevas, Ruslan; Gerritsen, Debby L; Koopmans, Raymond T C M; Smalbrugge, Martin; Vernooij-Dassen, Myrra J F J
2012-06-01
A multidisciplinary, evidence-based care program to improve the management of depression in nursing home residents was implemented and tested using a stepped-wedge design in 23 nursing homes (NHs): "Act in case of Depression" (AiD). Before effect analyses, to evaluate AiD process data on sampling quality (recruitment and randomization, reach) and intervention quality (relevance and feasibility, extent to which AiD was performed), which can be used for understanding internal and external validity. In this article, a model is presented that divides process evaluation data into first- and second-order process data. Qualitative and quantitative data based on personal files of residents, interviews of nursing home professionals, and a research database were analyzed according to the following process evaluation components: sampling quality and intervention quality. Nursing home. The pattern of residents' informed consent rates differed for dementia special care units and somatic units during the study. The nursing home staff was satisfied with the AiD program and reported that the program was feasible and relevant. With the exception of the first screening step (nursing staff members using a short observer-based depression scale), AiD components were not performed fully by NH staff as prescribed in the AiD protocol. Although NH staff found the program relevant and feasible and was satisfied with the program content, individual AiD components may have different feasibility. The results on sampling quality implied that statistical analyses of AiD effectiveness should account for the type of unit, whereas the findings on intervention quality implied that, next to the type of unit, analyses should account for the extent to which individual AiD program components were performed. In general, our first-order process data evaluation confirmed internal and external validity of the AiD trial, and this evaluation enabled further statistical fine tuning. 
The importance of evaluating the first-order process data before executing statistical effect analyses is thus underlined. Copyright © 2012 American Medical Directors Association, Inc. Published by Elsevier Inc. All rights reserved.
Haile, Sarah R; Guerra, Beniamino; Soriano, Joan B; Puhan, Milo A
2017-12-21
Prediction models and prognostic scores have been increasingly popular in both clinical practice and clinical research settings, for example to aid in risk-based decision making or control for confounding. In many medical fields, a large number of prognostic scores are available, but practitioners may find it difficult to choose between them due to lack of external validation as well as lack of comparisons between them. Borrowing methodology from network meta-analysis, we describe an approach to Multiple Score Comparison meta-analysis (MSC) which permits concurrent external validation and comparisons of prognostic scores using individual patient data (IPD) arising from a large-scale international collaboration. We describe the challenges in adapting network meta-analysis to the MSC setting, for instance the need to explicitly include correlations between the scores on a cohort level, and how to deal with many multi-score studies. We propose first using IPD to make cohort-level aggregate discrimination or calibration scores, comparing all to a common comparator. Then, standard network meta-analysis techniques can be applied, taking care to consider correlation structures in cohorts with multiple scores. Transitivity, consistency and heterogeneity are also examined. We provide a clinical application, comparing prognostic scores for 3-year mortality in patients with chronic obstructive pulmonary disease using data from a large-scale collaborative initiative. We focus on the discriminative properties of the prognostic scores. Our results show clear differences in performance, with ADO and eBODE showing higher discrimination with respect to mortality than other considered scores. The assumptions of transitivity and local and global consistency were not violated. Heterogeneity was small. We applied a network meta-analytic methodology to externally validate and concurrently compare the prognostic properties of clinical scores. 
Our large-scale external validation indicates that the scores with the best discriminative properties to predict 3 year mortality in patients with COPD are ADO and eBODE.
Cogswell, Alex; Alloy, Lauren B.; Karpinski, Andrew; Grant, David
2011-01-01
The present study addressed convergence between self-report and indirect approaches to assessing dependency. The study was moderately successful in validating an implicit measure, which was found to be reliable, orthogonal to two self-report instruments, and predictive of external criteria. This study also examined discrepancies between scores on self-report and implicit measures, and has implications for their significance. The possibility that discrepancies themselves are pathological was not supported, although discrepancies were associated with particular personality profiles. Finally, this study offered additional evidence for the relation between dependency and depressive symptomatology, and identified implicit dependency as contributing unique variance in predicting past major depression. PMID:20552505
Higher male educational hypergamy: evidence from Portugal.
Correia, Hamilton R
2003-04-01
Studies of human mate choice have been based almost exclusively on stated preferences and personal advertisements, and the external validity of such studies has therefore been questioned. In the present study, real-life matings based on a large representative sample of newly wed couples in 1998 (n=66,598) were analysed according to educational assortative mating. The results demonstrate a strong educational homogamy in this national Portuguese sample. However, men tend to marry women who are slightly more educated than themselves. The results are compared with those of a modern society (US, 1940-87) and a traditional society (Kipsigis, 1952-91). Since educational attainment is strongly associated with social status and intelligence, these results are discussed in an evolutionary perspective.
Rational selection of training and test sets for the development of validated QSAR models
NASA Astrophysics Data System (ADS)
Golbraikh, Alexander; Shen, Min; Xiao, Zhiyan; Xiao, Yun-De; Lee, Kuo-Hsiung; Tropsha, Alexander
2003-02-01
Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using the k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of predictive power of QSAR models.
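The two statistics contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes q2 = 1 - PRESS / (total sum of squares) for leave-one-out predictions and takes the test-set R2 as the squared Pearson correlation between observed and predicted values. The toy kNN regressor on a single descriptor and all function names (`knn_predict`, `loo_q2`, `external_r2`) are hypothetical.

```python
from statistics import mean


def knn_predict(train, x, k=2):
    """Predict the activity at descriptor x as the mean of the k nearest training activities."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def loo_q2(data, k=2):
    """Leave-one-out cross-validated R2 (q2) = 1 - PRESS / total sum of squares."""
    y_bar = mean(y for _, y in data)
    # PRESS: each compound is predicted by a model built without it.
    press = sum(
        (y - knn_predict(data[:i] + data[i + 1:], x, k)) ** 2
        for i, (x, y) in enumerate(data)
    )
    ss_tot = sum((y - y_bar) ** 2 for _, y in data)
    return 1 - press / ss_tot


def external_r2(train, test, k=2):
    """Squared Pearson correlation between observed and predicted test-set activities."""
    obs = [y for _, y in test]
    pred = [knn_predict(train, x, k) for x, _ in test]
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Hypothetical (descriptor, activity) pairs, for illustration only.
training_set = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
test_set = [(0.5, 0.5), (2.5, 2.5), (4.5, 4.5)]
q2 = loo_q2(training_set)
r2 = external_r2(training_set, test_set)
```

Because q2 is computed only from the training set, it can differ from the test-set R2, which is the gap the abstract warns about.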
Miles, Robin; Havstad, Mark; LeBlanc, Mary; ...
2015-09-15
External heat transfer coefficients were measured around a surrogate indirect inertial confinement fusion (ICF) target based on the Laser Inertial Fusion Energy (LIFE) design to validate thermal models of the LIFE target during flight through a fusion chamber. Results indicate that heat transfer coefficients for this target (25-50 W/m²·K) are consistent with theoretically derived heat transfer coefficients and valid for use in calculation of target heating during flight through a fusion chamber.
Esteba-Castillo, Susanna; Torrents-Rodas, David; García-Alba, Javier; Ribas-Vidal, Núria; Novell-Alsina, Ramon
2016-12-21
The Health of the Nation Outcome Scales for People with Learning Disabilities (HoNOS-LD) is a brief instrument that assesses functioning in people with intellectual development disorder and mental health problems/behaviour disorders. The aim of the present study was to examine the evidence on the validity of the scores based on the Spanish version of the HoNOS-LD. The study included 111 participants that were assessed by the Spanish version of the HoNOS-LD and other questionnaires that measured different variables related to the scale. Thirty-three participants were assessed by 2 examiners, and retested 7 days later, in order to study inter-examiner reliability and test-retest reliabilities. Based on clinical and conceptual criteria, and on the results of the parallel analysis, a factorial solution with one factor was selected. Internal consistency was good (Omega coefficient of 0.87). Inter-examiner and test-retest reliabilities were excellent (intraclass correlation coefficients of 0.95 and 0.98, respectively). Correlations between sections of the HoNOS-LD and the related instruments showed the expected direction, and were highly significant (P<.001), and the HoNOS-LD score increased with the intensity of the support required by the participants. These results showed evidence of the validity of association with other external variables. The Spanish version of the HoNOS-LD is a brief, valid and reliable instrument, which will enable a routine assessment of functioning for different uses, including diagnosis and intervention. Copyright © 2016 SEP y SEPB. Publicado por Elsevier España, S.L.U. All rights reserved.
Grossman, Paul
2011-12-01
The Buddhist construct of mindfulness is a central element of mindfulness-based interventions and derives from an age-old systematic phenomenological program to investigate subjective experience. Recent enthusiasm for "mindfulness" in psychology has resulted in proliferation of self-report inventories that purport to measure mindful awareness as a trait. This paper addresses a number of intractable issues regarding these scales, in general, and also specifically highlights vulnerabilities of the adult and adolescent forms of the Mindfulness Attention Awareness Scale. These problems include (a) lack of available external referents for determining the construct validity of these inventories, (b) inadequacy of content validity of measures, (c) lack of evidence that self-reports of mindfulness competencies correspond to actual behavior and evidence that they do not, (d) lack of convergent validity among different mindfulness scales, (e) inequivalence of semantic item interpretation among different groups, (f) response biases related to degree of experience with mindfulness practice, (g) conflation of perceived mindfulness competencies with valuations of importance or meaningfulness, and (h) inappropriateness of samples employed to validate questionnaires. Current self-report attempts to measure mindfulness may serve to denature, distort, and banalize the meaning of mindful awareness in psychological research and may adversely affect further development of mindfulness-based interventions. Opportunities to enrich positivist Western psychological paradigms with a detailed and complex Buddhist phenomenology of the mind are likely to require a depth of understanding of mindfulness that, in turn, depends upon direct and long-term experience with mindfulness practice. Psychologists should consider pursuing this avenue before attempting to characterize and quantify mindfulness.
Melfsen, Andreas; Hartung, Eberhard; Haeussermann, Angelika
2013-02-01
The robustness of in-line raw milk analysis with near-infrared spectroscopy (NIRS) was tested with respect to the prediction of the raw milk contents fat, protein and lactose. Near-infrared (NIR) spectra of raw milk (n = 3119) were acquired on three different farms during the milking process of 354 milkings over a period of six months. Calibration models were calculated for: a random data set of each farm (fully random internal calibration); first two thirds of the visits per farm (internal calibration); whole datasets of two of the three farms (external calibration), and combinations of external and internal datasets. Validation was done either on the remaining data set per farm (internal validation) or on data of the remaining farms (external validation). Excellent calibration results were obtained when fully randomised internal calibration sets were used for milk analysis. In this case, RPD values of around ten, five and three for the prediction of fat, protein and lactose content, respectively, were achieved. Farm internal calibrations achieved much poorer prediction results especially for the prediction of protein and lactose with RPD values of around two and one respectively. The prediction accuracy improved when validation was done on spectra of an external farm, mainly due to the higher sample variation in external calibration sets in terms of feeding diets and individual cow effects. The results showed that further improvements were achieved when additional farm information was added to the calibration set. One of the main requirements towards a robust calibration model is the ability to predict milk constituents in unknown future milk samples. The robustness and quality of prediction increases with increasing variation of, e.g., feeding and cow individual milk composition in the calibration model.
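The abstract reports RPD values without defining them. A minimal sketch, assuming the common convention that RPD is the standard deviation of the reference values divided by the root mean squared error of prediction (RMSEP) on the validation set; the function name and the sample numbers are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import stdev


def rpd(reference, predicted):
    """Ratio of the reference standard deviation to the RMSEP (assumed definition of RPD)."""
    n = len(reference)
    rmsep = sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)
    return stdev(reference) / rmsep


# Hypothetical fat contents (%) from a reference method and from a NIR calibration.
fat_reference = [3.0, 4.0, 5.0]
fat_predicted = [3.1, 3.9, 5.1]
fat_rpd = rpd(fat_reference, fat_predicted)
```

Under this definition, a validation set with wider variation in the reference values yields a higher RPD for the same prediction error, so RPD values are best compared across sets with similar variation.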
Wang, Dong-Yu; Done, Susan J; Mc Cready, David R; Leong, Wey L
2014-07-04
Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). The original training cohort reached a statistically significant difference (p < 0.05) in disease-free survivals between the three CMTC groups after an additional two years of follow-up (median = 55 months). The prognostic value of the triad classification was reproduced in the second independent internal cohort and the new external validation cohort. CMTC achieved even higher prognostic significance when all available patients were analyzed (n = 4,851). Oncogenic pathways Myc, E2F1, Ras and β-catenin were again implicated in the high-risk groups. Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments.
Does rational selection of training and test sets improve the outcome of QSAR modeling?
Martin, Todd M; Harten, Paul; Young, Douglas M; Muratov, Eugene N; Golbraikh, Alexander; Zhu, Hao; Tropsha, Alexander
2012-10-22
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external data set, the best way to validate the predictive ability of a model is to perform its statistical external validation. In statistical external validation, the overall data set is divided into training and test sets. Commonly, this splitting is performed using random division. Rational splitting methods can divide data sets into training and test sets in an intelligent fashion. The purpose of this study was to determine whether rational division methods lead to more predictive models compared to random division. A special data splitting procedure was used to facilitate the comparison between random and rational division methods. For each toxicity end point, the overall data set was divided into a modeling set (80% of the overall set) and an external evaluation set (20% of the overall set) using random division. The modeling set was then subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using rational division methods and by using random division. The Kennard-Stone, minimal test set dissimilarity, and sphere exclusion algorithms were used as the rational division methods. The hierarchical clustering, random forest, and k-nearest neighbor (kNN) methods were used to develop QSAR models based on the training sets. For kNN QSAR, multiple training and test sets were generated, and multiple QSAR models were built. The results of this study indicate that models based on rational division methods generate better statistical results for the test sets than models based on random division, but the predictive power of both types of models is comparable.
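The two-stage splitting described in this abstract can be sketched as below. This is a textbook version of the Kennard-Stone algorithm with Euclidean distances, not the authors' implementation; the function names, the fixed seed, and the rounding of set sizes are assumptions made for illustration.

```python
import random
from math import dist


def kennard_stone(points, n_train):
    """Return (selected, remaining) index lists chosen by Kennard-Stone; needs n_train >= 2."""
    n = len(points)
    # Seed the selection with the two mutually most distant points.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist(points[ij[0]], points[ij[1]]),
    )
    selected = [i0, j0]
    remaining = [i for i in range(n) if i not in (i0, j0)]
    while len(selected) < n_train:
        # Add the candidate that lies farthest from its nearest selected point.
        nxt = max(
            remaining,
            key=lambda i: min(dist(points[i], points[s]) for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


def nested_split(points, seed=0):
    """Random 80/20 modeling/external split, then a Kennard-Stone 80/20 training/test split."""
    rng = random.Random(seed)
    idx = list(range(len(points)))
    rng.shuffle(idx)
    n_eval = round(0.2 * len(idx))
    external_eval, modeling = idx[:n_eval], idx[n_eval:]
    n_train = round(0.8 * len(modeling))
    train_local, test_local = kennard_stone([points[i] for i in modeling], n_train)
    # Map positions within the modeling set back to indices in the full data set.
    train = [modeling[i] for i in train_local]
    test = [modeling[i] for i in test_local]
    return train, test, external_eval


# Hypothetical two-descriptor data set of 25 compounds.
descriptors = [(float(i), float(i % 5)) for i in range(25)]
train_idx, test_idx, external_idx = nested_split(descriptors)
```

With 25 compounds this gives 16 training, 4 test, and 5 external evaluation compounds. The external evaluation set is drawn before any rational selection, so it stays independent of the method used to form the training and test sets.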
Religious belief as compensatory control.
Kay, Aaron C; Gaucher, Danielle; McGregor, Ian; Nash, Kyle
2010-02-01
The authors review experimental evidence that religious conviction can be a defensive source of compensatory control when personal or external sources of control are low. They show evidence that (a) belief in religious deities and secular institutions can serve as external forms of control that can compensate for manipulations that lower personal control and (b) religious conviction can also serve as compensatory personal control after experimental manipulations that lower other forms of personal or external control. The authors review dispositional factors that differentially orient individuals toward external or personal varieties of compensatory control and conclude that compensatory religious conviction can be a flexible source of personal and external control for relief from the anxiety associated with random and uncertain experiences.
Bor, Jacob; Geldsetzer, Pascal; Venkataramani, Atheendar; Bärnighausen, Till
2015-01-01
Purpose of review: Randomized, population-representative trials of clinical interventions are rare. Quasi-experiments have been used successfully to generate causal evidence on the cascade of HIV care in a broad range of real-world settings. Recent findings: Quasi-experiments exploit exogenous, or quasi-random, variation occurring naturally in the world or because of an administrative rule or policy change to estimate causal effects. Well designed quasi-experiments have greater internal validity than typical observational research designs. At the same time, quasi-experiments may also have potential for greater external validity than experiments and can be implemented when randomized clinical trials are infeasible or unethical. Quasi-experimental studies have established the causal effects of HIV testing and initiation of antiretroviral therapy on health, economic outcomes and sexual behaviors, as well as indirect effects on other community members. Recent quasi-experiments have evaluated specific interventions to improve patient performance in the cascade of care, providing causal evidence to optimize clinical management of HIV. Summary: Quasi-experiments have generated important data on the real-world impacts of HIV testing and treatment and on interventions to improve the cascade of care. With the growth in large-scale clinical and administrative data, quasi-experiments enable rigorous evaluation of policies implemented in real-world settings. PMID:26371463
Bor, Jacob; Geldsetzer, Pascal; Venkataramani, Atheendar; Bärnighausen, Till
2015-11-01
Randomized, population-representative trials of clinical interventions are rare. Quasi-experiments have been used successfully to generate causal evidence on the cascade of HIV care in a broad range of real-world settings. Quasi-experiments exploit exogenous, or quasi-random, variation occurring naturally in the world or because of an administrative rule or policy change to estimate causal effects. Well designed quasi-experiments have greater internal validity than typical observational research designs. At the same time, quasi-experiments may also have potential for greater external validity than experiments and can be implemented when randomized clinical trials are infeasible or unethical. Quasi-experimental studies have established the causal effects of HIV testing and initiation of antiretroviral therapy on health, economic outcomes and sexual behaviors, as well as indirect effects on other community members. Recent quasi-experiments have evaluated specific interventions to improve patient performance in the cascade of care, providing causal evidence to optimize clinical management of HIV. Quasi-experiments have generated important data on the real-world impacts of HIV testing and treatment and on interventions to improve the cascade of care. With the growth in large-scale clinical and administrative data, quasi-experiments enable rigorous evaluation of policies implemented in real-world settings.
Saavedra Salinas, Miguel Ángel; Barrera Cruz, Antonio; Cabral Castañeda, Antonio Rafael; Jara Quezada, Luis Javier; Arce-Salinas, C Alejandro; Álvarez Nemegyei, José; Fraga Mouret, Antonio; Orozco Alcalá, Javier; Salazar Páramo, Mario; Cruz Reyes, Claudia Verónica; Andrade Ortega, Lilia; Vera Lastra, Olga Lidia; Mendoza Pinto, Claudia; Sánchez González, Antonio; Cruz Cruz, Polita Del Rocío; Morales Hernández, Sara; Portela Hernández, Margarita; Pérez Cristóbal, Mario; Medina García, Gabriela; Hernández Romero, Noé; Velarde Ochoa, María Del Carmen; Navarro Zarza, José Eduardo; Portillo Díaz, Verónica; Vargas Guerrero, Angélica; Goycochea Robles, María Victoria; García Figueroa, José Luis; Barreira Mercado, Eduardo; Amigo Castañeda, Mary Carmen
2015-01-01
Pregnancy in women with autoimmune rheumatic diseases is associated with several maternal and fetal complications. The development of clinical practice guidelines with the best available scientific evidence may help standardize the care of these patients. To provide recommendations regarding prenatal care, treatment, and a more effective monitoring of pregnancy in women with lupus erythematosus (SLE), rheumatoid arthritis (RA) and antiphospholipid antibody syndrome (APS). Nominal panels were formed for consensus, systematic search of information, development of clinical questions, processing and grading of recommendations, internal validation by peers, and external validation of the final document. The quality criteria of the AGREE II instrument were followed. The various panels answered the 37 questions related to maternal and fetal care in SLE, RA, and APS, as well as to the use of antirheumatic drugs during pregnancy and lactation. The recommendations were discussed and integrated into a final manuscript. Finally, the corresponding algorithms were developed. We present the recommendations for pregnant women with SLE in this first part. We believe that the Mexican clinical practice guidelines for the management of pregnancy in women with SLE integrate the best available evidence for the treatment and follow-up of patients with these conditions. Copyright © 2014 Elsevier España, S.L.U. All rights reserved.
20 CFR 404.727 - Evidence of a deemed valid marriage.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Evidence of a deemed valid marriage. 404.727... DISABILITY INSURANCE (1950- ) Evidence Evidence of Age, Marriage, and Death § 404.727 Evidence of a deemed valid marriage. (a) General. A deemed valid marriage is a ceremonial marriage we consider valid even...
Explanatory Versus Pragmatic Trials: An Essential Concept in Study Design and Interpretation.
Merali, Zamir; Wilson, Jefferson R
2017-11-01
Randomized clinical trials often represent the highest level of clinical evidence available to evaluate the efficacy of an intervention in clinical medicine. Although the process of randomization serves to maximize internal validity, the external validity, or generalizability, of such studies depends on several factors determined at the design phase of the trial including eligibility criteria, study setting, and outcomes of interest. In general, explanatory trials are optimized to demonstrate the efficacy of an intervention in a highly selected patient group; however, findings from these studies may not be generalizable to the larger clinical problem. In contrast, pragmatic trials attempt to understand the real-world benefit of an intervention by incorporating design elements that allow for greater generalizability and clinical applicability of study results. In this article we describe the explanatory-pragmatic continuum for clinical trials in greater detail. Further, a well-accepted tool for grading trials on this continuum is described, and applied, to 2 recently published trials pertaining to the surgical management of lumbar degenerative spondylolisthesis.
Koshy, Anson J.; Watkins, Marley W.; Cassano, Michael C.; Wahlberg, Andrea C.; Mautone, Jennifer A.; Blum, Nathan J.
2013-01-01
Objective: To evaluate the construct validity of the Behavioral Health Checklist (BHCL) for children aged from 4 to 12 years from diverse backgrounds. Method: The parents of 4–12-year-old children completed the BHCL in urban and suburban primary care practices affiliated with a tertiary-care children’s hospital. Across practices, 1,702 were eligible and 1,406 (82.6%) provided consent. Children of participating parents were primarily non-Hispanic black/African American and white/Caucasian from low- to middle-income groups. Confirmatory factor analyses examined model fit for the total sample and subsamples defined by demographic characteristics. Results: The findings supported the hypothesized 3-factor structure: Internalizing Problems, Externalizing Problems, and Inattention/Hyperactivity. The model demonstrated adequate to good fit across age-groups, gender, races, income groups, and suburban versus urban practices. Conclusion: The findings provide strong evidence of the construct validity, developmental appropriateness, and cultural sensitivity of the BHCL when used for screening in primary care. PMID:23978505
Best Practices: How to Evaluate Psychological Science for Use by Organizations
Fiske, Susan T.; Borgida, Eugene
2014-01-01
We discuss how organizations can evaluate psychological science for its potential usefulness to their own purposes. Common sense is often the default but inadequate alternative, and bench-marking supplies only collective hunches instead of validated principles. External validity is an empirical process of identifying moderator variables, not a simple yes-no judgment about whether lab results replicate in the field. Hence, convincing criteria must specify what constitutes high-quality empirical evidence for organizational use. First, we illustrate some theories and science that have potential use. Then we describe generally accepted criteria for scientific quality and consensus, starting with peer review for quality, and scientific agreement in forms ranging from surveys of experts to meta-analyses to National Research Council consensus reports. Linkages of basic science to organizations entail communicating expert scientific consensus, motivating managerial interest, and translating broad principles to specific contexts. We close with parting advice to both sides of the researcher-practitioner divide. PMID:24478533
Ultrasound comparison of external and internal neck anatomy with the LMA Unique.
Lee, Steven M; Wojtczak, Jacek A; Cattano, Davide
2017-12-01
Internal neck anatomy landmarks and their relation after placement of an extraglottic airway device have not been studied extensively by the use of ultrasound. Based on our group experience with external landmarks as well as internal landmarks evaluation with other techniques, we aimed to use ultrasound to analyze the internal neck anatomy landmarks and the related changes due to the placement of the Laryngeal Mask Airway Unique. Observational pilot investigation. Non-obese adult patients with no evidence of airway anomalies were recruited. External neck landmarks were measured based on a validated and standardized method by tape. Eight internal anatomical landmarks, reciprocal by the investigational hypothesis to the external landmarks, were also measured by ultrasound guidance. The internal landmarks were re-measured after optimal placement and inflation of the cuff of the extraglottic airway device (Laryngeal Mask Airway Unique). Six subjects were recruited. Ultrasound measurements of hyoid-mental distance, thyroid-cricoid distance, thyroid height, and thyroid width were found to be significantly (p < 0.05) overestimated using a tape measure. Sagittal neck landmark distances such as thyroid height, sternal-mental distance, and thyroid-cricoid distance significantly decreased after placement of the Laryngeal Mask Airway Unique. The Laryngeal Mask Airway Unique resulted in significant changes in internal neck anatomy. The induced changes and respective specific internal neck anatomy landmarks could help to design devices that would modify their shape according to areas of greatest displacement. Also, while external neck landmark measurements overestimate their respective internal neck landmarks, as we previously reported, the concordance of each measurement and their respective conversion factor could continue to be of help in sizing extraglottic airway devices. Due to the pilot nature of the study, more investigations are warranted.
Chriqui, Jamie F.; Burgeson, Charlene R.; Fisher, Megan C.; Ness, Roberta B.
2013-01-01
Childhood obesity is a serious public health problem, resulting from energy imbalance (when the intake of energy is greater than the amount of energy expended through physical activity). Numerous health authorities have identified policy interventions as promising strategies for creating population-wide improvements in physical activity. This case study focuses on energy expenditure through physical activity (with a particular emphasis on school-based physical education [PE]). Policy-relevant evidence for promoting physical activity in youth may take numerous forms including epidemiologic data and other supporting evidence (e.g., qualitative data). The implementation and evaluation of school PE interventions leads to a set of lessons related to epidemiology and evidence-based policy. These include the need to: 1) enhance the focus on external validity, 2) develop more policy-relevant evidence based on “natural experiments,” 3) understand that policymaking is political, 4) better articulate the factors that influence policy dissemination, 5) understand the real world constraints when implementing policy in school environments, and 6) build transdisciplinary teams for policy progress. The issues described in this case study provide leverage points for practitioners, policy makers, and researchers as they seek to translate epidemiology to policy. PMID:20470970
Shmulewitz, D.; Wall, M.M.; Aharonovich, E.; Spivak, B.; Weizman, A.; Frisch, A.; Grant, B. F.; Hasin, D.
2013-01-01
Background The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) proposes aligning nicotine use disorder (NUD) criteria with those for other substances, by including the current DSM fourth edition (DSM-IV) nicotine dependence (ND) criteria, three abuse criteria (neglect roles, hazardous use, interpersonal problems) and craving. Although NUD criteria indicate one latent trait, evidence is lacking on: (1) validity of each criterion; (2) validity of the criteria as a set; (3) comparative validity between DSM-5 NUD and DSM-IV ND criterion sets; and (4) NUD prevalence. Method Nicotine criteria (DSM-IV ND, abuse and craving) and external validators (e.g. smoking soon after awakening, number of cigarettes per day) were assessed with a structured interview in 734 lifetime smokers from an Israeli household sample. Regression analysis evaluated the association between validators and each criterion. Receiver operating characteristic analysis assessed the association of the validators with the DSM-5 NUD set (number of criteria endorsed) and tested whether DSM-5 or DSM-IV provided the most discriminating criterion set. Changes in prevalence were examined. Results Each DSM-5 NUD criterion was significantly associated with the validators, with strength of associations similar across the criteria. As a set, DSM-5 criteria were significantly associated with the validators, were significantly more discriminating than DSM-IV ND criteria, and led to increased prevalence of binary NUD (two or more criteria) over ND. Conclusions All findings address previous concerns about the DSM-IV nicotine diagnosis and its criteria and support the proposed changes for DSM-5 NUD, which should result in improved diagnosis of nicotine disorders. PMID:23312475
van Gestel, Aukje; Severens, Johan L; Webers, Carroll A B; Beckers, Henny J M; Jansonius, Nomdo M; Schouten, Jan S A G
2010-01-01
Discrete event simulation (DES) modeling has several advantages over simpler modeling techniques in health economics, such as increased flexibility and the ability to model complex systems. Nevertheless, these benefits may come at the cost of reduced transparency, which may compromise the model's face validity and credibility. We aimed to produce a transparent report on the construction and validation of a DES model using a recently developed model of ocular hypertension and glaucoma. Current evidence of associations between prognostic factors and disease progression in ocular hypertension and glaucoma was translated into DES model elements. The model was extended to simulate treatment decisions and effects. Utility and costs were linked to disease status and treatment, and clinical and health economic outcomes were defined. The model was validated at several levels. The soundness of design and the plausibility of input estimates were evaluated in interdisciplinary meetings (face validity). Individual patients were traced throughout the simulation under a multitude of model settings to debug the model, and the model was run with a variety of extreme scenarios to compare the outcomes with prior expectations (internal validity). Finally, several intermediate (clinical) outcomes of the model were compared with those observed in experimental or observational studies (external validity) and the feasibility of evaluating hypothetical treatment strategies was tested. The model performed well in all validity tests. Analyses of hypothetical treatment strategies took about 30 minutes per cohort and led to plausible health-economic outcomes. DES models add value when evaluating complex treatment strategies such as those for glaucoma. Achieving transparency in model structure and outcomes may require some effort in reporting and validating the model, but it is feasible.
Services for Adults With an Autism Spectrum Disorder
Shattuck, Paul T; Roux, Anne M; Hudson, Laura E; Taylor, Julie Lounds; Maenner, Matthew J; Trani, Jean-Francois
2012-01-01
The need for useful evidence about services is increasing as larger numbers of children identified with an autism spectrum disorder (ASD) age toward adulthood. The objective of this review was to characterize the topical and methodological aspects of research on services for supporting success in work, education, and social participation among adults with an ASD and to propose recommendations for moving this area of research forward. We reviewed the literature published in English from 2000 to 2010 and found that the evidence base about services for adults with an ASD is underdeveloped and can be considered a field of inquiry that is relatively unformed. Extant research does not reflect the demographic or impairment heterogeneity of the population, the range of services that adults with autism require to function with purposeful lives in the community, and the need for coordination across service systems and sectors. Future studies must examine issues related to cost and efficiency, given the broader sociopolitical and economic context of service provision. Further, future research needs to consider how demographic and impairment heterogeneity have implications for building an evidence base that will have greater external validity. PMID:22546060
CADASTER QSPR Models for Predictions of Melting and Boiling Points of Perfluorinated Chemicals.
Bhhatarai, Barun; Teetz, Wolfram; Liu, Tao; Öberg, Tomas; Jeliazkova, Nina; Kochev, Nikolay; Pukalov, Ognyan; Tetko, Igor V; Kovarich, Simona; Papa, Ester; Gramatica, Paola
2011-03-14
Quantitative structure property relationship (QSPR) studies on per- and polyfluorinated chemicals (PFCs) on melting point (MP) and boiling point (BP) are presented. The training and prediction chemicals used for developing and validating the models were selected from Syracuse PhysProp database and literatures. The available experimental data sets were split in two different ways: a) random selection on response value, and b) structural similarity verified by self-organizing-map (SOM), in order to propose reliable predictive models, developed only on the training sets and externally verified on the prediction sets. Individual linear and non-linear approaches based models developed by different CADASTER partners on 0D-2D Dragon descriptors, E-state descriptors and fragment based descriptors as well as consensus model and their predictions are presented. In addition, the predictive performance of the developed models was verified on a blind external validation set (EV-set) prepared using PERFORCE database on 15 MP and 25 BP data respectively. This database contains only long chain perfluoro-alkylated chemicals, particularly monitored by regulatory agencies like US-EPA and EU-REACH. QSPR models with internal and external validation on two different external prediction/validation sets and study of applicability-domain highlighting the robustness and high accuracy of the models are discussed. Finally, MPs for additional 303 PFCs and BPs for 271 PFCs were predicted for which experimental measurements are unknown. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sherlock Holmes and child psychopathology assessment approaches: the case of the false-positive.
Jensen, P S; Watanabe, H
1999-02-01
To explore the relative value of various methods of assessing childhood psychopathology, the authors compared 4 groups of children: those who met criteria for one or more DSM diagnoses and scored high on parent symptom checklists, those who met psychopathology criteria on either one of these two assessment approaches alone, and those who met no psychopathology assessment criterion. Parents of 201 children completed the Child Behavior Checklist (CBCL), after which children and parents were administered the Diagnostic Interview Schedule for Children (version 2.1). Children and parents also completed other survey measures and symptom report inventories. The 4 groups of children were compared against "external validators" to examine the merits of "false-positive" and "false-negative" cases. True-positive cases (those that met DSM criteria and scored high on the CBCL) differed significantly from the true-negative cases on most external validators. "False-positive" and "false-negative" cases had intermediate levels of most risk factors and external validators. "False-positive" cases were not normal per se because they scored significantly above the true-negative group on a number of risk factors and external validators. A similar but less marked pattern was noted for "false-negatives." Findings call into question whether cases with high symptom checklist scores despite no formal diagnoses should be considered "false-positive." Pending the availability of robust markers for mental illness, researchers and clinicians must resist the tendency to reify diagnostic categories or to engage in arcane debates about the superiority of one assessment approach over another.
Validation of the DECAF score to predict hospital mortality in acute exacerbations of COPD
Echevarria, C; Steer, J; Heslop-Marshall, K; Stenton, SC; Hickey, PM; Hughes, R; Wijesinghe, M; Harrison, RN; Steen, N; Simpson, AJ; Gibson, GJ; Bourke, SC
2016-01-01
Background Hospitalisation due to acute exacerbations of COPD (AECOPD) is common, and subsequent mortality high. The DECAF score was derived for accurate prediction of mortality and risk stratification to inform patient care. We aimed to validate the DECAF score, internally and externally, and to compare its performance to other predictive tools. Methods The study took place in the two hospitals within the derivation study (internal validation) and in four additional hospitals (external validation) between January 2012 and May 2014. Consecutive admissions were identified by screening admissions and searching coding records. Admission clinical data, including DECAF indices, and mortality were recorded. The prognostic value of DECAF and other scores were assessed by the area under the receiver operator characteristic (AUROC) curve. Results In the internal and external validation cohorts, 880 and 845 patients were recruited. Mean age was 73.1 (SD 10.3) years, 54.3% were female, and mean (SD) FEV1 45.5 (18.3) per cent predicted. Overall mortality was 7.7%. The DECAF AUROC curve for inhospital mortality was 0.83 (95% CI 0.78 to 0.87) in the internal cohort and 0.82 (95% CI 0.77 to 0.87) in the external cohort, and was superior to other prognostic scores for inhospital or 30-day mortality. Conclusions DECAF is a robust predictor of mortality, using indices routinely available on admission. Its generalisability is supported by consistent strong performance; it can identify low-risk patients (DECAF 0–1) potentially suitable for Hospital at Home or early supported discharge services, and high-risk patients (DECAF 3–6) for escalation planning or appropriate early palliation. Trial registration number UKCRN ID 14214. PMID:26769015
Absorption in Sport: A Cross-Validation Study
Koehn, Stefan; Stavrou, Nektarios A. M.; Cogley, Jeremy; Morris, Tony; Mosek, Erez; Watt, Anthony P.
2017-01-01
Absorption has been identified as readiness for experiences of deep involvement in the task. Conceptually, absorption is a key psychological construct, incorporating experiential, cognitive, and motivational components. Although no operationalization of the construct has been provided to facilitate research in this area, the purpose of this research was the development and examination of the psychometric properties of a sport-specific measure of absorption that evolved from the use of the modified Tellegen Absorption Scale (MODTAS; Jamieson, 2005) in mainstream psychology. The study aimed to provide evidence of the psychometric properties, reliability, and validity of the Measure of Absorption in Sport Contexts (MASCs). The psychometric examination included a calibration sample from Scotland and a cross-validation sample from Australia using a cross-sectional design. The item pool was developed based on existing items from the modified Tellegen Absorption Scale (Jamieson, 2005). The MODTAS items were reworded and translated into a sport context. The Scottish sample consisted of 292 participants and the Australian sample of 314 participants. Congeneric model testing and confirmatory factor analysis for both samples and multi-group invariance testing across samples were used. In the cross-validation sample the MASC subscales showed acceptable internal consistency and construct reliability (≥0.70). Excellent fit indices were found for the final 18-item, six-factor measure in the cross-validation sample, χ²(120) = 197.486, p < 0.001; CFI = 0.957; TLI = 0.945; RMSEA = 0.045; SRMR = 0.044. Multi-group invariance testing revealed no differences in item meaning, except for two items. The MASC and the Dispositional Flow Scale-2 showed moderate-to-strong positive correlations in both samples, r = 0.38, p < 0.001 and r = 0.42, p < 0.001, supporting the external validity of the MASC.
This article provides initial evidence in support of the psychometric properties, reliability, and validity of the sport-specific measure of absorption. The MASC provides rich research opportunities in sport psychology that can enhance the theoretical understanding between absorption and related constructs and facilitate future intervention studies. PMID:28883802
van Stiphout, Ruud G P M; Valentini, Vincenzo; Buijsen, Jeroen; Lammering, Guido; Meldolesi, Elisa; van Soest, Johan; Leccisotti, Lucia; Giordano, Alessandro; Gambacorta, Maria A; Dekker, Andre; Lambin, Philippe
2014-11-01
To develop and externally validate a predictive model for pathologic complete response (pCR) for locally advanced rectal cancer (LARC) based on clinical features and early sequential (18)F-FDG PETCT imaging. Prospective data (i.a. THUNDER trial) were used to train (N=112, MAASTRO Clinic) and validate (N=78, Università Cattolica del S. Cuore) the model for pCR (ypT0N0). All patients received long-course chemoradiotherapy (CRT) and surgery. Clinical parameters were age, gender, clinical tumour (cT) stage and clinical nodal (cN) stage. PET parameters were SUVmax, SUVmean, metabolic tumour volume (MTV) and maximal tumour diameter, for which response indices between pre-treatment and intermediate scan were calculated. Using multivariate logistic regression, three probability groups for pCR were defined. The pCR rates were 21.4% (training) and 23.1% (validation). The selected predictive features for pCR were cT-stage, cN-stage, response index of SUVmean and maximal tumour diameter during treatment. The models' performances (AUC) were 0.78 (training) and 0.70 (validation). The high probability group for pCR resulted in 100% correct predictions for training and 67% for validation. The model is available on the website www.predictcancer.org. The developed predictive model for pCR is accurate and externally validated. This model may assist in treatment decisions during CRT to select complete responders for a wait-and-see policy, good responders for extra RT boost and bad responders for additional chemotherapy. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd.. All rights reserved.
Validation of External Corrosion Growth-Rate Using Polarization Resistance and Soil Properties
DOT National Transportation Integrated Search
2010-08-01
The research project evaluated the use of the Linear Polarization Resistance (LPR) and the Electric Resistance (ER) technologies in estimating the external corrosion growth rates of buried steel pipelines. This was achieved by performing laboratory a...
van der Ploeg, Tjeerd; Nieboer, Daan; Steyerberg, Ewout W
2016-10-01
Prediction of medical outcomes may potentially benefit from using modern statistical modeling techniques. We aimed to externally validate modeling strategies for prediction of 6-month mortality of patients suffering from traumatic brain injury (TBI) with predictor sets of increasing complexity. We analyzed individual patient data from 15 different studies including 11,026 TBI patients. We consecutively considered a core set of predictors (age, motor score, and pupillary reactivity), an extended set with computed tomography scan characteristics, and a further extension with two laboratory measurements (glucose and hemoglobin). With each of these sets, we predicted 6-month mortality using default settings with five statistical modeling techniques: logistic regression (LR), classification and regression trees, random forests (RFs), support vector machines (SVM) and neural nets. For external validation, a model developed on one of the 15 data sets was applied to each of the 14 remaining sets. This process was repeated 15 times for a total of 630 validations. The area under the receiver operating characteristic curve (AUC) was used to assess the discriminative ability of the models. For the most complex predictor set, the LR models performed best (median validated AUC value, 0.757), followed by RF and support vector machine models (median validated AUC value, 0.735 and 0.732, respectively). With each predictor set, the classification and regression trees models showed poor performance (median validated AUC value, <0.7). The variability in performance across the studies was smallest for the RF- and LR-based models (inter quartile range for validated AUC values from 0.07 to 0.10). In the area of predicting mortality from TBI, nonlinear and nonadditive effects are not pronounced enough to make modern prediction methods beneficial. Copyright © 2016 Elsevier Inc. All rights reserved.
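The cross-study validation design described above can be sketched as follows. This is only one plausible reading of how the reported total of 630 arises (15 development sets, each applied to the 14 others, for each of the three predictor sets); the abstract does not spell out the breakdown, and the study and predictor-set labels below are hypothetical placeholders.

```python
from itertools import permutations

# Hypothetical labels for the 15 studies and the 3 nested predictor sets.
studies = [f"study_{i}" for i in range(1, 16)]
predictor_sets = ["core", "core+CT", "core+CT+lab"]

# Every ordered (development, validation) pair of distinct studies.
pairs = list(permutations(studies, 2))

# One external validation per pair and predictor set (assumed breakdown).
validations = [(dev, val, ps) for dev, val in pairs for ps in predictor_sets]

print(len(pairs))        # ordered pairs of distinct studies
print(len(validations))  # pairs times predictor sets
```

Under this assumption, the count applies to each modeling technique separately; a model is never validated on the data set it was developed on.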
Egea-Valenzuela, Juan; González Suárez, Begoña; Sierra Bernal, Cristian; Juanmartiñena Fernández, José Francisco; Luján-Sanchís, Marisol; San Juan Acosta, Mileidis; Martínez Andrés, Blanca; Pons Beltrán, Vicente; Sastre Lozano, Violeta; Carretero Ribón, Cristina; de Vera Almenar, Félix; Sánchez Cuenca, Joaquín; Alberca de Las Parras, Fernando; Rodríguez de Miguel, Cristina; Valle Muñoz, Julio; Férnandez-Urién Sainz, Ignacio; Torres González, Carolina; Borque Barrera, Pilar; Pérez-Cuadrado Robles, Enrique; Alonso Lázaro, Noelia; Martínez García, Pilar; Prieto de Frías, César; Carballo Álvarez, Fernando
2018-05-01
Capsule endoscopy (CE) is the first-line investigation in cases of suspected Crohn's disease (CD) of the small bowel, but the factors associated with a higher diagnostic yield remain unclear. Our aim is to develop and validate a scoring index to assess the risk of the patients in this setting on the basis of biomarkers. Data on fecal calprotectin, C-reactive protein, and other biomarkers from a population of 124 patients with suspected CD of the small bowel studied by CE and included in a PhD study were used to build a scoring index. This was first used on this population (internal validation process) and after that on a different set of patients from a multicenter study (external validation process). An index was designed in which every biomarker is assigned a score. Three risk groups have been established (low, intermediate, and high). In the internal validation analysis (124 individuals), patients had a 10, 46.5, and 81% probability of showing inflammatory lesions in CE in the low-risk, intermediate-risk, and high-risk groups, respectively. In the external validation analysis, including 410 patients from 12 Spanish hospitals, this probability was 15.8, 49.7, and 80.6% for the low-risk, intermediate-risk, and high-risk groups, respectively. Results from the internal validation process show that the scoring index is coherent, and results from the external validation process confirm its reliability. This index can be a useful tool for selecting patients before CE studies in cases of suspected CD of the small bowel.
How Sharp is a Unicorn's Horn?
ERIC Educational Resources Information Center
Johnston, Peter H.; Allington, Richard L.
1983-01-01
Criticizes a study of the reliability and validity of curriculum-based reading inventories by L. S. Fuchs, D. Fuchs, and S. L. Deno and raises questions regarding the study's internal and external validity. (AEA)
2013-01-01
Background A Drug Influence Evaluation (DIE) is a formal assessment of an impaired driving suspect, performed by a trained law enforcement officer who uses circumstantial facts, questioning, searching, and a physical exam to form an unstandardized opinion as to whether a suspect’s driving was impaired by drugs. This paper first identifies the scientific studies commonly cited in American criminal trials as evidence of DIE accuracy, and second, uses the QUADAS tool to investigate whether the methodologies used by these studies allow them to correctly quantify the diagnostic accuracy of the DIEs currently administered by US law enforcement. Results Three studies were selected for analysis. For each study, the QUADAS tool identified biases that distorted reported accuracies. The studies were subject to spectrum bias, selection bias, misclassification bias, verification bias, differential verification bias, incorporation bias, and review bias. The studies quantified DIE performance with prevalence-dependent accuracy statistics that are internally but not externally valid. Conclusion The accuracies reported by these studies do not quantify the accuracy of the DIE process now used by US law enforcement. These studies do not validate current DIE practice. PMID:24188398
NASA Astrophysics Data System (ADS)
Correa, M. A.; Bohn, F.
2018-05-01
We perform a theoretical and experimental investigation of the magnetic properties and magnetization dynamics of a ferromagnetic magnetostrictive multilayer grown onto a flexible substrate and subjected to external stress. We calculate the magnetic behavior and magnetoimpedance effect for a trilayered system from an approach that considers a magnetic permeability model for planar geometry and a magnetic free energy density which takes into account induced uniaxial and magnetoelastic anisotropy contributions. We verify remarkable modifications of the magnetic anisotropy with external stress, and we show that the dynamic magnetic response is strongly affected by these changes. We discuss the magnetic features that lead to modifications of the frequency limits where distinct mechanisms are responsible for the magnetoimpedance variations, enabling us to manipulate the resonance fields. To test the robustness of the approach, we directly compare theoretical results with experimental data. Thus, we provide experimental evidence that confirms the validity of the theoretical approach and shows that the resonance fields can be manipulated to tune the magnetoimpedance (MI) response according to real applications in devices.
The Impact of Overreporting on MMPI-2-RF Substantive Scale Score Validity
ERIC Educational Resources Information Center
Burchett, Danielle L.; Ben-Porath, Yossef S.
2010-01-01
This study examined the impact of overreporting on the validity of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) substantive scale scores by comparing correlations with relevant external criteria (i.e., validity coefficients) of individuals who completed the instrument under instructions to (a) feign psychopathology…
Quantifying prognosis with risk predictions.
Pace, Nathan L; Eberhart, Leopold H J; Kranke, Peter R
2012-01-01
Prognosis is a forecast, based on present observations in a patient, of their probable outcome from disease, surgery and so on. Research methods for the development of risk probabilities may not be familiar to some anaesthesiologists. We briefly describe methods for identifying risk factors and risk scores. A probability prediction rule assigns a risk probability to a patient for the occurrence of a specific event. Probability reflects the continuum between absolute certainty (Pi = 1) and certified impossibility (Pi = 0). Biomarkers and clinical covariates that modify risk are known as risk factors. The Pi as modified by risk factors can be estimated by identifying the risk factors and their weighting; these are usually obtained by stepwise logistic regression. The accuracy of probabilistic predictors can be separated into the concepts of 'overall performance', 'discrimination' and 'calibration'. Overall performance is the mathematical distance between predictions and outcomes. Discrimination is the ability of the predictor to rank order observations with different outcomes. Calibration is the correctness of prediction probabilities on an absolute scale. Statistical methods include the Brier score, coefficient of determination (Nagelkerke R2), C-statistic and regression calibration. External validation is the comparison of the actual outcomes to the predicted outcomes in a new and independent patient sample. External validation uses the statistical methods of overall performance, discrimination and calibration and is uniformly recommended before acceptance of the prediction model. Evidence from randomised controlled clinical trials should be obtained to show the effectiveness of risk scores for altering patient management and patient outcomes.
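The overall-performance and discrimination measures named above can be illustrated with a minimal sketch. It assumes binary outcomes coded 0/1 and computes the Brier score as the mean squared difference between predicted probability and outcome, and the C-statistic as the share of (event, non-event) pairs in which the event received the higher predicted probability. The function names and the toy data are hypothetical and are not taken from the article.

```python
from itertools import product


def brier_score(probs, outcomes):
    """Mean squared distance between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def c_statistic(probs, outcomes):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0
        elif p_event == p_non_event:
            score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical toy data: four patients, two of whom had the event.
probs = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]

print(f"Brier score: {brier_score(probs, outcomes):.3f}")
print(f"C-statistic: {c_statistic(probs, outcomes):.2f}")
```

A lower Brier score indicates better overall performance, while a C-statistic of 0.5 corresponds to ranking no better than chance. Calibration is a separate property and is not assessed by either function.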
Identification of Distinct Psychosis Biotypes Using Brain-Based Biomarkers.
Clementz, Brett A; Sweeney, John A; Hamm, Jordan P; Ivleva, Elena I; Ethridge, Lauren E; Pearlson, Godfrey D; Keshavan, Matcheri S; Tamminga, Carol A
2016-04-01
Clinical phenomenology remains the primary means for classifying psychoses despite considerable evidence that this method incompletely captures biologically meaningful differentiations. Rather than relying on clinical diagnoses as the gold standard, this project drew on neurobiological heterogeneity among psychosis cases to delineate subgroups independent of their phenomenological manifestations. A large biomarker panel (neuropsychological, stop signal, saccadic control, and auditory stimulation paradigms) characterizing diverse aspects of brain function was collected on individuals with schizophrenia, schizoaffective disorder, and bipolar disorder with psychosis (N=711), their first-degree relatives (N=883), and demographically comparable healthy subjects (N=278). Biomarker variance across paradigms was exploited to create nine integrated variables that were used to capture neurobiological variance among the psychosis cases. Data on external validating measures (social functioning, structural magnetic resonance imaging, family biomarkers, and clinical information) were collected. Multivariate taxometric analyses identified three neurobiologically distinct psychosis biotypes that did not respect clinical diagnosis boundaries. The same analysis procedure using clinical DSM diagnoses as the criteria was best described by a single severity continuum (schizophrenia worse than schizoaffective disorder worse than bipolar psychosis); this was not the case for biotypes. The external validating measures supported the distinctiveness of these subgroups compared with clinical diagnosis, highlighting a possible advantage of neurobiological versus clinical categorization schemes for differentiating psychotic disorders. 
These data illustrate how multiple pathways may lead to clinically similar psychosis manifestations, and they provide explanations for the marked heterogeneity observed across laboratories on the same biomarker variables when DSM diagnoses are used as the gold standard.
Validation of a dynamic linked segment model to calculate joint moments in lifting.
de Looze, M P; Kingma, I; Bussmann, J B; Toussaint, H M
1992-08-01
A two-dimensional dynamic linked segment model was constructed and applied to a lifting activity. Reactive forces and moments were calculated by an instantaneous approach involving the application of Newtonian mechanics to individual adjacent rigid segments in succession. The analysis started once at the feet and once at a hands/load segment. The model was validated by comparing predicted external forces and moments at the feet or at a hands/load segment to actual values, which were simultaneously measured (ground reaction force at the feet) or assumed to be zero (external moments at feet and hands/load and external forces, beside gravitation, at hands/load). In addition, results of both procedures, in terms of joint moments, including the moment at the intervertebral disc between the fifth lumbar and first sacral vertebra (L5-S1), were compared. A correlation of r = 0.88 between calculated and measured vertical ground reaction forces was found. The calculated external forces and moments at the hands showed only minor deviations from the expected zero level. The moments at L5-S1, calculated starting from feet compared to starting from hands/load, yielded a coefficient of correlation of r = 0.99. However, moments calculated from hands/load were 3.6% (averaged values) and 10.9% (peak values) higher. This difference is assumed to be due mainly to erroneous estimations of the positions of centres of gravity and joint rotation centres. The estimation of the location of L5-S1 rotation axis can affect the results significantly. Despite the numerous studies estimating the load on the low back during lifting on the basis of linked segment models, only a few attempts to validate these models have been made. This study is concerned with the validity of the presented linked segment model. The results support the model's validity. Effects of several sources of error threatening the validity are discussed. Copyright © 1992. Published by Elsevier Ltd.
Construction and validation of a Tamil logMAR chart.
Varadharajan, Srinivasa; Srinivasan, Krithica; Kumaresan, Brindha
2009-09-01
To design, construct and validate a new Tamil logMAR visual acuity chart based on current recommendations. Ten Tamil letters of equal legibility were identified experimentally and were used in the chart. Two charts, one internally illuminated and one externally illuminated, were constructed for testing at 4 m distance. The repeatability of the two charts was tested. For validation, the two charts were compared with a standard English logMAR chart (ETDRS). When compared to the ETDRS chart, a difference of 0.06 +/- 0.07 and 0.07 +/- 0.07 logMAR was found for the internally and externally illuminated charts respectively. Limits of agreement between the internally illuminated Tamil logMAR chart and ETDRS chart were found to be (-0.08, 0.19), and (-0.07, 0.20) for the externally illuminated chart. The test - retest results showed a difference of 0.02 +/- 0.04 and 0.02 +/- 0.06 logMAR for the internally and externally illuminated charts respectively. Limits of agreement for repeated measurements for the internally illuminated Tamil logMAR chart were found to be (-0.06, 0.10), and (-0.10, 0.14) for the externally illuminated chart. The newly constructed Tamil logMAR charts have good repeatability. The difference in visual acuity scores between the newly constructed Tamil logMAR chart and the standard English logMAR chart was within acceptable limits. This new chart can be used for measuring visual acuity in the literate Tamil population.
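The limits of agreement reported above can be approximated from the summary statistics. The sketch below assumes the conventional Bland-Altman definition (mean difference ± 1.96 × SD of the differences); the abstract does not state which multiplier was used, and the function name is hypothetical.

```python
def limits_from_summary(mean_diff, sd_diff, z=1.96):
    """Approximate 95% limits of agreement from a mean difference and its SD.

    Assumes the Bland-Altman convention: mean_diff +/- z * sd_diff.
    """
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff


# Test-retest differences reported in the abstract (logMAR).
internal = limits_from_summary(0.02, 0.04)  # internally illuminated chart
external = limits_from_summary(0.02, 0.06)  # externally illuminated chart

print([round(v, 2) for v in internal])
print([round(v, 2) for v in external])
```

Under this assumption the test-retest values round to the reported limits. The chart-comparison limits computed the same way differ from the reported ones by about 0.01 at the upper bound, which may reflect rounding of the published means and SDs.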
2015-06-12
Threats to Validity and Biases ... draw conclusions and make recommendations for future research. Threats to Validity and Biases: There are several issues that pose a threat to ... validity and bias to the research. Threats to validity affect the accuracy of the research and soundness of the conclusion. Threats to external validity
Küçükdeveci, Ayse A; Sahin, Hülya; Ataman, Sebnem; Griffiths, Bridget; Tennant, Alan
2004-02-15
Guidelines have been established for cross-cultural adaptation of outcome measures. However, invariance across cultures must also be demonstrated through analysis of Differential Item Functioning (DIF). This is tested in the context of a Turkish adaptation of the Health Assessment Questionnaire (HAQ). Internal construct validity of the adapted HAQ is assessed by Rasch analysis; reliability, by internal consistency and the intraclass correlation coefficient; external construct validity, by association with impairments and American College of Rheumatology functional stages. Cross-cultural validity is tested through DIF by comparison with data from the UK version of the HAQ. The adapted version of the HAQ demonstrated good internal construct validity through fit of the data to the Rasch model (mean item fit 0.205; SD 0.998). Reliability was excellent (alpha = 0.97) and external construct validity was confirmed by expected associations. DIF for culture was found in only 1 item. Cross-cultural validity was found to be sufficient for use in international studies between the UK and Turkey. Future adaptation of instruments should include analysis of DIF at the field testing stage in the adaptation process.
Carosella, Victorio C; Navia, Jose L; Al-Ruzzeh, Sharif; Grancelli, Hugo; Rodriguez, Walter; Cardenas, Cesar; Bilbao, Jorge; Nojek, Carlos
2009-08-01
This study aims to develop the first Latin-American risk model that can be used as a simple, pocket-card graphic score at bedside. The risk model was developed on 2903 patients who underwent cardiac surgery at the Spanish Hospital of Buenos Aires, Argentina, between June 1994 and December 1999. Internal validation was performed on 708 patients between January 2000 and June 2001 at the same center. External validation was performed on 1087 patients between February 2000 and January 2007 at three other centers in Argentina. In the development dataset the area under receiver operating characteristics (ROC) curve was 0.73 and the Hosmer-Lemeshow (HL) test was P=0.88. In the internal validation ROC curve was 0.77. In the external validation ROC curve was 0.81, but imperfect calibration was detected because the observed in-hospital mortality (3.96%) was significantly lower than the development dataset (8.20%) (P<0.0001). Recalibration was done in 2007, showing excellent level of agreement between the observed and predicted mortality rates on all patients (P=0.92). This is the first risk model for cardiac surgery developed in a population of Latin-America with both internal and external validation. A simple graphic pocket-card score allows an easy bedside application with acceptable statistic precision.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsao, May N.; Mehta, Minesh P.; Whelan, Timothy J.
2005-09-01
Purpose: To systematically review the evidence for the use of stereotactic radiosurgery or stereotactic fractionated radiation therapy in adult patients with malignant glioma. Methods: Key clinical questions to be addressed in this evidence-based review were identified. Outcomes considered were overall survival, quality of life or symptom control, brain tumor control or response and toxicity. MEDLINE (1990-2004 June Week 2), CANCERLIT (1990-2003), CINAHL (1990-2004 June Week 2), EMBASE (1990-2004 Week 25), and the Cochrane library (2004 issue 2) databases were searched using OVID. In addition, the Physician Data Query clinical trials database, the proceedings of the American Society of Clinical Oncology (1997-2004), ASTRO (1997-2004), and the European Society of Therapeutic Radiology and Oncology (ESTRO) (1997-2003) were searched. Data from the literature search were reviewed and tabulated. This process included an assessment of the level of evidence. Results: For patients with newly diagnosed malignant glioma, radiosurgery as boost therapy with conventional external beam radiation was examined in one randomized trial, five prospective cohort studies, and seven retrospective series. There is Level I evidence that the use of radiosurgery boost followed by external beam radiotherapy and carmustine (BCNU) does not confer benefit with respect to overall survival, quality of life, or patterns of failure as compared with external beam radiotherapy and BCNU. There is Level I-III evidence of toxicity associated with radiosurgery boost as compared with external beam radiotherapy alone. The results of the prospective and retrospective studies may be influenced by selection bias. Radiosurgery used as salvage for recurrent or progressive malignant glioma after conventional external beam radiotherapy failure was reported in zero randomized trials, three prospective cohort studies, and five retrospective series.
The available data are sparse and insufficient to make absolute recommendations. Stereotactic fractionated radiation therapy has been reported as boost therapy with external beam radiotherapy for patients with newly diagnosed malignant glioma in only three prospective studies. As primary therapy alone without conventional external beam radiotherapy for newly diagnosed malignant glioma patients, stereotactic fractionated radiation therapy has been reported in only one prospective study. There were only three prospective series and two retrospective studies reported for patients with recurrent or progressive malignant glioma. Conclusions: For patients with malignant glioma, there is Level I-III evidence that the use of radiosurgery boost followed by external beam radiotherapy and BCNU does not confer benefit in terms of overall survival, local brain control, or quality of life as compared with external beam radiotherapy and BCNU. The use of radiosurgery boost is associated with increased toxicity. For patients with malignant glioma, there is insufficient evidence regarding the benefits/harms of using radiosurgery at the time of progression or recurrence. There is also insufficient evidence regarding the benefits/harms in the use of stereotactic fractionated radiation therapy for patients with newly diagnosed or progressive/recurrent malignant glioma.
[What else is Evidence-based Medicine?].
Hauswaldt, Johannes
2010-01-01
The practice of evidence-based medicine means integrating individual clinical expertise with the best available external clinical evidence. Strangely enough, scientific discussion focuses on external evidence from systematic research, but neglects its counterpart, i.e., individual clinical expertise. Apart from a lack of appropriate intellectual tools for approaching the latter, this might be due to the mutual concealment of thought and action, of sensor and motor activity (Viktor von Weizsaecker's principle of the revolving door). Behind this, and incommensurably different from each other, lie the world of physics and the world of biology with an ego animal, that is, the dilemma of the self-conscious subject in a world of objects. When practicing medicine, this dilemma of self-reference is being resolved but only through a holistic approach combining rational and external evidence with biographical, spiritual, emotional and pre-rational elements represented in the physician's individual clinical expertise. Copyright © 2010. Published by Elsevier GmbH.
78 FR 1162 - Cardiovascular Devices; Reclassification of External Cardiac Compressor
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-08
... safety and electromagnetic compatibility; For devices containing software, software verification... electromagnetic compatibility; For devices containing software, software verification, validation, and hazard... electrical components, appropriate analysis and testing must validate electrical safety and electromagnetic...
Tsugawa, Yusuke; Ohbu, Sadayoshi; Cruess, Richard; Cruess, Sylvia; Okubo, Tomoya; Takahashi, Osamu; Tokuda, Yasuharu; Heist, Brian S; Bito, Seiji; Itoh, Toshiyuki; Aoki, Akiko; Chiba, Tsutomu; Fukui, Tsuguya
2011-08-01
Despite the growing importance of and interest in medical professionalism, there is no standardized tool for its measurement. The authors sought to verify the validity, reliability, and generalizability of the Professionalism Mini-Evaluation Exercise (P-MEX), a previously developed and tested tool, in the context of Japanese hospitals. A multicenter, cross-sectional evaluation study was performed to investigate the validity, reliability, and generalizability of the P-MEX in seven Japanese hospitals. In 2009-2010, 378 evaluators (attending physicians, nurses, peers, and junior residents) completed 360-degree assessments of 165 residents and fellows using the P-MEX. The content validity and criterion-related validity were examined, and the construct validity of the P-MEX was investigated by performing confirmatory factor analysis through a structural equation model. The reliability was tested using generalizability analysis. The contents of the P-MEX achieved good acceptance in a preliminary working group, and the poststudy survey revealed that 302 (79.9%) evaluators rated the P-MEX items as appropriate, indicating good content validity. The correlation coefficient between P-MEX scores and external criteria was 0.78 (P < .001), demonstrating good criterion-related validity. Confirmatory factor analysis verified high path coefficient (0.60-0.99) and adequate goodness of fit of the model. The generalizability analysis yielded a high dependability coefficient, suggesting good reliability, except when evaluators were peers or junior residents. Findings show evidence of adequate validity, reliability, and generalizability of the P-MEX in Japanese hospital settings. The P-MEX is the only evaluation tool for medical professionalism verified in both a Western and East Asian cultural context.
Preference on cash-choice task predicts externalizing outcomes in 17-year-olds.
Sparks, Jordan C; Isen, Joshua D; Iacono, William G
2014-03-01
Delay-discounting, the tendency to prefer a smaller-sooner reward to a larger-later reward, has been associated with a range of externalizing behaviors. Laboratory delay-discounting tasks have emerged as a useful measure to index impulsivity and a proclivity towards externalizing psychopathology. While many studies demonstrate the existence of a latent externalizing factor that is heritable, there have been few genetic studies of delay-discounting. Further, the increased vulnerability for risky behavior in adolescence makes adolescent samples an attractive target for future research, and expeditious, ecologically-valid delay-discounting measures are helpful in this regard. The primary goal of this study was to help validate the utility of a "cash-choice" measure for use in a sample of older adolescents. We used a sample of 17-year-old twins (n = 791) from the Minnesota Twin Family Enrichment study. Individuals who chose the smaller-sooner reward were more likely to have used a range of addictive substances, engaged in sexual intercourse, and earned lower GPAs. Best fitting biometric models from univariate analyses supported the heritability of cash-choice and externalizing, but bivariate modeling results indicated that the correlation between cash-choice and externalizing was determined largely by shared environmental influences, thus failing to support cash-choice as a possible endophenotype for externalizing in this age group. Our findings lend further support to the utility of cash-choice as a measure of individual differences in decision making and suggest that, by late adolescence, this task indexes shared environmental risk for externalizing behavior.
20 CFR 404.725 - Evidence of a valid ceremonial marriage.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Evidence of a valid ceremonial marriage. 404... DISABILITY INSURANCE (1950- ) Evidence Evidence of Age, Marriage, and Death § 404.725 Evidence of a valid ceremonial marriage. (a) General. A valid ceremonial marriage is one that follows procedures set by law in...
Hengartner, Michael P; Graf, Markus; Schreiber, Marc
2017-05-01
There is increasing interest in the construct validity of higher-order domains of the Big Five personality traits. A total of 831 persons from the Swiss population completed the International Personality Item Pool and an adaptation of the Positive and Negative Affect Scales. Using Goldberg's bass-ackwards method, we found evidence for the general factor of personality (GFP) and the two meta-traits of positive emotionality (blend of low neuroticism and high extraversion) and constraint (blend of high agreeableness and conscientiousness). In association with positive affect, the explanatory power of the GFP (r = 0.43) and positive emotionality (r = 0.37) was largely superior to extraversion (r = 0.24), conscientiousness (r = 0.18), agreeableness (r = 0.09) and openness (r = 0.04), although not neuroticism (r = -0.34). In association with negative affect, neuroticism (r = 0.41), the GFP (r = -0.36) and positive emotionality (r = -0.35) were the most powerful single predictors. We conclude that the higher-order structure of personality is best explained by the meta-traits of positive emotionality and constraint, which correspond closely to the well-established superfactors of internalizing and externalizing. We further demonstrate that these have substantial criterion validity when broad positive and negative affect is the outcome of interest. These findings help to relate Big Five meta-traits to pathological personality. Copyright © 2017 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Wing, Coady; Bello-Gomez, Ricardo A.
2018-01-01
Treatment effect estimates from a "regression discontinuity design" (RDD) have high internal validity. However, the arguments that support the design apply to a subpopulation that is narrower and usually different from the population of substantive interest in evaluation research. The disconnect between RDD population and the…
Helping Students Evaluate the Validity of a Research Study.
ERIC Educational Resources Information Center
Morgan, George A.; Gliner, Jeffrey A.
Students often have difficulty in evaluating the validity of a study. A conceptually and linguistically meaningful framework for evaluating research studies is proposed that is based on the discussion of internal and external validity of T. D. Cook and D. T. Campbell (1979). The proposal includes six key dimensions, three related to internal…
42 CFR 438.358 - Activities related to external quality review.
Code of Federal Regulations, 2012 CFR
2012-10-01
...) Validation of performance improvement projects required by the State to comply with requirements set forth in § 438.240(b)(1) and that were underway during the preceding 12 months. (2) Validation of MCO or PIHP... derived during the preceding 12 months from the following optional activities: (1) Validation of encounter...
42 CFR 438.358 - Activities related to external quality review.
Code of Federal Regulations, 2014 CFR
2014-10-01
...) Validation of performance improvement projects required by the State to comply with requirements set forth in § 438.240(b)(1) and that were underway during the preceding 12 months. (2) Validation of MCO or PIHP... derived during the preceding 12 months from the following optional activities: (1) Validation of encounter...
42 CFR 438.358 - Activities related to external quality review.
Code of Federal Regulations, 2011 CFR
2011-10-01
...) Validation of performance improvement projects required by the State to comply with requirements set forth in § 438.240(b)(1) and that were underway during the preceding 12 months. (2) Validation of MCO or PIHP... derived during the preceding 12 months from the following optional activities: (1) Validation of encounter...
42 CFR 438.358 - Activities related to external quality review.
Code of Federal Regulations, 2013 CFR
2013-10-01
...) Validation of performance improvement projects required by the State to comply with requirements set forth in § 438.240(b)(1) and that were underway during the preceding 12 months. (2) Validation of MCO or PIHP... derived during the preceding 12 months from the following optional activities: (1) Validation of encounter...
Translational Medicine Guide transforms drug development processes: the recent Merck experience.
Dolgos, Hugues; Trusheim, Mark; Gross, Dietmar; Halle, Joern-Peter; Ogden, Janet; Osterwalder, Bruno; Sedman, Ewen; Rossetti, Luciano
2016-03-01
Merck is implementing a question-based Translational Medicine Guide (TxM Guide) beginning as early as lead optimization into its stage-gate drug development process. Initial experiences with the TxM Guide, which is embedded into an integrated development plan tailored to each development program, demonstrated opportunities to improve target understanding, dose setting (i.e., therapeutic index), and patient subpopulation selection with more robust and relevant early human-based evidence, and increased use of biomarkers and simulations. The TxM Guide is also helping improve organizational learning, costs, and governance. It has also shown the need for stronger external resources for validating biomarkers, demonstrating clinical utility, tracking natural disease history, and biobanking. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Igual, Laura; Soliva, Joan Carles; Escalera, Sergio; Gimeno, Roger; Vilarroya, Oscar; Radeva, Petia
2012-12-01
We present a fully automatic diagnostic imaging test for Attention-Deficit/Hyperactivity Disorder diagnosis assistance based on previously found evidences of caudate nucleus volumetric abnormalities. The proposed method consists of different steps: a new automatic method for external and internal segmentation of caudate based on Machine Learning methodologies; the definition of a set of new volume relation features, 3D Dissociated Dipoles, used for caudate representation and classification. We separately validate the contributions using real data from a pediatric population and show precise internal caudate segmentation and discrimination power of the diagnostic test, showing significant performance improvements in comparison to other state-of-the-art methods. Copyright © 2012 Elsevier Ltd. All rights reserved.
Quantitative 1H NMR: Development and Potential of an Analytical Method – an Update
Pauli, Guido F.; Gödecke, Tanja; Jaki, Birgit U.; Lankin, David C.
2012-01-01
Covering the literature from mid-2004 until the end of 2011, this review continues a previous literature overview on quantitative 1H NMR (qHNMR) methodology and its applications in the analysis of natural products (NPs). Among the foremost advantages of qHNMR is its accurate function with external calibration, the lack of any requirement for identical reference materials, a high precision and accuracy when properly validated, and an ability to quantitate multiple analytes simultaneously. As a result of the inclusion of over 170 new references, this updated review summarizes a wealth of detailed experiential evidence and newly developed methodology that supports qHNMR as a valuable and unbiased analytical tool for natural product and other areas of research. PMID:22482996
Validation of the Dutch Eating Behaviour Questionnaire (DEBQ) among Maltese women.
Dutton, Elaine; Dovey, Terence M
2016-12-01
The main aim of this study was to assess the dimensional structure of the Maltese version of the Dutch Eating Behaviour Questionnaire (DEBQ) and evaluate the instrument's validity and reliability among Maltese women (N = 586). Exploratory factor analysis reflected the theoretical structure of three factors; emotional, restrained and external eating which was supported by a Confirmatory Factor analysis. Minor issues with specific items in the Emotional and External eating scale were identified and discussed. Criterion-related validity was ascertained through correlations with the EAT-26. The study also assessed the DEBQ's predictive value in differentiating between BMI groups and between dieters and weight maintainers. The results suggest that the Maltese DEBQ is a psychometrically valid and reliable instrument for assessing eating behaviours with women in the Maltese community. The study also highlights the critical role of Emotional and Restrained eating in dieting and overweight Maltese women. Copyright © 2016 Elsevier Ltd. All rights reserved.
Parvizi, Javad; Tan, Timothy L; Goswami, Karan; Higuera, Carlos; Della Valle, Craig; Chen, Antonia F; Shohat, Noam
2018-05-01
The introduction of the Musculoskeletal Infection Society (MSIS) criteria for periprosthetic joint infection (PJI) in 2011 resulted in improvements in diagnostic confidence and research collaboration. The emergence of new diagnostic tests and the lessons we have learned from the past 7 years using the MSIS definition, prompted us to develop an evidence-based and validated updated version of the criteria. This multi-institutional study of patients undergoing revision total joint arthroplasty was conducted at 3 academic centers. For the development of the new diagnostic criteria, PJI and aseptic patient cohorts were stringently defined: PJI cases were defined using only major criteria from the MSIS definition (n = 684) and aseptic cases underwent one-stage revision for a noninfective indication and did not fail within 2 years (n = 820). Serum C-reactive protein (CRP), D-dimer, erythrocyte sedimentation rate were investigated, as well as synovial white blood cell count, polymorphonuclear percentage, leukocyte esterase, alpha-defensin, and synovial CRP. Intraoperative findings included frozen section, presence of purulence, and isolation of a pathogen by culture. A stepwise approach using random forest analysis and multivariate regression was used to generate relative weights for each diagnostic marker. Preoperative and intraoperative definitions were created based on beta coefficients. The new definition was then validated on an external cohort of 222 patients with PJI who subsequently failed with reinfection and 200 aseptic patients. The performance of the new criteria was compared to the established MSIS and the prior International Consensus Meeting definitions. Two positive cultures or the presence of a sinus tract were considered as major criteria and diagnostic of PJI. The calculated weights of an elevated serum CRP (>1 mg/dL), D-dimer (>860 ng/mL), and erythrocyte sedimentation rate (>30 mm/h) were 2, 2, and 1 points, respectively. 
Furthermore, elevated synovial fluid white blood cell count (>3000 cells/μL), alpha-defensin (signal-to-cutoff ratio >1), leukocyte esterase (++), polymorphonuclear percentage (>80%), and synovial CRP (>6.9 mg/L) received 3, 3, 3, 2, and 1 points, respectively. Patients with an aggregate score of greater than or equal to 6 were considered infected, while a score between 2 and 5 required the inclusion of intraoperative findings for confirming or refuting the diagnosis. Intraoperative findings of positive histology, purulence, and single positive culture were assigned 3, 3, and 2 points, respectively. Combined with the preoperative score, a total of greater than or equal to 6 was considered infected, a score between 4 and 5 was inconclusive, and a score of 3 or less was not infected. The new criteria demonstrated a higher sensitivity of 97.7% compared to the MSIS (79.3%) and International Consensus Meeting definition (86.9%), with a similar specificity of 99.5%. This study offers an evidence-based definition for diagnosing hip and knee PJI, which has shown excellent performance on formal external validation. Copyright © 2018 Elsevier Inc. All rights reserved.
Collins, Dave; Carson, Howie J; Toner, John
2016-01-01
Abdollahipour, Wulf, Psotta, and Nieto (2015) recently published data in the Journal of Sports Sciences to show that an external focus of attention promotes superior performance effects (gymnastics jump height and judged movement form score) when compared to internal or control foci during skill execution without an implement involved. While we do not contest the veracity of findings reported, nor others that have been used to support beneficial effects of an external focus of attention, in this Letter to the Editor we comment on considerable methodological limitations associated with this and previous studies that, we suggest, have resulted in serious theoretical oversights regarding the control of movement and, most crucially from our practitioner perspective, suboptimal recommendations for applied coaching practice. Specifically, we discuss the lack of consideration towards translational research in this area, the problematic nature of attentional focus cues employed, interpretation of findings in relation to other applied recommendations and coherence with mechanistic underpinning and, finally, the representative nature of task involved. In summary, while (laboratory) research evidence may appear to be conclusive, we suggest that the focus of attention effects are in need of more ecologically valid and rigorous testing as well as consideration of current coaching practices if it is to optimally serve the applied sporting domain that it purportedly aims to.
Validation and Adjustment of the Leipzig-Halifax Acute Aortic Dissection Type A Scorecard.
Mejàre-Berggren, Hanna; Olsson, Christian
2017-11-01
The novel Leipzig-Halifax (LH) scorecard for acute aortic dissection type A (AADA) stratifies risk of in-hospital death based on age, malperfusion syndromes, critical preoperative state, and coronary disease. The study aim was to externally validate the LH scorecard performance and, if adequate, propose adjustments. All consecutive AADA patients operated on from 1996 to 2016 (n = 509) were included to generate an external validation cohort. Variables related to in-hospital death were analyzed using univariable and multivariable analysis. The LH scorecard was applied to the validation cohort, compared with the original study, and variable selection was adjusted using validation measures for discrimination and calibration. In-hospital mortality rate was 17.7% (LH cohort 18.7%). Critical preoperative state and Penn class non-Aa were independent predictors (odds ratio [OR] 2.42 and 2.45, respectively) of in-hospital death. The LH scorecard was adjusted to include Penn class non-Aa, critical preoperative state, and coronary disease. Assessing discrimination, area under receiver operator characteristic curve for the LH scorecard was 0.61 versus 0.66 for the new scorecard (p = 0.086). In-hospital mortality rates in low-, medium-, and high-risk groups were 14%, 15%, and 48%, respectively (LH scorecard) versus 11%, 23%, and 43%, respectively (new scorecard), and goodness-of-fit p value was 0.01 versus 0.86, indicating better calibration by the new scorecard. A lower Akaike information criterion value, 464 versus 448, favored the new scorecard. Through adjustment of the LH scorecard after external validation, prognostic performance improved. Further validated, the LH scorecard could be a valuable risk prediction tool. Copyright © 2017 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
49 CFR 192.459 - External corrosion control: Examination of buried pipeline when exposed.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 3 2011-10-01 2011-10-01 false External corrosion control: Examination of buried... Requirements for Corrosion Control § 192.459 External corrosion control: Examination of buried pipeline when... portion must be examined for evidence of external corrosion if the pipe is bare, or if the coating is...
49 CFR 192.459 - External corrosion control: Examination of buried pipeline when exposed.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 49 Transportation 3 2010-10-01 2010-10-01 false External corrosion control: Examination of buried... Requirements for Corrosion Control § 192.459 External corrosion control: Examination of buried pipeline when... portion must be examined for evidence of external corrosion if the pipe is bare, or if the coating is...
49 CFR 192.459 - External corrosion control: Examination of buried pipeline when exposed.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 49 Transportation 3 2012-10-01 2012-10-01 false External corrosion control: Examination of buried... Requirements for Corrosion Control § 192.459 External corrosion control: Examination of buried pipeline when... portion must be examined for evidence of external corrosion if the pipe is bare, or if the coating is...
49 CFR 192.459 - External corrosion control: Examination of buried pipeline when exposed.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 49 Transportation 3 2014-10-01 2014-10-01 false External corrosion control: Examination of buried... Requirements for Corrosion Control § 192.459 External corrosion control: Examination of buried pipeline when... portion must be examined for evidence of external corrosion if the pipe is bare, or if the coating is...
49 CFR 192.459 - External corrosion control: Examination of buried pipeline when exposed.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 49 Transportation 3 2013-10-01 2013-10-01 false External corrosion control: Examination of buried... Requirements for Corrosion Control § 192.459 External corrosion control: Examination of buried pipeline when... portion must be examined for evidence of external corrosion if the pipe is bare, or if the coating is...
Tomoaia-Cotisel, Andrada; Scammon, Debra L; Waitzman, Norman J; Cronholm, Peter F; Halladay, Jacqueline R; Driscoll, David L; Solberg, Leif I; Hsu, Clarissa; Tai-Seale, Ming; Hiratsuka, Vanessa; Shih, Sarah C; Fetters, Michael D; Wise, Christopher G; Alexander, Jeffrey A; Hauser, Diane; McMullen, Carmit K; Scholle, Sarah Hudson; Tirodkar, Manasi A; Schmidt, Laura; Donahue, Katrina E; Parchman, Michael L; Stange, Kurt C
2013-01-01
We aimed to advance the internal and external validity of research by sharing our empirical experience and recommendations for systematically reporting contextual factors. Fourteen teams conducting research on primary care practice transformation retrospectively considered contextual factors important to interpreting their findings (internal validity) and transporting or reinventing their findings in other settings/situations (external validity). Each team provided a table or list of important contextual factors and interpretive text included as appendices to the articles in this supplement. Team members identified the most important contextual factors for their studies. We grouped the findings thematically and developed recommendations for reporting context. The most important contextual factors sorted into 5 domains: (1) the practice setting, (2) the larger organization, (3) the external environment, (4) implementation pathway, and (5) the motivation for implementation. To understand context, investigators recommend (1) engaging diverse perspectives and data sources, (2) considering multiple levels, (3) evaluating history and evolution over time, (4) looking at formal and informal systems and culture, and (5) assessing the (often nonlinear) interactions between contextual factors and both the process and outcome of studies. We include a template with tabular and interpretive elements to help study teams engage research participants in reporting relevant context. These findings demonstrate the feasibility and potential utility of identifying and reporting contextual factors. Involving diverse stakeholders in assessing context at multiple stages of the research process, examining their association with outcomes, and consistently reporting critical contextual factors are important challenges for a field interested in improving the internal and external validity and impact of health care research.
2012-01-01
Background: There is a need for more Comparative Effectiveness Research (CER) to strengthen the evidence base for clinical and policy decision-making. Effectiveness Guidance Documents (EGD) are targeted to clinical researchers. The aim of this EGD is to provide specific recommendations for the design of prospective acupuncture studies to support optimal use of resources for generating evidence that will inform stakeholder decision-making. Methods: Document development based on multiple systematic consensus procedures (written Delphi rounds, interactive consensus workshop, international expert review). To balance aspects of internal and external validity, multiple stakeholders including patients, clinicians and payers were involved. Results: Recommendations focused mainly on randomized studies and were developed for the following areas: overall research strategy, treatment protocol, expertise and setting, outcomes, study design and statistical analyses, economic evaluation, and publication. Conclusion: The present EGD, based on an international consensus developed with multiple stakeholder involvement, provides the first systematic methodological guidance for future CER on acupuncture. PMID:22953730
2012-01-01
Background: Facilitation is emerging as an important strategy in the uptake of evidence. However, it is not entirely clear from a practical perspective how facilitation occurs to help move research evidence into nursing practice. The Canadian Partnership Against Cancer, also known as the 'Partnership,' is a Pan-Canadian initiative supporting knowledge translation activity for improved care through guideline use. In this case-series study, five self-identified groups volunteered to use a systematic methodology to adapt existing clinical practice guidelines for Canadian use. With 'Partnership' support, local and external facilitators provided assistance for groups to begin the process by adapting the guidelines and planning for implementation. Methods: To gain a more comprehensive understanding of the nature of facilitation, we conducted a mixed-methods study. Specifically, we examined the role and skills of individuals actively engaged in facilitation as well as the actual facilitation activities occurring within the 'Partnership.' The study was driven by and builds upon a focused literature review published in 2010 that examined facilitation as a role and process in achieving evidence-based practice in nursing. An audit tool outlining 46 discrete facilitation activities based on results of this review was used to examine the facilitation noted in the documents (emails, meeting minutes, field notes) of three nursing-related cases participating in the 'Partnership' case-series study. To further examine the concept, six facilitators were interviewed about their practical experiences. The case-audit data were analyzed through a simple content analysis and triangulated with participant responses from the focus group interview to understand what occurred as these cases undertook guideline adaptation. Results: The analysis of the three cases revealed that almost all of the 46 discrete, practical facilitation activities from the literature were evidenced.
Additionally, case documents exposed five other facilitation-related activities, and a combination of external and local facilitation was apparent. Individuals who were involved in the case or group adapting the guideline(s) also performed facilitation activities, both formally and informally, in conjunction with or in addition to appointed external and local facilitators. Conclusions: Facilitation of evidence-based practice is a multifaceted process and a team effort. Communication and relationship-building are key components. The practical aspects of facilitation explicated in this study validate what has been previously noted in the literature and expand what is known about facilitation process and activity. PMID:22309743
Paterson, Charlotte; Karatzias, Thanos; Dickson, Adele; Harper, Sean; Dougall, Nadine; Hutton, Paul
2018-04-16
The effectiveness of psychological therapies for those receiving acute adult mental health inpatient care remains unclear, partly because of the difficulty in conducting randomized controlled trials (RCTs) in this setting. The aim of this meta-analysis was to synthesize evidence from all controlled trials of psychological therapy carried out with this group, to estimate its effects on a number of important outcomes and examine whether the presence of randomization and rater blinding moderated these estimates. A systematic review and meta-analysis of all controlled trials of psychological therapy delivered in acute inpatient settings was conducted, with a focus on psychotic symptoms, readmissions or emotional distress (anxiety and depression). Studies were identified through ASSIA, EMBASE, CINAHL, Cochrane, MEDLINE, and PsycINFO using a combination of the key terms 'inpatient', 'psychological therapy', and 'acute'. No restriction was placed on diagnosis. The moderating effect of the use of assessor-blind RCT methodology was examined via subgroup and sensitivity analyses. Overall, psychological therapy was associated with small-to-moderate improvements in psychotic symptoms at end of therapy but the effect was smaller and not significant at follow-up. Psychological therapy was also associated with reduced readmissions, depression, and anxiety. The use of single-blind randomized controlled trial methodology was associated with significantly reduced benefits on psychotic symptoms and was also associated with reduced benefits on readmission and depression; however, these reductions were not statistically significant. The provision of psychological therapy to acute psychiatric inpatients is associated with improvements; however, the use of single-blind RCT methodology was associated with reduced therapy-attributable improvements. Whether this is a consequence of increased internal validity or reduced external validity is unclear. 
Trials with both high internal and external validity are now required to establish what type, format, and intensity of brief psychological therapy is required to achieve sustained benefits. Clinical implications: This review provides the first meta-analytical synthesis of brief psychological therapy delivered in acute psychiatric inpatient settings. This review suggests that brief psychological therapy may be associated with reduced emotional distress and readmissions. The evidence in this review is of limited quality. The type, format, and intensity of brief psychological therapy required to achieve sustained benefits are yet to be established. © 2018 The British Psychological Society.
ERIC Educational Resources Information Center
Olino, Thomas M.; Seeley, John R.; Lewinsohn, Peter M.
2010-01-01
Conduct disorder (CD) is associated with a number of adverse psychosocial outcomes in adulthood. There is consistent evidence that CD is predictive of antisocial behavior, but mixed evidence that CD is predictive of other externalizing and internalizing disorders. Further, externalizing and internalizing disorders are often associated with similar…
Lindberg, Ann-Sofie; Oksa, Juha; Antti, Henrik; Malm, Christer
2015-01-01
Physical capacity has previously been deemed important for firefighters' physical work capacity, and aerobic fitness, muscular strength, and muscular endurance are the most frequently investigated parameters of importance. Traditionally, bivariate and multivariate linear regression statistics have been used to study relationships between physical capacities and work capacities among firefighters. An alternative way to handle datasets consisting of numerous correlated variables is to use multivariate projection analyses, such as Orthogonal Projection to Latent Structures. The first aim of the present study was to evaluate the prediction and predictive power of field and laboratory tests, respectively, on firefighters' physical work capacity on selected work tasks, and to study whether valid predictions could be achieved without anthropometric data. The second aim was to externally validate selected models. The third aim was to validate selected models on firefighters and on civilians. A total of 38 (26 men and 12 women) + 90 (38 men and 52 women) subjects were included in the models and the external validation, respectively. The best prediction (R2) and predictive power (Q2) of Stairs, Pulling, Demolition, Terrain, and Rescue work capacities included field tests (R2 = 0.73 to 0.84, Q2 = 0.68 to 0.82). The best external validation was for Stairs work capacity (R2 = 0.80) and worst for Demolition work capacity (R2 = 0.40). In conclusion, field and laboratory tests could equally well predict physical work capacities for firefighting work tasks, and models excluding anthropometric data were valid. The predictive power was satisfactory for all included work tasks except Demolition.
Oh, Ein; Yoo, Tae Keun; Park, Eun-Cheol
2013-09-13
Blindness due to diabetic retinopathy (DR) is the major disability in diabetic patients. Although early management has been shown to prevent vision loss, diabetic patients have a low rate of routine ophthalmologic examination. Hence, we developed and validated sparse learning models with the aim of identifying the risk of DR in diabetic patients. Health records from the Korea National Health and Nutrition Examination Surveys (KNHANES) V-1 were used. The prediction models for DR were constructed using data from 327 diabetic patients, and were validated internally on 163 patients in the KNHANES V-1. External validation was performed using 562 diabetic patients in the KNHANES V-2. The learning models, including ridge, elastic net, and LASSO, were compared to the traditional indicators of DR. Considering the Bayesian information criterion, LASSO predicted DR most efficiently. In the internal and external validation, LASSO was significantly superior to the traditional indicators, as judged by the area under the curve (AUC) of the receiver operating characteristic. LASSO showed an AUC of 0.81 and an accuracy of 73.6% in the internal validation, and an AUC of 0.82 and an accuracy of 75.2% in the external validation. The sparse learning model using LASSO was effective in analyzing the underlying epidemiological patterns of DR. This is the first study to develop a machine learning model to predict DR risk using health records. LASSO can be an excellent choice when both discriminative power and variable selection are important in the analysis of high-dimensional electronic health records.
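The two validation metrics reported above, AUC and accuracy, can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example. AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data (not from the study): 1 = DR present, 0 = DR absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores), accuracy(toy_labels, toy_scores))
```

Applied separately to an internal hold-out set and to an independent external cohort, the same two functions would give the kind of paired figures quoted in the abstract.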
Oliveira, Flavia C C; Brandão, Christian R R; Ramalho, Hugo F; da Costa, Leonardo A F; Suarez, Paulo A Z; Rubim, Joel C
2007-03-28
In this work it has been shown that the routine ASTM methods (ASTM 4052, ASTM D 445, ASTM D 4737, ASTM D 93, and ASTM D 86) recommended by the ANP (the Brazilian National Agency for Petroleum, Natural Gas and Biofuels) to determine the quality of diesel/biodiesel blends are not suitable to prevent the adulteration of B2 or B5 blends with vegetable oils. Considering the past and current problems with fuel adulteration in Brazil, we have investigated the application of vibrational spectroscopy (Fourier transform (FT) near infrared spectrometry and FT-Raman) to identify adulterations of B2 and B5 blends with vegetable oils. Partial least squares regression (PLS), principal component regression (PCR), and artificial neural network (ANN) calibration models were designed and their relative performances were evaluated by external validation using the F-test. The PCR, PLS, and ANN calibration models based on FT near infrared spectrometry and FT-Raman spectroscopy were designed using 120 samples. Another 62 samples were used in the validation and external validation, for a total of 182 samples. The results have shown that among the designed calibration models, the ANN/FT-Raman presented the best accuracy (0.028%, w/w) for samples used in the external validation.
Hamadache, Mabrouk; Benkortbi, Othmane; Hanini, Salah; Amrane, Abdeltif; Khaouane, Latifa; Si Moussa, Cherif
2016-02-13
Quantitative Structure Activity Relationship (QSAR) models are expected to play an important role in the risk assessment of chemicals on humans and the environment. In this study, we developed a validated QSAR model to predict acute oral toxicity of 329 pesticides to rats because few QSAR models have been devoted to predicting the Lethal Dose 50 (LD50) of pesticides on rats. This QSAR model is based on 17 molecular descriptors, and is robust, externally predictive and characterized by a good applicability domain. The best results were obtained with a 17/9/1 Artificial Neural Network model trained with the Quasi Newton back propagation (BFGS) algorithm. The prediction accuracy for the external validation set was estimated by the Q(2)ext and the root mean square error (RMS) which are equal to 0.948 and 0.201, respectively. 98.6% of the external validation set is correctly predicted and the present model proved to be superior to models previously published. Accordingly, the model developed in this study provides excellent predictions and can be used to predict the acute oral toxicity of pesticides, particularly for those that have not been tested as well as new pesticides. Copyright © 2015 Elsevier B.V. All rights reserved.
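The external-validation statistics named above can be illustrated with a short sketch. The abstract does not state which variant of Q2ext was used, so the version below is an assumption: it compares the prediction error on the external set with the spread of the external observations around the training-set mean. The function names and the toy numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def q2_ext(observed, predicted, training_mean):
    """External Q2 (assumed variant): 1 - PRESS / SS around the training mean."""
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss = sum((o - training_mean) ** 2 for o in observed)
    return 1.0 - press / ss


# Hypothetical external set (not data from the study).
obs = [1.0, 2.0, 3.0]
pred = [1.0, 2.0, 4.0]
print(q2_ext(obs, pred, training_mean=2.0), rmse(obs, pred))
```

A value of Q2ext close to 1 together with a small RMSE, as reported in the abstract, indicates that predictions for compounds outside the training set track the observed values closely.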
Sveen, Unni; Andelic, Nada; Bautz-Holter, Erik; Røe, Cecilie
2015-01-01
To evaluate the psychometric properties of the Norwegian version of the Patient Competency Rating Scale (PCRS) in patients with traumatic brain injury (TBI) at 12 months post-injury. Demographic and injury-related data were registered upon admission to the hospital in 148 TBI patients with mild, moderate, or severe TBI. At 12 months post-injury, competency in activities and global functioning were measured using the PCRS patient version and the Glasgow Outcome Scale-Extended (GOSE). Descriptive reliability statistics, factor analysis and Rasch modeling were applied to explore the psychometric properties of the PCRS. External validity was evaluated using the GOSE. The PCRS can be divided into three subscales that reflect interpersonal/emotional, cognitive, and activities of daily living competency. The three-factor solution explained 56.6% of the variance in functioning. The internal consistency was very good, with a Cronbach's α of 0.95. Item 30, "controlling my laughter", did not load above 0.40 on any factors and did not fit the Rasch model. The external validity of the subscales was acceptable, with correlations between 0.50 and 0.52 with the GOSE. The Norwegian version of the PCRS is reliable, has an acceptable construct and external validity, and can be recommended for use during the later phases of TBI.
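The internal-consistency figure quoted above is Cronbach's α. As a minimal sketch, assuming a complete respondents-by-items score matrix with no missing values, α can be computed from the item variances and the variance of the total score. The function name and the toy matrix are illustrative assumptions and not PCRS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                     # number of items
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*rows))             # transpose to per-item score columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Hypothetical toy matrix: three respondents, three perfectly consistent items.
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(cronbach_alpha(toy))
```

Values near 1 indicate that the items vary together across respondents, which is the sense in which the abstract describes an α of 0.95 as very good.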
Janssen, Daniël M C; van Kuijk, Sander M J; d'Aumerie, Boudewijn B; Willems, Paul C
2018-05-16
A prediction model for surgical site infection (SSI) after spine surgery was developed in 2014 by Lee et al. This model was developed to compute an individual estimate of the probability of SSI after spine surgery based on the patient's comorbidity profile and invasiveness of surgery. Before any prediction model can be validly implemented in daily medical practice, it should be externally validated to assess how the prediction model performs in patients sampled independently from the derivation cohort. We included 898 consecutive patients who underwent instrumented thoracolumbar spine surgery. Overall performance was quantified using Nagelkerke's R2 statistic, and discriminative ability was quantified as the area under the receiver operating characteristic curve (AUC). We computed the slope of the calibration plot to judge prediction accuracy. Sixty patients developed an SSI. The overall performance of the prediction model in our population was poor: Nagelkerke's R2 was 0.01. The AUC was 0.61 (95% confidence interval (CI) 0.54-0.68). The estimated slope of the calibration plot was 0.52. The previously published prediction model showed poor performance in our academic external validation cohort. To predict SSI after instrumented thoracolumbar spine surgery for the present population, a better fitting prediction model should be developed.
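Nagelkerke's R2, used above as the overall performance measure, rescales the Cox and Snell R2 so that a perfect model scores 1. The sketch below assumes that the log-likelihoods of the intercept-only model and of the fitted model are already available; the function name and the example values are hypothetical and are not taken from the study.

```python
import math


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke's R2 from null and model log-likelihoods for n observations."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for a cohort of 100 patients.
print(nagelkerke_r2(-50.0, -49.5, 100))
```

A model that barely improves on the intercept-only log-likelihood gives a value near 0, which is how a result such as the 0.01 reported in the abstract should be read.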
Pat, Lucio; Ali, Bassam; Guerrero, Armando; Córdova, Atl V.; Garduza, José P.
2016-01-01
Attenuated total reflectance-Fourier transform infrared spectrometry combined with chemometric modelling was used to determine physicochemical properties (pH, redox potential, free acidity, electrical conductivity, moisture, total soluble solids (TSS), ash, and HMF) in honey samples. The reference values of 189 honey samples of different botanical origin were determined using the methods of the Association of Official Analytical Chemists (AOAC, 1990), the Codex Alimentarius (2001), and the International Honey Commission (2002). Multivariate calibration models were built using partial least squares (PLS) for the measurands studied. The developed models were validated using cross-validation and external validation; several statistical parameters were obtained to determine the robustness of the calibration models: the optimum number of principal components (PCs), the standard error of cross-validation (SECV), the coefficient of determination of cross-validation (R2 cal), the standard error of validation (SEP), the coefficient of determination for external validation (R2 val), and the coefficient of variation (CV). The prediction accuracy for pH, redox potential, electrical conductivity, moisture, TSS, and ash was good, while for free acidity and HMF it was poor. The results demonstrate that attenuated total reflectance-Fourier transform infrared spectrometry is a valuable, rapid, and nondestructive tool for the quantification of physicochemical properties of honey. PMID:28070445
Linley, Warren G; Hughes, Dyfrig A
2013-04-01
Few studies to date have explored the stated preferences of national decision makers for health technology adoption criteria, and none of these have compared stated decision-making behaviours against actual behaviours. Assessment of the external validity of stated preference studies, such as discrete-choice experiments (DCEs), remains an under-researched area. The primary aim was to explore the preferences of All Wales Medicines Strategy Group (AWMSG) appraisal committee and appraisal sub-committee (the New Medicines Group) members ('appraisal committees') for specific new medicines adoption criteria. Secondary aims were to explore the external validity of respondents' stated preferences and the impact of question choice options upon preference structures in DCEs. A DCE was conducted to estimate appraisal committees members' preferences for incremental cost effectiveness, quality-adjusted life-years (QALYs) gained, annual number of patients expected to be treated, the impact of the disease on patients before treatment, and the assessment of uncertainty in the economic evidence submitted for new medicines compared with current UK NHS treatment. Respondents evaluated 28 pairs of hypothetical new medicines, making a primary forced choice between each pair and a more flexible secondary choice, which permitted either, neither or both new medicines to be chosen. The performance of the resultant models was compared against previous AWMSG decisions. Forty-one out of a total of 80 past and present members of AWMSG appraisal committees completed the DCE. The incremental cost effectiveness of new medicines, and the QALY gains they provide, significantly (p < 0.0001) influence recommendations. 
Committee members were willing to accept higher incremental cost-effectiveness ratios and lower QALY gains for medicines that treat disease impacting primarily upon survival rather than quality of life, and where uncertainty in the cost-effectiveness estimates has been thoroughly explored. The number of patients to be treated by the new medicine did not exert a significant influence upon recommendations. The use of a flexible-choice question format revealed a different preference structure to the forced-choice format, but the performance of the two models was similar. Aggregate decisions of the AWMSG were well predicted by both models, but their sensitivity (64 %, 68 %) and specificity (55 %, 64 %) were limited. A willingness to trade the cost effectiveness and QALY gains against other factors indicates that economic efficiency and QALY maximisation are not the only considerations of committee members when making recommendations on the use of medicines in Wales. On average, appraisal committee members' stated preferences appear consistent with their actual decision-making behaviours, providing support for the external validity of our DCEs. However, as health technology assessment involves complex decision-making processes, and each individual recommendation may be influenced to varying degrees by a multitude of different considerations, the ability of our models to predict individual medicine recommendations is more limited.
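The sensitivity and specificity figures quoted above compare model-predicted recommendations with the committee's actual decisions. A minimal sketch of that comparison follows; the coding (1 = recommended, 0 = not recommended), the function name, and the toy decisions are assumptions for illustration and are not AWMSG data.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary outcomes coded 1 and 0."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical decisions: actual committee outcome vs. model prediction.
actual_decisions = [1, 1, 1, 0, 0]
model_predictions = [1, 1, 0, 0, 1]
print(sensitivity_specificity(actual_decisions, model_predictions))
```

Sensitivity is the share of actual positive recommendations that the model reproduces, and specificity is the share of actual negative recommendations that it reproduces.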
Blaser, Klaus; Zlabinger, Milena; Hinterberger, Thilo
2014-01-01
The Interpersonal Attention Management Inventory (IAMI) represents a new instrument to capture self- and external perception skills. The underlying theoretical model assumes 3 mental locations of attention (the intrapersonal space, the extrapersonal space, and the external intrapersonal space) of the other. The IAMI was studied regarding its factor structure; it was shortened and statistical values as well as first reference values were calculated based on a larger sample (n = 1089). By factor analysis, the superordinate scales could be widely validated. The shortened version with 31 items and 3 superordinate scales shows a high reliability of the global value (Cronbach's α = 0.81) and, regarding the convergent validity, a modest correlation (r = 0.41) of the global value and mindfulness, measured with the Freiburg Mindfulness Inventory (FMI). Further validation studies are invited so that the IAMI can be used as an instrument for (course) diagnosis in the therapy of psychiatric disorders as well as for research in social neuroscience, e.g., in investigations on mindfulness, compassion, empathy, theory of mind, and self-boundaries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coleman, Justin Leigh; Smith, Curtis Lee; Burns, Douglas Edward
This report describes the development plan for a new multi-partner External Hazards Experimental Group (EHEG) coordinated by Idaho National Laboratory (INL) within the Risk-Informed Safety Margin Characterization (RISMC) technical pathway of the Light Water Reactor Sustainability Program. Currently, there is limited data available for development and validation of the tools and methods being developed in the RISMC Toolkit. The EHEG is being developed to obtain high-quality, small- and large-scale experimental data validation of RISMC tools and methods in a timely and cost-effective way. The group of universities and national laboratories that will eventually form the EHEG (which is ultimately expected to include both the initial participants and other universities and national laboratories that have been identified) have the expertise and experimental capabilities needed to both obtain and compile existing data archives and perform additional seismic and flooding experiments. The data developed by EHEG will be stored in databases for use within RISMC. These databases will be used to validate the advanced external hazard tools and methods.
Strategies for Validating and Directions for Employing SMOS Data, in the Cal-Val Project SWEX (3275)
NASA Astrophysics Data System (ADS)
Marczewski, Wojciech; Usowicz, Boguslaw; Usowicz, Jerzy; Romanov, Sergey; Maryskevych, Oksana; Nastula, Jolanta; Slominski, Jan; Zawadzki, Jaroslaw
2009-11-01
The land surface observed from space is naturally diverse in its physical and biophysical properties. SMOS retrieval of SM (Soil Moisture) depends heavily on appropriate external physical and environmental data, because SM is retrieved from the directly observable BT (Brightness Temperature) on the basis of these external data. In this way, SMOS performs a real data fusion in NRT (Near Real Time) and therefore needs validating. The global range of SMOS observations means that this diversity must be generalized in a complex way, engaging technical, modelling, and organizational means. This is a new quality in EO (Earth Observation) with respect to managing the diversity of the target. The paper presents several demonstrations of employing external data by means of the SMOS software tools, for the L1c and L2 data levels. The authors undertake validation at a few selected sites in Poland and describe their strategy for employing external data from ASAR, MERIS, and other auxiliary sources. Finally, the conclusions address how SMOS data can be used and seek ways of referencing SM at large scales to known results of the gravitational mission GRACE.
Mindfulness: A systematic review of instruments to measure an emergent patient-reported outcome (PRO)
Park, Taehwan; Reilly-Spong, Maryanne
2013-01-01
Purpose Mindfulness has emerged as an important health concept based on evidence that mindfulness interventions reduce symptoms and improve health-related quality of life. The objectives of this study were to systematically assess and compare the properties of instruments to measure self-reported mindfulness. Methods Ovid Medline®, CINAHL®, and PsycINFO® were searched through May 2012, and articles were selected if their primary purpose was development or evaluation of the measurement properties (validity, reliability, responsiveness) of a self-report mindfulness scale. Two reviewers independently evaluated the methodological quality of the selected studies using the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) checklist. Discrepancies were discussed with a third reviewer, and scored by consensus. Finally, a level of evidence approach was used to synthesize results and study quality. Results Our search strategy identified a total of 2,588 articles. Forty-six articles, reporting 79 unique studies, met inclusion criteria. Ten instruments quantifying mindfulness as a unidimensional scale (n=5) or as a set of 2 to 5 subscales (n=5) were reviewed. The Mindful Attention Awareness Scale (MAAS) was evaluated by the most studies (n=27), and had positive overall quality ratings for most of the psychometric properties reviewed. The Five Facet Mindfulness Questionnaire (FFMQ) received the highest possible rating (“consistent findings in multiple studies of good methodological quality”) for two properties, internal consistency and construct validation by hypothesis testing. However, none of the instruments had sufficient evidence of content validity. Comprehensiveness of construct coverage had not been assessed; qualitative methods to confirm understanding and relevance were absent. 
In addition, estimates of test-retest reliability, responsiveness, or measurement error to guide users in protocol development or interpretation of scores were lacking. Conclusions Current mindfulness scales have important conceptual differences, and none can be strongly recommended based solely on superior psychometric properties. Important limitations in the field are the absence of qualitative evaluations and accepted external referents to support construct validity. Investigators need to proceed cautiously before optimizing any mindfulness intervention based on the existing scales. PMID:23539467
Validation of the measure automobile emissions model : a statistical analysis
DOT National Transportation Integrated Search
2000-09-01
The Mobile Emissions Assessment System for Urban and Regional Evaluation (MEASURE) model provides an external validation capability for hot stabilized option; the model is one of several new modal emissions models designed to predict hot stabilized e...
LDR vs. HDR brachytherapy for localized prostate cancer: the view from radiobiological models.
King, Christopher R
2002-01-01
Permanent LDR brachytherapy and temporary HDR brachytherapy are competitive techniques for clinically localized prostate radiotherapy. Although a randomized trial will likely never be conducted comparing these two forms of brachytherapy, a comparative radiobiological modeling analysis proves useful in understanding some of their intrinsic differences, several of which could be exploited to improve outcomes. Radiobiological models based upon the linear quadratic equations are presented for fractionated external beam, fractionated (192)Ir HDR brachytherapy, and (125)I and (103)Pd LDR brachytherapy. These models incorporate the dose heterogeneities present in brachytherapy based upon patient-derived dose volume histograms (DVH) as well as tumor doubling times and repair kinetics. Radiobiological parameters are normalized to correspond to three accepted clinical risk factors based upon T-stage, PSA, and Gleason score to compare models with clinical series. Tumor control probabilities (TCP) for LDR and HDR brachytherapy (as monotherapy or combined with external beam) are compared with clinical bNED survival rates. Predictions are made for dose escalation with HDR brachytherapy regimens. Model predictions for dose escalation with external beam agree with clinical data and validate the models and their underlying assumptions. Both LDR and HDR brachytherapy achieve superior tumor control when compared with external beam at conventional doses (<70 Gy), and tumor control similar to that reported in dose escalation series. LDR brachytherapy used as a boost achieves better tumor control than LDR brachytherapy used as monotherapy. Stage for stage, both LDR and current HDR regimens achieve similar tumor control rates, in agreement with current clinical data. HDR monotherapy with large-dose fraction sizes might achieve superior tumor control compared with LDR, especially if prostate cancer possesses a high sensitivity to dose fractionation (i.e., if the alpha/beta ratio is low). 
Radiobiological models support the current clinical evidence for equivalent outcomes in localized prostate cancer with either LDR or HDR brachytherapy using current dose regimens. However, HDR brachytherapy dose escalation regimens might be able to achieve higher biologically effective doses of irradiation in comparison to LDR, and hence improved outcomes. This advantage over LDR would be amplified should prostate cancer possess a high sensitivity to dose fractionation (i.e., a low alpha/beta ratio) as the current evidence suggests.
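The dependence on the alpha/beta ratio described above can be illustrated with the basic linear-quadratic expression for biologically effective dose, BED = n * d * (1 + d / (alpha/beta)). This is only a sketch of the simplest fractionated form: it ignores the repopulation, repair kinetics, and dose heterogeneity that the abstract says the full models include, and the fraction schedules and alpha/beta values below are hypothetical illustrations, not regimens from the paper.

```python
def biologically_effective_dose(n_fractions, dose_per_fraction, alpha_beta):
    """Basic linear-quadratic BED in Gy for n equal fractions of size d (Gy)."""
    total_dose = n_fractions * dose_per_fraction
    return total_dose * (1.0 + dose_per_fraction / alpha_beta)


# Hypothetical schedules: many small fractions vs. a few large fractions.
for alpha_beta in (10.0, 1.5):
    small = biologically_effective_dose(35, 2.0, alpha_beta)
    large = biologically_effective_dose(4, 9.5, alpha_beta)
    print(alpha_beta, round(small, 1), round(large, 1))
```

With a high alpha/beta ratio the schedule with many small fractions gives the larger BED, whereas with a low ratio the few large fractions give the larger BED, which is the effect the abstract refers to when it says the HDR advantage would be amplified by a low alpha/beta ratio.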
Markopoulou, Catherine K; Kouskoura, Maria G; Koundourellis, John E
2011-06-01
Twenty-five descriptors and 61 structurally different analytes were used in a partial least squares (PLS) projection to latent structures technique in order to study their chromatographic interaction mechanism on a phenyl column. According to the model, 240 different retention times of the analytes, expressed as the Y variable (log k), at different % MeOH mobile-phase concentrations have been correlated with their theoretically most important structural or molecular descriptors. The goodness-of-fit was estimated by the coefficient of multiple determination r(2) (0.919) and the root mean square error of estimation (RMSEE=0.1283), with a predictive ability (Q(2)) of 0.901. The model was further validated using cross-validation (CV), validated by 20 response permutations r(2) (0.0, 0.0146), Q(2) (0.0, -0.136) and validated by external prediction. The contribution of certain interaction mechanisms between the analytes, the mobile phase and the column, whether proportional or counterbalancing, is also studied. For evaluating the influence on Y of every variable in the PLS model, the VIP (variable importance in the projection) plot provides evidence that lipophilicity (expressed as Log D, Log P), polarizability, refractivity and the eluting power of the mobile phase are dominant in the retention mechanism on a phenyl column. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cheung, Y; Sawant, A
Purpose: Most clinically-deployed strategies for respiratory motion management in lung radiotherapy (e.g., gating, tracking) use external markers that serve as surrogates for tumor motion. However, typical lung phantoms used to validate these strategies are rigid-exterior+rigid-interior or rigid-exterior+deformable-interior. Neither class adequately represents the human anatomy, which is deformable internally as well as externally. We describe the construction and experimental validation of a more realistic, externally- and internally-deformable, programmable lung phantom. Methods: The outer shell of a commercially-available lung phantom (RS-1500, RSD Inc.) was used. The shell consists of a chest cavity with a flexible anterior surface, and embedded vertebrae, rib-cage and sternum. A 3-axis platform was programmed with sinusoidal and six patient-recorded lung tumor trajectories. The platform was used to drive a rigid foam 'diaphragm' that compressed/decompressed the phantom interior. Experimental characterization comprised mapping the superior-inferior (SI) and anterior-posterior (AP) trajectories of external and internal radiopaque markers with kV x-ray fluoroscopy and correlating these with optical surface monitoring using the in-room VisionRT system. Results: The phantom correctly reproduced the programmed motion as well as realistic effects such as hysteresis. The reproducibility of marker trajectories over multiple runs for sinusoidal as well as patient traces, as characterized by fluoroscopy, was within 0.4 mm RMS error for internal as well as external markers. The motion trajectories of internal and external markers as measured by fluoroscopy were found to be highly correlated (R=0.97). Furthermore, motion trajectories of arbitrary points on the deforming phantom surface, as recorded by the VisionRT system, also showed a high correlation with respect to the fluoroscopically-measured trajectories of internal markers (R=0.92). 
Conclusion: We have developed a realistic externally- and internally-deformable lung phantom that will serve as a valuable tool for clinical QA and motion management research. This work was supported through funding from the NIH and VisionRT Ltd. Amit Sawant has research funding from Varian Medical Systems, VisionRT and Elekta.
Crouse, Cecelia A; Yeung, Stephanie; Greenspoon, Susan; McGuckian, Amy; Sikorsky, Julie; Ban, Jeff; Mathies, Richard
2005-08-01
To present validation studies performed for the implementation of existing and new technologies to increase the efficiency in the forensic DNA Section of the Palm Beach County Sheriff's Office (PBSO) Crime Laboratory. Using federally funded grants, internal support, and an external Process Mapping Team, the PBSO collaborated with forensic vendors, universities, and other forensic laboratories to enhance DNA testing procedures, including validation of the DNA IQ magnetic bead extraction system, robotic DNA extraction using the BioMek2000, and the ABI7000 Sequence Detection System; the PBSO is currently evaluating a micro Capillary Array Electrophoresis device. The PBSO successfully validated and implemented both manual and automated Promega DNA IQ magnetic bead extraction systems, which have increased DNA profile results from samples with low DNA template concentrations. The Beckman BioMek2000 DNA robotic workstation has been validated for blood, tissue, bone, hair, epithelial cells (touch evidence), and mixed stains such as semen. There has been a dramatic increase in the number of samples tested per case since implementation of the robotic extraction protocols. The validation of the ABI7000 real-time quantitative polymerase chain reaction (qPCR) technology and the single multiplex short tandem repeat (STR) PowerPlex16 BIO amplification system has provided both a time and a financial benefit. In addition, the qPCR system provides more accurate DNA concentration data, and the PowerPlex 16 BIO multiplex generates DNA profile data in half the time when compared to the PowerPlex1.1 and PowerPlex2.1 STR systems. The PBSO's future efficiency requirements are being addressed through collaboration with the University of California at Berkeley and the Virginia Division of Forensic Science to validate microcapillary array electrophoresis instrumentation. Initial data demonstrated the electrophoresis of 96 samples in less than twenty minutes. 
The PBSO demonstrated, through the validation of more efficient extraction and quantification technology, an increase in the number of evidence samples tested using robotic/DNA IQ magnetic bead DNA extraction, a decrease in the number of negative samples amplified due to qPCR and implementation of a single multiplex amplification system. In addition, initial studies show the microcapillary array electrophoresis device (microCAE) evaluation results provide greater sensitivity and faster STR analysis output than current platforms.
2014-01-01
Introduction Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. Methods An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). Results The original training cohort reached a statistically significant difference (p < 0.05) in disease-free survivals between the three CMTC groups after an additional two years of follow-up (median = 55 months). The prognostic value of the triad classification was reproduced in the second independent internal cohort and the new external validation cohort. CMTC achieved even higher prognostic significance when all available patients were analyzed (n = 4,851). Oncogenic pathways Myc, E2F1, Ras and β-catenin were again implicated in the high-risk groups. Conclusions Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments. 
PMID:24996446
Validation of educational assessments: a primer for simulation and beyond.
Cook, David A; Hatala, Rose
2016-01-01
Simulation plays a vital role in health professions assessment. This review provides a primer on assessment validation for educators and education researchers. We focus on simulation-based assessment of health professionals, but the principles apply broadly to other assessment approaches and topics. Validation refers to the process of collecting validity evidence to evaluate the appropriateness of the interpretations, uses, and decisions based on assessment results. Contemporary frameworks view validity as a hypothesis, and validity evidence is collected to support or refute the validity hypothesis (i.e., that the proposed interpretations and decisions are defensible). In validation, the educator or researcher defines the proposed interpretations and decisions, identifies and prioritizes the most questionable assumptions in making these interpretations and decisions (the "interpretation-use argument"), empirically tests those assumptions using existing or newly-collected evidence, and then summarizes the evidence as a coherent "validity argument." A framework proposed by Messick identifies potential evidence sources: content, response process, internal structure, relationships with other variables, and consequences. Another framework proposed by Kane identifies key inferences in generating useful interpretations: scoring, generalization, extrapolation, and implications/decision. We propose an eight-step approach to validation that applies to either framework: Define the construct and proposed interpretation, make explicit the intended decision(s), define the interpretation-use argument and prioritize needed validity evidence, identify candidate instruments and/or create/adapt a new instrument, appraise existing evidence and collect new evidence as needed, keep track of practical issues, formulate the validity argument, and make a judgment: does the evidence support the intended use? 
Rigorous validation first prioritizes and then empirically evaluates key assumptions in the interpretation and use of assessment scores. Validation science would be improved by more explicit articulation and prioritization of the interpretation-use argument, greater use of formal validation frameworks, and more evidence informing the consequences and implications of assessment.
External validity of post-stroke interventional gait rehabilitation studies.
Kafri, Michal; Dickstein, Ruth
2017-01-01
Gait rehabilitation is a major component of stroke rehabilitation, and is supported by extensive research. The objective of this review was to examine the external validity of intervention studies aimed at improving gait in individuals post-stroke. To that end, two aspects of these studies were assessed: subjects' exclusion criteria and the ecological validity of the intervention, as manifested by the intervention's technological complexity and delivery setting. Additionally, we examined whether the target population as inferred from the titles/abstracts is broader than the population actually represented by the reported samples. We systematically searched PubMed for intervention studies to improve gait post-stroke, working backwards from the beginning of 2014. Exclusion criteria, the technological complexity of the intervention (defined as either elaborate or simple), setting, and description of the target population in the titles/abstracts were recorded. Fifty-two studies were reviewed. The samples were exclusive, with recurrent stroke, co-morbidities, cognitive status, walking level, and residency being major reasons for exclusion. In one half of the studies, the intervention was elaborate. Descriptions of participants in the title/abstract in almost one half of the studies included only the diagnosis (stroke or comparable terms) and its stage (acute, subacute, and chronic). The external validity of a substantial number of intervention studies about rehabilitation of gait post-stroke appears to be limited by exclusivity of the samples as well as by deficiencies in ecological validity of the interventions. These limitations are not accurately reflected in the titles or abstracts of the studies.
Saavedra Salinas, Miguel Ángel; Barrera Cruz, Antonio; Cabral Castañeda, Antonio Rafael; Jara Quezada, Luis Javier; Arce-Salinas, C Alejandro; Álvarez Nemegyei, José; Fraga Mouret, Antonio; Orozco Alcalá, Javier; Salazar Páramo, Mario; Cruz Reyes, Claudia Verónica; Andrade Ortega, Lilia; Vera Lastra, Olga Lidia; Mendoza Pinto, Claudia; Sánchez González, Antonio; Cruz Cruz, Polita Del Rocío; Morales Hernández, Sara; Portela Hernández, Margarita; Pérez Cristóbal, Mario; Medina García, Gabriela; Hernández Romero, Noé; Velarde Ochoa, María Del Carmen; Navarro Zarza, José Eduardo; Portillo Díaz, Verónica; Vargas Guerrero, Angélica; Goycochea Robles, María Victoria; García Figueroa, José Luis; Barreira Mercado, Eduardo; Amigo Castañeda, Mary Carmen
2015-01-01
Pregnancy in women with autoimmune rheumatic diseases is associated with several maternal and fetal complications. The development of clinical practice guidelines with the best available scientific evidence may help standardize the care of these patients. To provide recommendations regarding prenatal care, treatment, and a more effective monitoring of pregnancy in women with lupus erythematosus, rheumatoid arthritis (RA) and antiphospholipid syndrome (APS). Nominal panels were formed for consensus, systematic search of information, development of clinical questions, processing and staging of recommendations, internal validation by peers and external validation of the final document. The quality criteria of the AGREE II instrument were followed. The panels answered 37 questions related to maternal and fetal care in lupus erythematosus, RA and APS, as well as for use of antirheumatic drugs during pregnancy and lactation. The recommendations were discussed and integrated into a final manuscript. Finally, the corresponding algorithms were developed. In this second part, the recommendations for pregnant women with RA, APS and the use of antirheumatic drugs during pregnancy and lactation are presented. We believe that the Mexican clinical practice guidelines for the management of pregnancy in women with RA and APS integrate the best available evidence for the treatment and follow-up of patients with these conditions. Copyright © 2014 Elsevier España, S.L.U. All rights reserved.
Narendra, P L; Hegde, Harihar V; Vijaykumar, T K; Nallamilli, Samson
2015-01-01
Betel quid is used by 10-20% of the world's population. Oral submucous fibrosis (OSF) is a chronic premalignant disease common in South Asian countries where betel quid is chewed. It is characterized by juxtaepithelial fibrosis of the oral cavity and limited mouth opening, which can cause difficult intubation. A recent study in Taiwan found that long-term betel nut chewing is not a predictor of difficult intubation. We describe two cases of OSF and critically analyze this study and its implications for clinical practice. OSF is now seen in Saudi Arabia and Western countries with the use of commercial betel quid substitutes. Although betel quid without tobacco is used in Taiwan, available evidence suggests rapid and early development of OSF in India, where commercial chewing products like Pan Masala are used. Effects of betel quid may vary depending on the composition of the quid and chewing habits. Studies where personal habits are involved must be analyzed carefully for external validity. Even though the Taiwan study is controlled, its validity outside Taiwan is highly questionable. Since OSF can cause unanticipated difficult intubation, a history of betel quid chewing, and more importantly of the use of commercial chewing products, should be sought during preanesthetic assessment, as it is likely to give clues to the severity of OSF and possible difficult intubation. Further controlled trials in populations where commercial chewing products are used are necessary to detect an association between chewing habits and difficult intubation.
Lardas, Michael; Liew, Matthew; van den Bergh, Roderick C; De Santis, Maria; Bellmunt, Joaquim; Van den Broeck, Thomas; Cornford, Philip; Cumberbatch, Marcus G; Fossati, Nicola; Gross, Tobias; Henry, Ann M; Bolla, Michel; Briers, Erik; Joniau, Steven; Lam, Thomas B; Mason, Malcolm D; Mottet, Nicolas; van der Poel, Henk G; Rouvière, Olivier; Schoots, Ivo G; Wiegel, Thomas; Willemse, Peter-Paul M; Yuan, Cathy Yuhong; Bourke, Liam
2017-12-01
Current evidence-based management for clinically localised prostate cancer includes active surveillance, surgery, external beam radiotherapy (EBRT) and brachytherapy. The impact of these treatment modalities on quality of life (QoL) is uncertain. To systematically review comparative studies investigating disease-specific QoL outcomes as assessed by validated cancer-specific patient-reported outcome measures with at least 1 yr of follow-up after primary treatment for clinically localised prostate cancer. MEDLINE, EMBASE, AMED, PsycINFO, and Cochrane Library were searched to identify relevant studies. Studies were critically appraised for the risk of bias. A narrative synthesis was undertaken. Of 11486 articles identified, 18 studies were eligible for inclusion, including three randomised controlled trials (RCTs; follow-up range: 60-72 mo) and 15 nonrandomised comparative studies (follow-up range: 12-180 mo) recruiting a total of 13604 patients. Two RCTs recruited small cohorts and only one was judged to have a low risk of bias. The quality of evidence from observational studies was low to moderate. For a follow-up of up to 6 yr, active surveillance was found to have the lowest impact on cancer-specific QoL, surgery had a negative impact on urinary and sexual function when compared with active surveillance and EBRT, and EBRT had a negative impact on bowel function when compared with active surveillance and surgery. Data from one small RCT reported that brachytherapy has a negative impact on urinary function 1 yr post-treatment, but no significant urinary toxicity was reported at 5 yr. This is the first systematic review comparing the impact of different primary treatments on cancer-specific QoL for men with clinically localised prostate cancer, using validated cancer-specific patient-reported outcome measures only. There is robust evidence that choice of primary treatment for localised prostate cancer has distinct impacts on patients' QoL. 
This should be discussed in detail with patients during pretreatment counselling. Our review of the current evidence suggests that for a period of up to 6 yr after treatment, men with localised prostate cancer who were managed with active surveillance reported high levels of quality of life (QoL). Men treated with surgery reported mainly urinary and sexual problems, while those treated with external beam radiotherapy reported mainly bowel problems. Men eligible for brachytherapy reported urinary problems up to a year after therapy, but then their QoL returned gradually to as it was before treatment. Copyright © 2017 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Zachrisson, Henrik Daae; Dearing, Eric; Lekhal, Ratib; Toppelberg, Claudio O.
2012-01-01
Associations between maternal reports of hours in child care and children’s externalizing problems at 18 and 36 months of age were examined in a population-based Norwegian sample (n = 75,271). Within a sociopolitical context of homogenously high-quality child care, there was little evidence that high quantity of care causes externalizing problems. Using conventional approaches to handling selection bias and listwise deletion for substantial attrition in this sample, more hours in care predicted higher problem levels, yet with small effect sizes. The finding, however, was not robust to using multiple imputation for missing values. Moreover, when sibling and individual fixed-effects models for handling selection bias were used, no relation between hours and problems was evident. PMID:23311645
Brownson, Ross C; Chriqui, Jamie F; Burgeson, Charlene R; Fisher, Megan C; Ness, Roberta B
2010-06-01
Childhood obesity is a serious public health problem resulting from energy imbalance (when the intake of energy is greater than the amount of energy expended through physical activity). Numerous health authorities have identified policy interventions as promising strategies for creating population-wide improvements in physical activity. This case study focuses on energy expenditure through physical activity (with a particular emphasis on school-based physical education [PE]). Policy-relevant evidence for promoting physical activity in youth may take numerous forms, including epidemiologic data and other supporting evidence (e.g., qualitative data). The implementation and evaluation of school PE interventions leads to a set of lessons related to epidemiology and evidence-based policy. These include the need to: (i) enhance the focus on external validity, (ii) develop more policy-relevant evidence on the basis of "natural experiments," (iii) understand that policy making is political, (iv) better articulate the factors that influence policy dissemination, (v) understand the real-world constraints when implementing policy in school environments, and (vi) build transdisciplinary teams for policy progress. The issues described in this case study provide leverage points for practitioners, policy makers, and researchers as they seek to translate epidemiology to policy. Copyright 2010 Elsevier Inc. All rights reserved.
Cross-Cultural Adaptation and Validation of the SWAL-QoL Questionnaire in Greek.
Georgopoulos, Voula C; Perdikogianni, Myrto; Mouskenteri, Myrto; Psychogiou, Loukia; Oikonomou, Maria; Malandraki, Georgia A
2018-02-01
The purpose of this study was to translate and adapt the 44-item SWAL-QoL into Greek and examine its internal consistency, test-retest reliability, external construct validity, and discriminant validity in order to provide a validated dysphagia-specific QoL instrument in the Greek language. The instrument was translated into Greek using the back translation to ensure linguistic validity and was culturally adapted resulting in the SWAL-QoL-GR. Two groups of participants were included: a patient group of 86 adults (48 males; age range: 18-87 years) diagnosed with oropharyngeal dysphagia, and an age-matched healthy control group (39 adults; 19 males; age range: 18-84 years). The Greek 30-item version of the WHOQOL-BREF was used for assessment of construct validity. Overall, the questionnaire achieved good to excellent psychometric values. Internal consistency of all 10 subscales and the physical symptoms scale of the SWAL-QoL-GR assessed by Cronbach's α was good to excellent (0.811 < α < 0.940). Test-retest validity was found to be good to excellent as well. In addition, moderate to strong correlations were found between seven of the ten subscales of the SWAL-QoL-GR with limited items of the WHOQOL-BREF (0.401 < ρ < 0.65), supporting good construct validity of the SWAL-QoL-GR. The SWAL-QoL-GR also correctly differentiated between patients with dysphagia and age-matched healthy controls (p < 0.001) on all 11 scales, further indicating excellent discriminant validity. Finally, no significant differences were found between the two sexes. This cultural adaptation and validation allows the use of this tool in Greece, further enhancing our clinical and scientific efforts to increase the evidence-based practice resources for dysphagia rehabilitation in Greece.
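The internal-consistency figures quoted above are Cronbach's α values. As a rough illustration of how such a coefficient is computed, the sketch below applies the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores). The function name and the toy responses are assumptions made for illustration; they are not taken from the SWAL-QoL-GR study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; not the authors' analysis code.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Transpose so that each tuple holds one item's scores across respondents.
    item_columns = list(zip(*scores))
    item_variance_sum = sum(variance(col) for col in item_columns)
    # Variance of each respondent's total (summed) score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Toy data (made up): three respondents answering three perfectly
# consistent items, and four respondents answering two unrelated items.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
unrelated = [[1, 1], [1, 2], [2, 1], [2, 2]]
alpha_consistent = cronbach_alpha(consistent)
alpha_unrelated = cronbach_alpha(unrelated)
```

With these toy inputs the perfectly consistent items give α = 1 and the unrelated items give α = 0, which brackets the 0.811 to 0.940 range reported for the subscales.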
Ensor, Joie; Riley, Richard D; Jowett, Sue; Monahan, Mark; Snell, Kym Ie; Bayliss, Susan; Moore, David; Fitzmaurice, David
2016-02-01
Unprovoked first venous thromboembolism (VTE) is defined as VTE in the absence of a temporary provoking factor such as surgery, immobility and other temporary factors. Recurrent VTE in unprovoked patients is highly prevalent, but easily preventable with oral anticoagulant (OAC) therapy. The unprovoked population is highly heterogeneous in terms of risk of recurrent VTE. The first aim of the project is to review existing prognostic models which stratify individuals by their recurrence risk, therefore potentially allowing tailored treatment strategies. The second aim is to enhance the existing research in this field, by developing and externally validating a new prognostic model for individual risk prediction, using a pooled database containing individual patient data (IPD) from several studies. The final aim is to assess the economic cost-effectiveness of the proposed prognostic model if it is used as a decision rule for resuming OAC therapy, compared with current standard treatment strategies. Standard systematic review methodology was used to identify relevant prognostic model development, validation and cost-effectiveness studies. Bibliographic databases (including MEDLINE, EMBASE and The Cochrane Library) were searched using terms relating to the clinical area and prognosis. Reviewing was undertaken by two reviewers independently using pre-defined criteria. Included full-text articles were data extracted and quality assessed. Critical appraisal of included full texts was undertaken and comparisons made of model performance. A prognostic model was developed using IPD from the pooled database of seven trials. A novel internal-external cross-validation (IECV) approach was used to develop and validate a prognostic model, with external validation undertaken in each of the trials iteratively. Given good performance in the IECV approach, a final model was developed using all trials data. 
A Markov patient-level simulation was used to consider the economic cost-effectiveness of using a decision rule (based on the prognostic model) to decide on resumption of OAC therapy (or not). Three full-text articles were identified by the systematic review. Critical appraisal identified methodological and applicability issues; in particular, all three existing models did not have external validation. To address this, new prognostic models were sought with external validation. Two potential models were considered: one for use at cessation of therapy (pre D-dimer), and one for use after cessation of therapy (post D-dimer). Model performance measured in the external validation trials showed strong calibration performance for both models. The post D-dimer model performed substantially better in terms of discrimination (c = 0.69), better separating high- and low-risk patients. The economic evaluation identified that a decision rule based on the final post D-dimer model may be cost-effective for patients with predicted risk of recurrence of over 8% annually; this suggests continued therapy for patients with predicted risks ≥ 8% and cessation of therapy otherwise. The post D-dimer model performed strongly and could be useful to predict individuals' risk of recurrence at any time up to 2-3 years, thereby aiding patient counselling and treatment decisions. A decision rule using this model may be cost-effective for informing clinical judgement and patient opinion in treatment decisions. Further research may investigate new predictors to enhance model performance and aim to further externally validate to confirm performance in new, non-trial populations. Finally, it is essential that further research is conducted to develop a model predicting bleeding risk on therapy, to manage the balance between the risks of recurrence and bleeding. This study is registered as PROSPERO CRD42013003494. The National Institute for Health Research Health Technology Assessment programme.
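The internal-external cross-validation (IECV) approach described in this abstract holds out one trial at a time, develops the model on the remaining trials, and checks performance in the held-out trial. The sketch below shows only that looping structure. The function names, the toy "model" (a pooled event rate) and the toy data are assumptions for illustration; the actual study fitted a prognostic model to individual patient data from seven trials.

```python
def internal_external_cv(studies, fit, evaluate):
    """Leave-one-study-out loop.

    studies:  dict mapping study name -> list of patient records
    fit:      function(list of records) -> model
    evaluate: function(model, list of records) -> performance metric
    Returns a dict mapping each held-out study to its metric.
    """
    results = {}
    for held_out in studies:
        # Pool every record from all studies except the held-out one.
        training = [
            record
            for name, records in studies.items()
            if name != held_out
            for record in records
        ]
        model = fit(training)
        results[held_out] = evaluate(model, studies[held_out])
    return results


def fit_event_rate(records):
    """Toy model (assumption): predict the pooled recurrence proportion."""
    return sum(r["event"] for r in records) / len(records)


def observed_minus_predicted(rate, records):
    """Toy calibration check: observed event proportion minus predicted rate."""
    observed = sum(r["event"] for r in records) / len(records)
    return observed - rate


# Made-up data: three small "trials" with binary recurrence outcomes.
toy_studies = {
    "A": [{"event": 1}, {"event": 0}],
    "B": [{"event": 0}, {"event": 0}],
    "C": [{"event": 1}, {"event": 1}],
}
iecv_results = internal_external_cv(
    toy_studies, fit_event_rate, observed_minus_predicted
)
```

Passing `fit` and `evaluate` as arguments keeps the loop independent of the modelling method, so the same structure would apply to a survival model with calibration and discrimination metrics.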
Prediction of prostate cancer in unscreened men: external validation of a risk calculator.
van Vugt, Heidi A; Roobol, Monique J; Kranse, Ries; Määttänen, Liisa; Finne, Patrik; Hugosson, Jonas; Bangma, Chris H; Schröder, Fritz H; Steyerberg, Ewout W
2011-04-01
Prediction models need external validation to assess their value beyond the setting where the model was derived from. To assess the external validity of the European Randomized study of Screening for Prostate Cancer (ERSPC) risk calculator (www.prostatecancer-riskcalculator.com) for the probability of having a positive prostate biopsy (P(posb)). The ERSPC risk calculator was based on data of the initial screening round of the ERSPC section Rotterdam and validated in 1825 and 531 men biopsied at the initial screening round in the Finnish and Swedish sections of the ERSPC respectively. P(posb) was calculated using serum prostate specific antigen (PSA), outcome of digital rectal examination (DRE), transrectal ultrasound and ultrasound assessed prostate volume. The external validity was assessed for the presence of cancer at biopsy by calibration (agreement between observed and predicted outcomes), discrimination (separation of those with and without cancer), and decision curves (for clinical usefulness). Prostate cancer was detected in 469 men (26%) of the Finnish cohort and in 124 men (23%) of the Swedish cohort. Systematic miscalibration was present in both cohorts (mean predicted probability 34% versus 26% observed, and 29% versus 23% observed, both p<0.001). The areas under the curves were 0.76 and 0.78, and substantially lower for the model with PSA only (0.64 and 0.68 respectively). The model proved clinically useful for any decision threshold compared with a model with PSA only, PSA and DRE, or biopsying all men. A limitation is that the model is based on sextant biopsies results. The ERSPC risk calculator discriminated well between those with and without prostate cancer among initially screened men, but overestimated the risk of a positive biopsy. Further research is necessary to assess the performance and applicability of the ERSPC risk calculator when a clinical setting is considered rather than a screening setting. Copyright © 2010 Elsevier Ltd. 
All rights reserved.
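The decision curves mentioned in this abstract compare strategies by their net benefit at a chosen risk threshold. A minimal sketch of the usual net benefit formula, NB = TP/n - (FP/n) × pt/(1 - pt), is given below, together with the "biopsy all men" comparator. The function names and the four-patient toy data are assumptions for illustration and are unrelated to the ERSPC cohorts.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold.

    y_true: list of 0/1 outcomes; y_prob: list of predicted probabilities.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Comparator strategy: intervene on everyone regardless of predicted risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Made-up example: four patients, two with the outcome.
outcomes = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.7]
nb_model = net_benefit(outcomes, predicted, 0.5)
nb_all = net_benefit_treat_all(outcomes, 0.5)
```

A full decision curve repeats this calculation over a range of thresholds and plots the model against the treat-all and treat-none (net benefit 0) strategies.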
Cardemil, Felipe; Esquivel, Patricia; Aguayo, Lorena; Barría, Tamara; Fuente, Adrian; Carvajal, Rocío; Fromín, Rose; Villalobos, Iván; Yueh, Bevan
2013-01-01
It is becoming increasingly important to have reliable and valid questionnaires. This becomes especially important when evaluating hearing loss. To adapt and validate the "Effectiveness of Auditory Rehabilitation" (EAR) questionnaire for the Spanish-speaking population. This instrument assesses quality of life and hearing aspects in patients using hearing aids. Cross-sectional validation study. A cultural adaptation through the use of English to Spanish translations and re-translations was carried out. The validity and reliability of the newly adapted instrument were evaluated. A total of 69 individuals (44 older adults and 25 younger adults) were examined. The pure-tone averages (PTA, 500, 1,000 and 2,000 Hz) were 47.3 dB HL and 47.1 dB HL for the left and right ears, respectively. The mean maximum speech discrimination scores in silence for monosyllables were 83.3% and 82.9% for the left and right ears, respectively. Internal consistency presented Cronbach alpha values of 0.85 and 0.77 for the internal and external dimensions, respectively. The intraclass correlation coefficients were 0.80 for the internal module and 0.85 for the external module. Construct validity reported a correlation coefficient of 0.71 at baseline and 0.76 at 3 months after the initial assessment for the internal module, and 0.62 at baseline and 0.74 at 3 months after the initial assessment for the external module. The effect sizes were 1.3 and 1.1 for the internal and external modules, respectively. The Spanish version of the EAR questionnaire seems to be a reliable and valid instrument. The evaluation of audiological aspects, as well as aspects relating to aesthetics and comfort, is the main strength of this instrument. Finally, the EAR scale is more sensitive to change than other scales. Copyright © 2013 Elsevier España, S.L. All rights reserved.
Lamain-de Ruiter, Marije; Kwee, Anneke; Naaktgeboren, Christiana A; de Groot, Inge; Evers, Inge M; Groenendaal, Floris; Hering, Yolanda R; Huisjes, Anjoke J M; Kirpestein, Cornel; Monincx, Wilma M; Siljee, Jacqueline E; Van 't Zelfde, Annewil; van Oirschot, Charlotte M; Vankan-Buitelaar, Simone A; Vonk, Mariska A A W; Wiegers, Therese A; Zwart, Joost J; Franx, Arie; Moons, Karel G M; Koster, Maria P H
2016-08-30
To perform an external validation and direct comparison of published prognostic models for early prediction of the risk of gestational diabetes mellitus, including predictors applicable in the first trimester of pregnancy. External validation of all published prognostic models in large scale, prospective, multicentre cohort study. 31 independent midwifery practices and six hospitals in the Netherlands. Women recruited in their first trimester (<14 weeks) of pregnancy between December 2012 and January 2014, at their initial prenatal visit. Women with pre-existing diabetes mellitus of any type were excluded. Discrimination of the prognostic models was assessed by the C statistic, and calibration assessed by calibration plots. 3723 women were included for analysis, of whom 181 (4.9%) developed gestational diabetes mellitus in pregnancy. 12 prognostic models for the disorder could be validated in the cohort. C statistics ranged from 0.67 to 0.78. Calibration plots showed that eight of the 12 models were well calibrated. The four models with the highest C statistics included almost all of the following predictors: maternal age, maternal body mass index, history of gestational diabetes mellitus, ethnicity, and family history of diabetes. Prognostic models had a similar performance in a subgroup of nulliparous women only. Decision curve analysis showed that the use of these four models always had a positive net benefit. In this external validation study, most of the published prognostic models for gestational diabetes mellitus show acceptable discrimination and calibration. The four models with the highest discriminative abilities in this study cohort, which also perform well in a subgroup of nulliparous women, are easy models to apply in clinical practice and therefore deserve further evaluation regarding their clinical impact. Published by the BMJ Publishing Group Limited. 
Romero, Isabella E; Toorabally, Nasreen; Burchett, Danielle; Tarescavage, Anthony M; Glassmire, David M
2017-01-01
Contemporary models of psychopathology, encompassing internalizing, externalizing, and thought dysfunction factors, have gained significant support. Although research indicates the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008/2011) measures these domains of psychopathology, this study addresses extant limitations in MMPI-2-RF diagnostic validity research by examining associations between all MMPI-2-RF substantive scales and broad dichotomous indicators of internalizing, externalizing, and thought dysfunction diagnoses in a sample of 1,110 forensic inpatients. Comparing those with and without internalizing diagnoses, notable effects were observed for Negative Emotionality/Neuroticism-Revised (NEGE-r), Emotional/Internalizing Dysfunction (EID), Dysfunctional Negative Emotions (RC7), Demoralization (RCd), and several other internalizing and somatic/cognitive scales. Comparing those with and without thought dysfunction diagnoses, the largest hypothesized differences occurred for Thought Dysfunction (THD), Aberrant Experiences (RC8), and Psychoticism-Revised (PSYC-r), although unanticipated differences were observed on internalizing and interpersonal scales, likely reflecting the high prevalence of internalizing dysfunction in forensic inpatients not experiencing thought dysfunction. Comparing those with and without externalizing diagnoses, the largest effects were for Substance Abuse (SUB), Antisocial Behavior (RC4), Behavioral/Externalizing Dysfunction (BXD), Juvenile Conduct Problems (JCP), and Disconstraint-Revised (DISC-r). Multivariate models evidenced similar results. Findings support the construct validity of MMPI-2-RF scales as measures of internalizing, thought, and externalizing dysfunction.
McClelland, Robyn L; Jorgensen, Neal W; Budoff, Matthew; Blaha, Michael J; Post, Wendy S; Kronmal, Richard A; Bild, Diane E; Shea, Steven; Liu, Kiang; Watson, Karol E; Folsom, Aaron R; Khera, Amit; Ayers, Colby; Mahabadi, Amir-Abbas; Lehmann, Nils; Jöckel, Karl-Heinz; Moebus, Susanne; Carr, J Jeffrey; Erbel, Raimund; Burke, Gregory L
2015-10-13
Several studies have demonstrated the tremendous potential of using coronary artery calcium (CAC) in addition to traditional risk factors for coronary heart disease (CHD) risk prediction. However, to date, no risk score incorporating CAC has been developed. The goal of this study was to derive and validate a novel risk score to estimate 10-year CHD risk using CAC and traditional risk factors. Algorithm development was conducted in the MESA (Multi-Ethnic Study of Atherosclerosis), a prospective community-based cohort study of 6,814 participants age 45 to 84 years, who were free of clinical heart disease at baseline and followed for 10 years. MESA is sex balanced and included 39% non-Hispanic whites, 12% Chinese Americans, 28% African Americans, and 22% Hispanic Americans. External validation was conducted in the HNR (Heinz Nixdorf Recall Study) and the DHS (Dallas Heart Study). Inclusion of CAC in the MESA risk score offered significant improvements in risk prediction (C-statistic 0.80 vs. 0.75; p < 0.0001). External validation in both the HNR and DHS studies provided evidence of very good discrimination and calibration. Harrell's C-statistic was 0.779 in HNR and 0.816 in DHS. Additionally, the difference in estimated 10-year risk between events and nonevents was approximately 8% to 9%, indicating excellent discrimination. Mean calibration, or calibration-in-the-large, was excellent for both studies, with average predicted 10-year risk within one-half of a percent of the observed event rate. An accurate estimate of 10-year CHD risk can be obtained using traditional risk factors and CAC. The MESA risk score, which is available online on the MESA web site for easy use, can be used to aid clinicians when communicating risk to patients and when determining risk-based treatment strategies. Copyright © 2015 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.
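The external validation measures named in this abstract (a C-statistic, the difference in mean estimated risk between events and nonevents, and calibration-in-the-large) can be sketched as follows. This is a simplified illustration for a binary outcome: the study reports Harrell's C-statistic for time-to-event data, which also has to handle censoring, and that is not attempted here. All function names and the toy data are assumptions.

```python
from statistics import mean


def c_statistic(y_true, y_prob):
    """Proportion of event/nonevent pairs in which the event has the higher
    predicted risk (ties count one half). Binary-outcome simplification."""
    event_risks = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevent_risks = [p for y, p in zip(y_true, y_prob) if y == 0]
    concordant = 0.0
    for pe in event_risks:
        for pn in nonevent_risks:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(event_risks) * len(nonevent_risks))


def risk_difference(y_true, y_prob):
    """Mean predicted risk among events minus mean among nonevents."""
    event_risks = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevent_risks = [p for y, p in zip(y_true, y_prob) if y == 0]
    return mean(event_risks) - mean(nonevent_risks)


def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate (0 means perfect)."""
    return mean(y_prob) - mean(y_true)


# Made-up example: four participants, two with an event.
outcomes = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.7]
c_value = c_statistic(outcomes, predicted)
slope = risk_difference(outcomes, predicted)
citl = calibration_in_the_large(outcomes, predicted)
```

In the toy data the model over-predicts by 10 percentage points on average, whereas the abstract reports average predicted 10-year risk within one-half of a percent of the observed event rate.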
Tran, Alexandre; Matar, Maher; Steyerberg, Ewout W; Lampron, Jacinthe; Taljaard, Monica; Vaillancourt, Christian
2017-04-13
Hemorrhage is a major cause of early mortality following a traumatic injury. The progression and consequences of significant blood loss occur quickly as death from hemorrhagic shock or exsanguination often occurs within the first few hours. The mainstay of treatment therefore involves early identification of patients at risk for hemorrhagic shock in order to provide blood products and control of the bleeding source if necessary. The intended scope of this review is to identify and assess combinations of predictors informing therapeutic decision-making for clinicians during the initial trauma assessment. The primary objective of this systematic review is to identify and critically assess any existing multivariable models predicting significant traumatic hemorrhage that requires intervention, defined as a composite outcome comprising massive transfusion, surgery for hemostasis, or angiography with embolization for the purpose of external validation or updating in other study populations. If no suitable existing multivariable models are identified, the secondary objective is to identify candidate predictors to inform the development of a new prediction rule. We will search the EMBASE and MEDLINE databases for all randomized controlled trials and prospective and retrospective cohort studies developing or validating predictors of intervention for traumatic hemorrhage in adult patients 16 years of age or older. Eligible predictors must be available to the clinician during the first hour of trauma resuscitation and may be clinical, lab-based, or imaging-based. Outcomes of interest include the need for surgical intervention, angiographic embolization, or massive transfusion within the first 24 h. Data extraction will be performed independently by two reviewers. Items for extraction will be based on the CHARMS checklist. We will evaluate any existing models for relevance, quality, and the potential for external validation and updating in other populations. 
Relevance will be described in terms of appropriateness of outcomes and predictors. Quality criteria will include variable selection strategies, adequacy of sample size, handling of missing data, validation techniques, and measures of model performance. This systematic review will describe the availability of multivariable prediction models and summarize evidence regarding predictors that can be used to identify the need for intervention in patients with traumatic hemorrhage. PROSPERO CRD42017054589.
Natsuaki, Misaki N.; Ge, Xiaojia; Reiss, David; Neiderhiser, Jenae M.
2011-01-01
This study investigated the prospective links between sibling aggression and the development of externalizing problems using a multilevel modeling approach with a genetically sensitive design. The sample consisted of 780 adolescents (390 sibling pairs) who participated in two waves of the Nonshared Environment for Adolescent Development (NEAD) project. Sibling pairs with varying degrees of genetic relatedness, including monozygotic twins, dizygotic twins, full siblings, half siblings, and genetically unrelated siblings, were included. The results showed that sibling aggression at Time 1 was significantly associated with the focal child’s externalizing problems at Time 2 after accounting for the intra-class correlations between siblings. Sibling aggression remained a significant predictor of subsequent externalizing problems even after controlling for the levels of pre-existing externalizing problems and mothers’ punitive parenting. This pattern of results was fairly robust across models using different informants. The findings provide converging evidence for the unique contribution of sibling aggression in understanding changes in externalizing problems during adolescence. PMID:19586176
van Abbema, Renske; Bielderman, Annemiek; De Greef, Mathieu; Hobbelen, Hans; Krijnen, Wim; van der Schans, Cees
2015-09-01
To develop and psychometrically test the Groningen Ageing Resilience Inventory. Ageing is a process that is often accompanied by functional limitation, disabilities and losses. Instead of focusing on these negative events of ageing, there are opportunities in focusing on adaptation mechanisms, like resilience, that help people cope with those adversities. Cross-sectional study. The study was conducted from 2011-2012. First, a conceptual model of resilience during the ageing process was constructed. Next, items were formulated that made up a comprehensive template questionnaire reflecting the model. Finally, a cross-sectional study was performed to evaluate the construct validity and internal consistency of this template 16-item questionnaire. Participants (N = 229) with a mean age of 71·5 years completed the template 16-item Groningen Ageing Resilience Inventory, as well as performance-based tests and psychological questionnaires. Exploratory factor analysis resulted in a two-factor solution of internal and external resources of resilience. Three items did not discriminate well between the two factors and were deleted, leaving a final 13-item questionnaire that shows evidence of good internal consistency. The direction and magnitude of the correlations with other measures support the construct validity. The Groningen Ageing Resilience Inventory is a useful instrument that can help nurses, other healthcare workers, researchers and providers of informal care to identify the internal and external resources of resilience in individuals and groups. In a multidisciplinary biopsychosocial approach this knowledge provides tools for empowering older patients in performing health-promoting behaviors and self-care tasks. © 2015 John Wiley & Sons Ltd.
Identification of Distinct Psychosis Biotypes Using Brain-Based Biomarkers
Clementz, Brett A.; Sweeney, John A.; Hamm, Jordan P.; Ivleva, Elena I.; Ethridge, Lauren E.; Pearlson, Godfrey D.; Keshavan, Matcheri S.; Tamminga, Carol A.
2017-01-01
Objective: Clinical phenomenology remains the primary means for classifying psychoses despite considerable evidence that this method incompletely captures biologically meaningful differentiations. Rather than relying on clinical diagnoses as the gold standard, this project drew on neurobiological heterogeneity among psychosis cases to delineate subgroups independent of their phenomenological manifestations. Method: A large biomarker panel (neuropsychological, stop signal, saccadic control, and auditory stimulation paradigms) characterizing diverse aspects of brain function was collected on individuals with schizophrenia, schizoaffective disorder, and bipolar disorder with psychosis (N=711), their first-degree relatives (N=883), and demographically comparable healthy subjects (N=278). Biomarker variance across paradigms was exploited to create nine integrated variables that were used to capture neurobiological variance among the psychosis cases. Data on external validating measures (social functioning, structural magnetic resonance imaging, family biomarkers, and clinical information) were collected. Results: Multivariate taxometric analyses identified three neurobiologically distinct psychosis biotypes that did not respect clinical diagnosis boundaries. The same analysis procedure using clinical DSM diagnoses as the criteria was best described by a single severity continuum (schizophrenia worse than schizoaffective disorder worse than bipolar psychosis); this was not the case for biotypes. The external validating measures supported the distinctiveness of these subgroups compared with clinical diagnosis, highlighting a possible advantage of neurobiological versus clinical categorization schemes for differentiating psychotic disorders.
Conclusions: These data illustrate how multiple pathways may lead to clinically similar psychosis manifestations, and they provide explanations for the marked heterogeneity observed across laboratories on the same biomarker variables when DSM diagnoses are used as the gold standard. PMID:26651391
Overgaard-Steensen, Christian; Larsson, Anders; Bluhme, Henrik; Tønnesen, Else; Frøkiaer, Jørgen; Ring, Troels
2010-01-01
Acute hyponatremia is a serious condition, which poses major challenges. Of particular importance is what determines plasma sodium concentration ([Na(+)]). Edelman introduced an explicit model to describe plasma [Na(+)] in a population as [Na(+)] = alpha.(exchangeable Na(+) + exchangeable K(+))/(total body water) - beta. Evidence for the clinical utility of the model in the individual and in acute hyponatremia is sparse. We, therefore, investigated how the measured plasma [Na(+)] could be predicted in a porcine model of hyponatremia. Plasma [Na(+)] was estimated from in vivo-determined balances of water, Na(+), and K(+), according to Edelman's equation. Acute hyponatremia was induced with desmopressin acetate and infusion of a 2.5% glucose solution in anesthetized pigs. During 480 min, plasma [Na(+)] and osmolality were reduced from 136 (SD 2) to 120 mmol/l (SD 3) and from 284 (SD 4) to 252 mosmol/kgH(2)O (SD 5), respectively. The following interpretations were made. First, Edelman's model, which, besides dilution, takes into account Na(+) and K(+), fits plasma [Na(+)] significantly better than dilution alone. Second, a common value of alpha = 1.33 (SD 0.08) and beta = -13.04 mmol/l (SD 7.68) for all pigs explains well the plasma [Na(+)] in the individual animal. Third, measured exchangeable Na(+) and calculated exchangeable Na(+) + K(+) per weight in the pigs are close to Edelman's findings in humans, whereby the methods are cross-validated. In conclusion, plasma [Na(+)] can be explained in the individual animal by external balances, according to Edelman's construct in acute hyponatremia.
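The Edelman relation quoted in this abstract, [Na(+)] = alpha.(exchangeable Na(+) + exchangeable K(+))/(total body water) - beta, can be written as a small function. This is only a sketch of the formula as stated: alpha and beta are passed in by the caller, the function name is hypothetical, and the numbers in the usage lines are made-up illustration values, not data from the study.

```python
def edelman_plasma_sodium(exchangeable_na, exchangeable_k, total_body_water, alpha, beta):
    """Plasma [Na+] in mmol/l from Edelman's relation as written in the abstract.

    exchangeable_na, exchangeable_k: exchangeable cation pools in mmol
    total_body_water: litres
    alpha, beta: model coefficients supplied by the caller
    """
    if total_body_water <= 0:
        raise ValueError("total body water must be positive")
    return alpha * (exchangeable_na + exchangeable_k) / total_body_water - beta


# Illustration only: hypothetical pools and coefficients (alpha = 1, beta = 0).
baseline = edelman_plasma_sodium(2800.0, 2800.0, 40.0, alpha=1.0, beta=0.0)
# Adding 4 l of water with unchanged cation pools lowers the predicted [Na+].
after_water_load = edelman_plasma_sodium(2800.0, 2800.0, 44.0, alpha=1.0, beta=0.0)
```

With the cation pools held fixed, the function reduces to pure dilution; changing `exchangeable_na` or `exchangeable_k` shows the extra terms that, according to the abstract, fit plasma [Na(+)] better than dilution alone.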
Hodgson, Luke Eliot; Sarnowski, Alexander; Roderick, Paul J; Dimitrov, Borislav D; Venn, Richard M; Forni, Lui G
2017-09-27
Critically appraise prediction models for hospital-acquired acute kidney injury (HA-AKI) in general populations. Systematic review. Medline, Embase and Web of Science until November 2016. Studies describing development of a multivariable model for predicting HA-AKI in non-specialised adult hospital populations. Published guidance was followed for data extraction, reporting and appraisal. 14 046 references were screened. Of 53 HA-AKI prediction models, 11 met inclusion criteria (general medicine and/or surgery populations, 474 478 patient episodes) and five were externally validated. The most common predictors were age (n=9 models), diabetes (5), admission serum creatinine (SCr) (5), chronic kidney disease (CKD) (4), drugs (diuretics (4) and/or ACE inhibitors/angiotensin-receptor blockers (3)), bicarbonate and heart failure (4 models each). Heterogeneity was identified for outcome definition. Deficiencies in reporting included handling of predictors, missing data and sample size. Admission SCr was frequently taken to represent baseline renal function. Most models were considered at high risk of bias. Area under the receiver operating characteristic curves to predict HA-AKI ranged 0.71-0.80 in derivation (reported in 8/11 studies), 0.66-0.80 for internal validation studies (n=7) and 0.65-0.71 in five external validations. For calibration, the Hosmer-Lemeshow test or a calibration plot was provided in 4/11 derivations, 3/11 internal and 3/5 external validations. A minority of the models allow easy bedside calculation and potential electronic automation. No impact analysis studies were found. AKI prediction models may help address shortcomings in risk assessment; however, in general hospital populations, few have external validation. Similar predictors reflect an elderly demographic with chronic comorbidities. Reporting deficiencies mirror prediction research more broadly, with handling of SCr (baseline function and use as a predictor) a concern.
Future research should focus on validation, exploration of electronic linkage and impact analysis. The latter could combine a prediction model with AKI alerting to address prevention and early recognition of evolving AKI. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
QSAR Modeling of Rat Acute Toxicity by Oral Exposure
Zhu, Hao; Martin, Todd M.; Ye, Lin; Sedykh, Alexander; Young, Douglas M.; Tropsha, Alexander
2009-01-01
Few Quantitative Structure-Activity Relationship (QSAR) studies have successfully modeled large, diverse rodent toxicity endpoints. In this study, a comprehensive dataset of 7,385 compounds with their most conservative lethal dose (LD50) values has been compiled. A combinatorial QSAR approach has been employed to develop robust and predictive models of acute toxicity in rats caused by oral exposure to chemicals. To enable fair comparison between the predictive power of models generated in this study versus a commercial toxicity predictor, TOPKAT (Toxicity Prediction by Komputer Assisted Technology), a modeling subset of the entire dataset was selected that included all 3,472 compounds used in the TOPKAT’s training set. The remaining 3,913 compounds, which were not present in the TOPKAT training set, were used as the external validation set. QSAR models of five different types were developed for the modeling set. The prediction accuracy for the external validation set was estimated by determination coefficient R2 of linear regression between actual and predicted LD50 values. The use of the applicability domain threshold implemented in most models generally improved the external prediction accuracy but expectedly led to the decrease in chemical space coverage; depending on the applicability domain threshold, R2 ranged from 0.24 to 0.70. Ultimately, several consensus models were developed by averaging the predicted LD50 for every compound using all 5 models. The consensus models afforded higher prediction accuracy for the external validation dataset with the higher coverage as compared to individual constituent models. The validated consensus LD50 models developed in this study can be used as reliable computational predictors of in vivo acute toxicity. PMID:19845371
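The consensus step and the accuracy measure described in this abstract (averaging the predicted LD50 over all models for every compound, then computing R2 of the linear regression between actual and predicted values) can be sketched as follows. The function names and toy numbers are hypothetical, and the sketch leaves out the applicability domain filtering that the study applies.

```python
def consensus_prediction(per_model_predictions):
    """Average the predictions of several models, compound by compound.

    per_model_predictions: one list of predicted values per model, all the same length.
    """
    if not per_model_predictions:
        raise ValueError("need at least one model")
    n_compounds = len(per_model_predictions[0])
    if any(len(p) != n_compounds for p in per_model_predictions):
        raise ValueError("every model must predict the same compounds")
    return [sum(values) / len(values) for values in zip(*per_model_predictions)]


def r_squared(actual, predicted):
    """R2 of a simple linear regression of actual on predicted (squared Pearson r)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    s_aa = sum((a - mean_a) ** 2 for a in actual)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    if s_aa == 0 or s_pp == 0:
        raise ValueError("R2 is undefined when either series is constant")
    return s_ap ** 2 / (s_aa * s_pp)


# Hypothetical toy values standing in for LD50 predictions from two models.
toy_models = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
toy_actual = [1.0, 2.0, 3.0]
toy_consensus = consensus_prediction(toy_models)
toy_r2 = r_squared(toy_actual, toy_consensus)
```

Because this R2 is the squared correlation, it rewards predictions that track the ranking and spacing of the actual values even when they are offset by a constant, as in the toy example.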
Petersen, Japke F; Stuiver, Martijn M; Timmermans, Adriana J; Chen, Amy; Zhang, Hongzhen; O'Neill, James P; Deady, Sandra; Vander Poorten, Vincent; Meulemans, Jeroen; Wennerberg, Johan; Skroder, Carl; Day, Andrew T; Koch, Wayne; van den Brekel, Michiel W M
2018-05-01
TNM classification inadequately estimates patient-specific overall survival (OS). We aimed to improve this by developing a risk-prediction model for patients with advanced larynx cancer. Cohort study. We developed a risk prediction model to estimate the 5-year OS rate based on a cohort of 3,442 patients with T3T4N0N+M0 larynx cancer. The model was internally validated using bootstrapping samples and externally validated on patient data from five external centers (n = 770). The main outcome was performance of the model as tested by discrimination, calibration, and the ability to distinguish risk groups based on tertiles from the derivation dataset. The model performance was compared to a model based on T and N classification only. We included age, gender, T and N classification, and subsite as prognostic variables in the standard model. After external validation, the standard model had a significantly better fit than a model based on T and N classification alone (C statistic, 0.59 vs. 0.55, P < .001). The model was able to distinguish well among three risk groups based on tertiles of the risk score. Adding treatment modality to the model did not decrease the predictive power. As a post hoc analysis, we tested the added value of comorbidity as scored by American Society of Anesthesiologists score in a subsample, which increased the C statistic to 0.68. A risk prediction model for patients with advanced larynx cancer, consisting of readily available clinical variables, gives more accurate estimates of the 5-year survival rate than a model based on T and N classification alone. Level of evidence: 2c. Laryngoscope, 128:1140-1145, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
Tomoaia-Cotisel, Andrada; Scammon, Debra L.; Waitzman, Norman J.; Cronholm, Peter F.; Halladay, Jacqueline R.; Driscoll, David L.; Solberg, Leif I.; Hsu, Clarissa; Tai-Seale, Ming; Hiratsuka, Vanessa; Shih, Sarah C.; Fetters, Michael D.; Wise, Christopher G.; Alexander, Jeffrey A.; Hauser, Diane; McMullen, Carmit K.; Scholle, Sarah Hudson; Tirodkar, Manasi A.; Schmidt, Laura; Donahue, Katrina E.; Parchman, Michael L.; Stange, Kurt C.
2013-01-01
PURPOSE We aimed to advance the internal and external validity of research by sharing our empirical experience and recommendations for systematically reporting contextual factors. METHODS Fourteen teams conducting research on primary care practice transformation retrospectively considered contextual factors important to interpreting their findings (internal validity) and transporting or reinventing their findings in other settings/situations (external validity). Each team provided a table or list of important contextual factors and interpretive text included as appendices to the articles in this supplement. Team members identified the most important contextual factors for their studies. We grouped the findings thematically and developed recommendations for reporting context. RESULTS The most important contextual factors sorted into 5 domains: (1) the practice setting, (2) the larger organization, (3) the external environment, (4) implementation pathway, and (5) the motivation for implementation. To understand context, investigators recommend (1) engaging diverse perspectives and data sources, (2) considering multiple levels, (3) evaluating history and evolution over time, (4) looking at formal and informal systems and culture, and (5) assessing the (often nonlinear) interactions between contextual factors and both the process and outcome of studies. We include a template with tabular and interpretive elements to help study teams engage research participants in reporting relevant context. CONCLUSIONS These findings demonstrate the feasibility and potential utility of identifying and reporting contextual factors. Involving diverse stakeholders in assessing context at multiple stages of the research process, examining their association with outcomes, and consistently reporting critical contextual factors are important challenges for a field interested in improving the internal and external validity and impact of health care research. PMID:23690380
Quantitative structure-activity relationship modeling of rat acute toxicity by oral exposure.
Zhu, Hao; Martin, Todd M; Ye, Lin; Sedykh, Alexander; Young, Douglas M; Tropsha, Alexander
2009-12-01
Few quantitative structure-activity relationship (QSAR) studies have successfully modeled large, diverse rodent toxicity end points. In this study, a comprehensive data set of 7385 compounds with their most conservative lethal dose (LD(50)) values has been compiled. A combinatorial QSAR approach has been employed to develop robust and predictive models of acute toxicity in rats caused by oral exposure to chemicals. To enable fair comparison between the predictive power of models generated in this study versus a commercial toxicity predictor, TOPKAT (Toxicity Prediction by Komputer Assisted Technology), a modeling subset of the entire data set was selected that included all 3472 compounds used in TOPKAT's training set. The remaining 3913 compounds, which were not present in the TOPKAT training set, were used as the external validation set. QSAR models of five different types were developed for the modeling set. The prediction accuracy for the external validation set was estimated by determination coefficient R(2) of linear regression between actual and predicted LD(50) values. The use of the applicability domain threshold implemented in most models generally improved the external prediction accuracy but expectedly led to the decrease in chemical space coverage; depending on the applicability domain threshold, R(2) ranged from 0.24 to 0.70. Ultimately, several consensus models were developed by averaging the predicted LD(50) for every compound using all five models. The consensus models afforded higher prediction accuracy for the external validation data set with the higher coverage as compared to individual constituent models. The validated consensus LD(50) models developed in this study can be used as reliable computational predictors of in vivo acute toxicity.
Is protective equipment useful in preventing concussion? A systematic review of the literature.
Benson, B W; Hamilton, G M; Meeuwisse, W H; McCrory, P; Dvorak, J
2009-05-01
To determine if there is evidence that equipment use reduces sport concussion risk and/or severity. 12 electronic databases were searched using a combination of Medical Subject Headings and text words to identify relevant articles. Specific inclusion and exclusion criteria were used to select studies for review. Data extracted included design, study population, exposure/outcome measures and results. The quality of evidence was assessed based on epidemiologic criteria regarding internal and external validity (ie, strength of design, sample size/power calculation, selection bias, misclassification bias, control of potential confounding and effect modification). In total, 51 studies were selected for review. A comparison between studies was difficult due to the variability in research designs, definition of concussion, mouthguard/helmet/headgear/face shield types, measurements used to assess exposure and outcomes, and variety of sports assessed. The majority of studies were observational, with 23 analytical epidemiologic designs related to the subject area. Selection bias was a concern in the reviewed studies, as was the lack of measurement and control for potentially confounding variables. There is evidence that helmet use reduces head injury risk in skiing, snowboarding and bicycling, but the effect on concussion risk is inconclusive. No strong evidence exists for the use of mouthguards or face shields to reduce concussion risk. Evidence is provided to suggest that full facial protection in ice hockey may reduce concussion severity, as measured by time loss from competition.
Phase 4 Studies in Heart Failure - What is Done and What is Needed?
Iyngkaran, Pupalan; Liew, Danny; McDonald, Peter; Thomas, Merlin C.; Reid, Christopher; Chew, Derek; Hare, David L.
2016-01-01
Congestive heart failure (CHF) therapeutics is generated through a well-described evidence generating process. Phases 1 – 3 of this process are required prior to approval and widespread clinical use. Phase 3 in almost all cases is a methodologically sound randomized controlled trial (RCT). After this phase it is generally accepted that the treatment has a significant, independent and prognostically beneficial effect on the pathophysiological process. A major criticism of RCTs concerns the population to whom the result is applicable. When this population is significantly different from the trial cohort the external validity comes into question. Should the evidence generating process continue, these problems might be identified. Post marketing surveillance through phase 4 and comparative effectiveness studies through phase 5 trials are often underperformed in comparison to the RCT. These processes can help identify remote adverse events and define new hypotheses for community level benefits. This review is aimed at exploring the post-marketing scene for CHF therapeutics from an Australian health system perspective. We explore the phases of clinical trials, the level of evidence currently available and options for ensuring greater accountability for community level CHF clinical outcomes. PMID:27280303
Educational testing validity and reliability in pharmacy and medical education literature.
Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J
2013-12-16
To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals to medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) reported no validity evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education compared to medical education, validity and reliability reporting were limited in the pharmacy education literature.
ERIC Educational Resources Information Center
Whaley, Arthur L.
2018-01-01
Over the past two decades, there have been significant advances in stereotype threat research on African Americans. The current article reviews general issues of internal validity and external validity (or generalizability) beyond college laboratories in stereotype threat studies, and as they are revealed specifically in the context of advances in…
ERIC Educational Resources Information Center
Mihura, Joni L.; Meyer, Gregory J.; Dumitrascu, Nicolae; Bombel, George
2013-01-01
We systematically evaluated the peer-reviewed Rorschach validity literature for the 65 main variables in the popular Comprehensive System (CS). Across 53 meta-analyses examining variables against externally assessed criteria (e.g., observer ratings, psychiatric diagnosis), the mean validity was r = 0.27 (k = 770) as compared to r = 0.08 (k = 386)…
Dearing, Chey G; Kilburn, Sally; Lindsay, Kevin S
2014-03-01
Sperm counts have been linked to several fertility outcomes making them an essential parameter of semen analysis. It has become increasingly recognised that Computer-Assisted Semen Analysis (CASA) provides improved precision over manual methods but that systems are seldom validated robustly for use. The objective of this study was to gather the evidence to validate or reject the Sperm Class Analyser (SCA) as a tool for routine sperm counting in a busy laboratory setting. The criteria examined were comparison with the Improved Neubauer and Leja 20-μm chambers, within and between field precision, sperm concentration linearity from a stock diluted in semen and media, accuracy against internal and external quality material, assessment of uneven flow effects and a receiver operating characteristic (ROC) analysis to predict fertility in comparison with the Neubauer method. This work demonstrates that SCA CASA technology is not a standalone 'black box', but rather a tool for well-trained staff that allows rapid, high-number sperm counting providing errors are identified and corrected. The system will produce accurate, linear, precise results, with less analytical variance than manual methods that correlate well against the Improved Neubauer chamber. The system provides superior predictive potential for diagnosing fertility problems.
Haley, David W
2011-09-01
The current study examined whether the psychological stress of the still-face (SF) task (i.e. stress resulting from a parent's unresponsiveness) is a valid laboratory stress paradigm for evaluating infant cortisol reactivity. Given that factors external to the experimental paradigm, such as arriving at a new place, may cause an elevation in cortisol secretion, we tested the hypothesis that infants would show a cortisol response to the SF task but not to a normal face-to-face (FF) task (control). Saliva was collected for cortisol measurement from 6-month-old infants (n = 31) randomly assigned to either a repeated SF task or to a continuous FF task. Parent-infant dyads were videotaped. Salivary cortisol concentration was measured at baseline and at 20 and 30 min after the start of the procedure. Infant salivary cortisol concentrations showed a significant increase over time for the SF task but not for the FF task. The results provide new evidence that the repeated SF task provides a psychological challenge that is due to the SF condition rather than to some non-task related factor; these results provide internal validity for the paradigm. The study offers new insight into the role of parent-infant interactions in the activation of the infant stress response system.
NASA Astrophysics Data System (ADS)
Cánovas-García, Fulgencio; Alonso-Sarría, Francisco; Gomariz-Castillo, Francisco; Oñate-Valdivieso, Fernando
2017-06-01
Random forest is a classification technique widely used in remote sensing. One of its advantages is that it produces an estimation of classification accuracy based on the so-called out-of-bag cross-validation method. It is usually assumed that such estimation is not biased and may be used instead of validation based on an external data-set or a cross-validation external to the algorithm. In this paper we show that this is not necessarily the case when classifying remote sensing imagery using training areas with several pixels or objects. According to our results, out-of-bag cross-validation clearly overestimates accuracy, both overall and per class. The reason is that, in a training patch, pixels or objects are not independent (from a statistical point of view) of each other; however, they are split by bootstrapping into in-bag and out-of-bag as if they were really independent. We believe that putting a whole patch, rather than individual pixels/objects, in one or the other set would produce a less biased out-of-bag cross-validation. To deal with the problem, we propose a modification of the random forest algorithm to split training patches instead of the pixels (or objects) that compose them. This modified algorithm does not overestimate accuracy and has no lower predictive capability than the original. When its results are validated with an external data-set, the accuracy is not different from that obtained with the original algorithm. We analysed three remote sensing images with different classification approaches (pixel and object based); in the three cases reported, the modification we propose produces a less biased accuracy estimation.
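The sampling idea behind the proposed modification, bootstrapping whole training patches so that no patch contributes pixels to both the in-bag and out-of-bag sets, can be sketched as below. This is an illustrative assumption about how such a split might look for a single tree, not the authors' implementation; the function name, patch labels, and seed are hypothetical.

```python
import random
from collections import defaultdict


def patch_bootstrap(patch_ids, rng):
    """Draw one bootstrap sample of whole patches.

    patch_ids[i] is the training patch that pixel (or object) i belongs to.
    Returns (in_bag, out_of_bag) as lists of pixel indices.
    """
    members = defaultdict(list)
    for index, patch in enumerate(patch_ids):
        members[patch].append(index)
    patches = sorted(members)
    # Sample patches with replacement, as many draws as there are patches.
    drawn = [rng.choice(patches) for _ in patches]
    drawn_set = set(drawn)
    # A patch drawn k times contributes all of its pixels k times.
    in_bag = [i for patch in drawn for i in members[patch]]
    # Pixels are out-of-bag only if their whole patch was never drawn.
    out_of_bag = [i for patch in patches if patch not in drawn_set for i in members[patch]]
    return in_bag, out_of_bag


# Hypothetical toy layout: ten pixels grouped into four training patches.
toy_patch_ids = ["a", "a", "a", "b", "b", "c", "c", "c", "d", "d"]
in_bag, out_of_bag = patch_bootstrap(toy_patch_ids, random.Random(0))
```

Under a pixel-level bootstrap, neighbouring pixels of the same patch can land on both sides of the split, which is the dependence the abstract identifies as the source of the optimistic out-of-bag estimate.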
Amariles, Pedro; Pino-Marín, Daniel; Sabater-Hernández, Daniel; García-Jiménez, Emilio; Roig-Sánchez, Inés; Faus, María José
2016-11-01
To determine the test-retest reliability of a questionnaire, with preliminary validation, that assesses knowledge of cardiovascular risk (CVR) and cardiovascular disease in patients attending community pharmacies in Spain, and to complement its external validity by establishing the relationship between an educational activity and the increase in knowledge about CVR and cardiovascular disease. Sub-analysis of a controlled clinical study, EMDADER-CV, in which a questionnaire about knowledge concerning CVR was applied at 4 different times. Spanish community pharmacies. Of the 640 patients who completed the study, 323 were in the control group. Intraclass correlation coefficient to assess the reliability in 3 comparisons (post-educational activity with week 16, post-educational activity with week 32, and week 16 with week 32); and the non-parametric Friedman test to establish the relationship between an oral and written educational activity and increasing knowledge. For the 323 patients in the 3 comparisons, the intraclass correlation coefficient values were 0.624, 0.608, and 0.801, respectively (fair-good to excellent reliability). In addition, the Friedman test showed a statistically significant relationship between educational activity and increased knowledge (p < .0001). According to the intraclass correlation coefficient, the questionnaire aimed at assessing knowledge of CVR and cardiovascular disease has a reliability between acceptable and excellent, which, added to the previous validation, shows that the instrument meets the criteria of validity and reliability. Furthermore, the questionnaire showed the ability to relate an increase in knowledge to an educational intervention, a feature that complements its external validity. Copyright © 2016 Elsevier España, S.L.U. All rights reserved.
Risk prediction models of breast cancer: a systematic review of model performances.
Anothaisintawee, Thunyarat; Teerawattananon, Yot; Wiratkapun, Chollathip; Kasamesup, Vijj; Thakkinstian, Ammarin
2012-05-01
An increasing number of risk prediction models have been developed for estimating breast cancer risk in individual women. However, the performance of those models is questionable. We therefore conducted a study with the aim of systematically reviewing previous risk prediction models. The results from this review help to identify the most reliable model and indicate the strengths and weaknesses of each model for guiding future model development. We searched MEDLINE (PubMed) from 1949 and EMBASE (Ovid) from 1974 until October 2010. Observational studies which constructed models using regression methods were selected. Information about model development and performance was extracted. Twenty-five out of 453 studies were eligible. Of these, 18 developed prediction models and 7 validated existing prediction models. Up to 13 variables were included in the models and sample sizes for each study ranged from 550 to 2,404,636. Internal validation was performed in four models, while five models had external validation. The Gail and the Rosner and Colditz models were the significant models, which were subsequently modified by other scholars. Calibration performance of most models was fair to good (expected/observed ratio: 0.87-1.12), but discriminatory accuracy was poor to fair both in internal validation (concordance statistics: 0.53-0.66) and in external validation (concordance statistics: 0.56-0.63). Most models yielded relatively poor discrimination in both internal and external validation. This poor discriminatory accuracy of existing models might be because of a lack of knowledge about risk factors, heterogeneous subtypes of breast cancer, and different distributions of risk factors across populations. In addition, the concordance statistic itself is insensitive to improvements in discrimination. Therefore, newer methods such as the net reclassification index should be considered when evaluating the improvement in performance of a newly developed model.
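The two performance measures quoted in this abstract, the expected/observed ratio (calibration) and the concordance statistic (discrimination), can be sketched for a binary outcome as follows. This is a generic illustration, not code from the review; the function names and the four toy risk values are assumptions made for the example.

```python
def concordance_statistic(risks, outcomes):
    """Fraction of case/non-case pairs in which the case has the higher
    predicted risk; ties count as half-concordant."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    score = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                score += 1.0
            elif case_risk == control_risk:
                score += 0.5
    return score / (len(cases) * len(controls))

def expected_observed_ratio(risks, outcomes):
    """Sum of predicted risks divided by the number of observed events."""
    return sum(risks) / sum(outcomes)

# Toy data (hypothetical): predicted risks and observed outcomes (1 = event).
risks = [0.10, 0.40, 0.35, 0.80]
outcomes = [0, 0, 1, 1]
c_stat = concordance_statistic(risks, outcomes)
eo_ratio = expected_observed_ratio(risks, outcomes)
```

A concordance statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is why the reported range of 0.53-0.66 is described as poor to fair.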
Roozenbeek, Bob; Lingsma, Hester F.; Lecky, Fiona E.; Lu, Juan; Weir, James; Butcher, Isabella; McHugh, Gillian S.; Murray, Gordon D.; Perel, Pablo; Maas, Andrew I.R.; Steyerberg, Ewout W.
2012-01-01
Objective The International Mission on Prognosis and Analysis of Clinical Trials (IMPACT) and Corticoid Randomisation After Significant Head injury (CRASH) prognostic models predict outcome after traumatic brain injury (TBI) but have not been compared in large datasets. The objective of this study is to validate externally and compare the IMPACT and CRASH prognostic models for prediction of outcome after moderate or severe TBI. Design External validation study. Patients We considered 5 new datasets with a total of 9036 patients, comprising three randomized trials and two observational series, containing prospectively collected individual TBI patient data. Measurements Outcomes were mortality and unfavourable outcome, based on the Glasgow Outcome Score (GOS) at six months after injury. To assess performance, we studied the discrimination of the models (by AUCs), and calibration (by comparison of the mean observed to predicted outcomes and calibration slopes). Main Results The highest discrimination was found in the TARN trauma registry (AUCs between 0.83 and 0.87), and the lowest discrimination in the Pharmos trial (AUCs between 0.65 and 0.71). Although differences in predictor effects between development and validation populations were found (calibration slopes varying between 0.58 and 1.53), the differences in discrimination were largely explained by differences in case-mix in the validation studies. Calibration was good: the fraction of observed outcomes generally agreed well with the mean predicted outcome. No meaningful differences were noted in performance between the IMPACT and CRASH models. More complex models discriminated slightly better than simpler variants. Conclusions Since both the IMPACT and the CRASH prognostic models show good generalizability to more recent data, they are valid instruments to quantify prognosis in TBI. PMID:22511138
Classification based upon gene expression data: bias and precision of error rates.
Wood, Ian A; Visscher, Peter M; Mengersen, Kerrie L
2007-06-01
Gene expression data offer a large number of potentially useful predictors for the classification of tissue samples into classes, such as diseased and non-diseased. The predictive error rate of classifiers can be estimated using methods such as cross-validation. We have investigated issues of interpretation and potential bias in the reporting of error rate estimates. The issues considered here are optimization and selection biases, sampling effects, measures of misclassification rate, baseline error rates, two-level external cross-validation and a novel proposal for detection of bias using the permutation mean. Reporting an optimal estimated error rate incurs an optimization bias. Downward bias of 3-5% was found in an existing study of classification based on gene expression data and may be endemic in similar studies. Using a simulated non-informative dataset and two example datasets from existing studies, we show how bias can be detected through the use of label permutations and avoided using two-level external cross-validation. Some studies avoid optimization bias by using single-level cross-validation and a test set, but error rates can be more accurately estimated via two-level cross-validation. In addition to estimating the simple overall error rate, we recommend reporting class error rates plus where possible the conditional risk incorporating prior class probabilities and a misclassification cost matrix. We also describe baseline error rates derived from three trivial classifiers which ignore the predictors. R code which implements two-level external cross-validation with the PAMR package, experiment code, dataset details and additional figures are freely available for non-commercial use from http://www.maths.qut.edu.au/profiles/wood/permr.jsp
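The abstract's own implementation is in R with the PAMR package. As a language-neutral illustration of the index bookkeeping behind two-level (nested) cross-validation, here is a minimal Python sketch. The function names are hypothetical and no classifier is fitted; the point is only that tuning happens on inner splits drawn entirely from the outer training set, so the outer test fold never influences parameter selection.

```python
def k_folds(indices, k):
    """Split a list of indices into k interleaved folds (no shuffling, for clarity)."""
    return [indices[i::k] for i in range(k)]

def two_level_cv(n_samples, outer_k, inner_k):
    """Yield (inner_splits, outer_train, outer_test) for each outer fold.

    inner_splits is a list of (inner_train, inner_validation) index lists,
    all drawn from outer_train only.
    """
    all_idx = list(range(n_samples))
    for outer_test in k_folds(all_idx, outer_k):
        held_out = set(outer_test)
        outer_train = [i for i in all_idx if i not in held_out]
        inner_splits = []
        for inner_val in k_folds(outer_train, inner_k):
            val_set = set(inner_val)
            inner_train = [i for i in outer_train if i not in val_set]
            inner_splits.append((inner_train, inner_val))
        yield inner_splits, outer_train, outer_test

# Example: 12 samples, 3 outer folds, 2 inner folds.
splits = list(two_level_cv(12, outer_k=3, inner_k=2))
```

In use, a tuning parameter would be chosen by averaging error over `inner_splits`, the model refitted on `outer_train` with that parameter, and the error measured once on `outer_test`; averaging the outer errors gives an estimate free of the optimization bias discussed above.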
Multi-Informant Assessment of Temperament in Children with Externalizing Behavior Problems
ERIC Educational Resources Information Center
Copeland, William; Landry, Kerry; Stanger, Catherine; Hudziak, James J.
2004-01-01
We examined the criterion validity of parent and self-report versions of the Junior Temperament and Character Inventory (JTCI) in children with high levels of externalizing problems. The sample included 412 children (206 participants and 206 siblings) participating in a family study of attention and aggressive behavior problems. Criterion validity…
Development and validation of the goal content for exercise questionnaire.
Sebire, Simon J; Standage, Martyn; Vansteenkiste, Maarten
2008-08-01
Self-determination theory (SDT; Deci & Ryan, 2000) proposes that intrinsic, relative to extrinsic, goal content is a critical predictor of the quality of an individual's behavior and psychological well-being. Through three studies, we developed and psychometrically tested a measure of intrinsic and extrinsic goal content in the exercise context: the Goal Content for Exercise Questionnaire (GCEQ). In adults, exploratory (N = 354; Study 1) and confirmatory factor analyses (N = 312; Study 2) supported a 20-item solution consisting of 5 lower order factors (i.e., social affiliation, health management, skill development, image, and social recognition) that could be subsumed within a 2-factor higher order structure (i.e., intrinsic and extrinsic). Evidence for external validity, temporal stability, gender invariance, and internal consistency of the GCEQ was found. An independent sample (N = 475; Study 3) provided further support for the lower order structure of the GCEQ and some support for the higher order structure. The GCEQ was supported as a measure of exercise-based goal content, which may help understand how intrinsic and extrinsic goals can motivate exercise behavior.
Lunke, Katrin; Meier, Beat
2016-01-01
The goal of the present study was to take a new look at the relationship between creativity and cognitive functioning. Based on models that have postulated domain- and sub-domain-structures for different forms of creativity, like scientific, technical or artistic creativity with cognitive functions as important basis, we developed a new questionnaire. The Artistic Creativity Domains Compendium (ACDC) assesses interest, ability and performance in a distinct way for different domains of artistic creativity. We present the data of 270 adults tested with the ACDC, standard tests of divergent and convergent thinking, and tests of cognitive functions. We present fine-grained analyses on the internal and external validity of the ACDC and on the relationships between creativity, working memory, attention, and intelligence. Our results indicate domain-specific associations between creativity and attention as well as working memory. We conclude that the ACDC is a valid instrument to assess artistic creativity and that a fine-grained analysis reveals distinct patterns of relationships between separate domains of creativity and cognition. PMID:27516745
Link-Gelles, Ruth; Westreich, Daniel; Aiello, Allison E; Shang, Nong; Weber, David J; Rosen, Jennifer B; Motala, Tasneem; Mascola, Laurene; Eason, Jeffery; Scherzinger, Karen; Holtzman, Corinne; Reingold, Arthur L; Barnes, Meghan; Petit, Susan; Farley, Monica M; Harrison, Lee H; Zansky, Shelley; Thomas, Ann; Schaffner, William; McGee, Lesley; Whitney, Cynthia G; Moore, Matthew R
2017-01-01
Objectives External validity, or generalisability, is the measure of how well results from a study pertain to individuals in the target population. We assessed generalisability, with respect to socioeconomic status, of estimates from a matched case–control study of 13-valent pneumococcal conjugate vaccine effectiveness for the prevention of invasive pneumococcal disease in children in the USA. Design Matched case–control study. Setting Thirteen active surveillance sites for invasive pneumococcal disease in the USA. Participants Cases were identified from active surveillance and controls were age and zip code matched. Outcome measures Socioeconomic status was assessed at the individual level via parent interview (for enrolled individuals only) and birth certificate data (for both enrolled and unenrolled individuals) and at the neighbourhood level by geocoding to the census tract (for both enrolled and unenrolled individuals). Prediction models were used to determine if socioeconomic status was associated with enrolment. Results We enrolled 54.6% of 1211 eligible cases and found a trend toward enrolled cases being more affluent than unenrolled cases. Enrolled cases were slightly more likely to have private insurance at birth (p=0.08) and have mothers with at least some college education (p<0.01). Enrolled cases also tended to come from more affluent census tracts. Despite these differences, our best predictive model for enrolment yielded a concordance statistic of only 0.703, indicating mediocre predictive value. Variables retained in the final model were assessed for effect measure modification, and none were found to be significant modifiers of vaccine effectiveness. Conclusions We conclude that although enrolled cases are somewhat more affluent than unenrolled cases, our estimates are externally valid with respect to socioeconomic status. 
Our analysis provides evidence that this study design can yield valid estimates and that assessing the generalisability of observational data is feasible, even when unenrolled individuals cannot be contacted. PMID:28851801
Validation of the German Version of the Social Functioning Scale (SFS) for schizophrenia.
Iffland, Jona R; Lockhofen, Denise; Gruppe, Harald; Gallhofer, Bernd; Sammer, Gebhard; Hanewald, Bernd
2015-01-01
Deficits in social functioning are a core symptom of schizophrenia and an important criterion for evaluating the success of treatment. However, there is little agreement regarding its measurement. A common, often cited instrument for assessing self-reported social functioning is the Social Functioning Scale (SFS). The study aimed to investigate the reliability and validity of the German translation. 101 patients suffering from schizophrenia (SZ) and 101 matched controls (C) (60 male / 41 female, 35.8 years in both groups) completed the German version. In addition, demographic, clinical, and functional data were collected. Internal consistency was investigated by calculating Cronbach's alpha for the SFS full scale (α: .81) and all subscales (α: .59-.88). Significant bivariate correlation coefficients were found between all subscales as well as between all subscales and the full scale (p <.01). For the total sample, principal component analysis gave evidence to prefer a single-factor solution (eigenvalue ≥ 1) accounting for 48.5 % of the variance. For the subsamples, a two-component solution (SZ; 57.0 %) and a three-component solution (C; 65.6 %) fitted best, respectively. For SZ and C, significant associations were found between SFS and external criteria. The main factor "group" emerged as being significant. C showed higher values on both subscales and full scale. The sensitivity of the SFS was examined using discriminant analysis. 86.5% of the participants could be categorized correctly to their actual group. The German translation of the SFS turned out to be a reliable and valid questionnaire comparable to the original English version. This is in line with Spanish and Norwegian translations of the SFS. In conclusion, the German version of the SFS is well suited to become a useful and practicable instrument for the assessment of social functioning in both clinical practice and research. It complements commonly used external assessment scales.
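Cronbach's alpha, the internal-consistency measure reported in this abstract, can be computed from item-level scores with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below uses made-up scores for five respondents on three items; it is not the study's data, and the function name is chosen for illustration.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one row per respondent, each row holding k item scores."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))                 # one tuple per item
    sum_item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_var / total_var)

# Hypothetical scores: 5 respondents x 3 items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(scores)
```

Sample variances (n - 1 denominator) are used for both the items and the totals; the ratio is unchanged if population variances are used consistently instead.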
O'Mahony, Constantinos; Jichi, Fatima; Ommen, Steve R; Christiaans, Imke; Arbustini, Eloisa; Garcia-Pavia, Pablo; Cecchi, Franco; Olivotto, Iacopo; Kitaoka, Hiroaki; Gotsman, Israel; Carr-White, Gerald; Mogensen, Jens; Antoniades, Loizos; Mohiddin, Saidi A; Maurer, Mathew S; Tang, Hak Chiaw; Geske, Jeffrey B; Siontis, Konstantinos C; Mahmoud, Karim D; Vermeer, Alexa; Wilde, Arthur; Favalli, Valentina; Guttmann, Oliver P; Gallego-Delgado, Maria; Dominguez, Fernando; Tanini, Ilaria; Kubo, Toru; Keren, Andre; Bueser, Teofila; Waters, Sarah; Issa, Issa F; Malcolmson, James; Burns, Tom; Sekhri, Neha; Hoeger, Christopher W; Omar, Rumana Z; Elliott, Perry M
2018-03-06
Identification of people with hypertrophic cardiomyopathy (HCM) who are at risk of sudden cardiac death (SCD) and require a prophylactic implantable cardioverter defibrillator is challenging. In 2014, the European Society of Cardiology proposed a new risk stratification method based on a risk prediction model (HCM Risk-SCD) that estimates the 5-year risk of SCD. The aim was to externally validate the 2014 European Society of Cardiology recommendations in a geographically diverse cohort of patients recruited from the United States, Europe, the Middle East, and Asia. This was an observational, retrospective, longitudinal cohort study. The cohort consisted of 3703 patients. Seventy three (2%) patients reached the SCD end point within 5 years of follow-up (5-year incidence, 2.4% [95% confidence interval {CI}, 1.9-3.0]). The validation study revealed a calibration slope of 1.02 (95% CI, 0.93-1.12), C-index of 0.70 (95% CI, 0.68-0.72), and D-statistic of 1.17 (95% CI, 1.05-1.29). In a complete case analysis (n= 2147; 44 SCD end points at 5 years), patients with a predicted 5-year risk of <4% (n=1524; 71%) had an observed 5-year SCD incidence of 1.4% (95% CI, 0.8-2.2); patients with a predicted risk of ≥6% (n=297; 14%) had an observed SCD incidence of 8.9% (95% CI, 5.96-13.1) at 5 years. For every 13 (297/23) implantable cardioverter defibrillator implantations in patients with an estimated 5-year SCD risk ≥6%, 1 patient can potentially be saved from SCD. This study confirms that the HCM Risk-SCD model provides accurate prognostic information that can be used to target implantable cardioverter defibrillator therapy in patients at the highest risk of SCD. © 2017 American Heart Association, Inc.
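The "13 (297/23)" figure in this abstract is a simple ratio, reproduced below. Reading the denominator 23 as the number of SCD end points in the group with predicted risk of at least 6% is an assumption; the abstract gives the ratio but does not label the 23 explicitly.

```python
high_risk_patients = 297   # patients with predicted 5-year SCD risk >= 6% (from the abstract)
scd_end_points = 23        # denominator quoted in the abstract; interpretation assumed

implants_per_patient_saved = high_risk_patients / scd_end_points
print(round(implants_per_patient_saved, 2))  # 12.91
print(round(implants_per_patient_saved))     # 13
```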
Collective Evidence for Inverse Compton Emission from External Photons in High-Power Blazars
NASA Technical Reports Server (NTRS)
Meyer, Eileen T.; Fossati, Giovanni; Georganopoulos, Markos; Lister, Matthew L.
2012-01-01
We present the first collective evidence that Fermi-detected jets of high kinetic power (L(sub kin)) are dominated by inverse Compton emission from upscattered external photons. Using a sample with a broad range in orientation angle, including radio galaxies and blazars, we find that very high power sources (L(sub kin) > 10(exp 45.5) erg/s) show a significant increase in the ratio of inverse Compton to synchrotron power (Compton dominance) with decreasing orientation angle, as measured by the radio core dominance and confirmed by the distribution of superluminal speeds. This increase is consistent with beaming expectations for external Compton (EC) emission, but not for synchrotron self Compton (SSC) emission. For the lowest power jets (L(sub kin) < 10(exp 43.5) erg /s), no trend between Compton and radio core dominance is found, consistent with SSC. Importantly, the EC trend is not seen for moderately high power flat spectrum radio quasars with strong external photon fields. Coupled with the evidence that jet power is linked to the jet speed, this finding suggests that external photon fields become the dominant source of seed photons in the jet comoving frame only for the faster and therefore more powerful jets.
Propagation of Data Dependency through Distributed Cooperating Processes
1988-09-01
The External Data Dependency Analyzer (EDDA) ... The new EPL... EDDA Patch Files for the Dining Philosophers Example [Figure 23] ... Limitations... dependencies is evident. The External Data Dependency Analyzer (EDDA): The EDDA derives external data dependencies by performing two levels of analysis
Chen, Jing; Wang, Shu-Mei; Meng, Jiang; Sun, Fei; Liang, Sheng-Wang
2013-05-01
To establish a new method for quality evaluation and validate its feasibility by simultaneous quantitative assay of five alkaloids in Sophora flavescens. The new quality evaluation method, quantitative analysis of multi-components by single marker (QAMS), was established and validated with S. flavescens. Five main alkaloids, oxymatrine, sophocarpine, matrine, oxysophocarpine and sophoridine, were selected as analytes to evaluate the quality of the rhizome of S. flavescens, and the relative correction factor showed good repeatability. Their contents in 21 batches of samples, collected from different areas, were determined by both the external standard method and QAMS. The method was evaluated by comparing the quantitative results of the external standard method and QAMS. No significant differences were found in the quantitative results of the five alkaloids in 21 batches of S. flavescens determined by the external standard method and QAMS. It is feasible and suitable to evaluate the quality of the rhizome of S. flavescens by QAMS.
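The abstract does not spell out how the relative correction factor is used. A common formulation of QAMS, assumed here, defines the factor for analyte k against the single marker s as f = (A_s / C_s) / (A_k / C_k), where A is peak area and C is concentration, and then recovers C_k in a sample from the marker's measured concentration. The sketch below follows that assumed definition; the function names and numbers are hypothetical.

```python
def relative_correction_factor(area_marker, conc_marker, area_analyte, conc_analyte):
    """f = (A_s / C_s) / (A_k / C_k), determined once from reference standards."""
    return (area_marker / conc_marker) / (area_analyte / conc_analyte)

def qams_concentration(f, area_marker, conc_marker, area_analyte):
    """Analyte concentration in a sample, using only the marker's standard:
    C_k = f * C_s * A_k / A_s."""
    return f * conc_marker * area_analyte / area_marker

# Calibration with reference standards (made-up peak areas and concentrations).
f = relative_correction_factor(area_marker=1000.0, conc_marker=10.0,
                               area_analyte=600.0, conc_analyte=8.0)

# In a sample where the marker is quantified directly, the analyte follows from f.
estimated = qams_concentration(f, area_marker=1000.0, conc_marker=10.0,
                               area_analyte=600.0)
```

Under this definition only the marker compound needs a reference standard at analysis time, which is the practical advantage over the external standard method that the abstract compares against.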
Burris, Silas E.; Brown, Danielle D.
2014-01-01
Narratives, also called stories, can be found in conversations, children's play interactions, reading material, and television programs. From infancy to adulthood, narrative comprehension processes interpret events and inform our understanding of physical and social environments. These processes have been extensively studied to ascertain the multifaceted nature of narrative comprehension. From this research we know that three overlapping processes (i.e., knowledge integration, goal structure understanding, and causal inference generation) proposed by the constructionist paradigm are necessary for narrative comprehension, that narrative comprehension has a predictive relationship with children's later reading performance, and that comprehension processes are generalizable to other contexts. Much of the previous research has emphasized internal and predictive validity, thus limiting the generalizability of previous findings. We are concerned these limitations may be excluding underrepresented populations from benefits and implications identified by early comprehension processes research. This review identifies gaps in extant literature regarding external validity and argues for increased emphasis on externally valid research. We highlight limited research on narrative comprehension processes in children from low-income and minority populations, and argue for changes in comprehension assessments. Specifically, we argue both on- and off-line assessments should be used across various narrative types (e.g., picture books, televised narratives) with traditionally underserved and underrepresented populations. We propose that increasing the generalizability of narrative comprehension processes research can inform persistent reading achievement gaps, and have practical implications for how children learn from narratives. PMID:24659973
Afshar, Majid; Press, Valerie G; Robison, Rachel G; Kho, Abel N; Bandi, Sindhura; Biswas, Ashvini; Avila, Pedro C; Kumar, Harsha Vardhan Madan; Yu, Byung; Naureckas, Edward T; Nyenhuis, Sharmilee M; Codispoti, Christopher D
2017-10-13
Comprehensive, rapid, and accurate identification of patients with asthma for clinical care and engagement in research efforts is needed. The original development and validation of a computable phenotype for asthma case identification occurred at a single institution in Chicago and demonstrated excellent test characteristics. However, its application in a diverse payer mix, across different health systems and multiple electronic health record vendors, and in both children and adults was not examined. The objective of this study is to externally validate the computable phenotype across diverse Chicago institutions to accurately identify pediatric and adult patients with asthma. A cohort of 900 asthma and control patients was identified from the electronic health record between January 1, 2012 and November 30, 2014. Two physicians at each site independently reviewed the patient chart to annotate cases. The inter-observer reliability between the physician reviewers had a κ-coefficient of 0.95 (95% CI 0.93-0.97). The accuracy, sensitivity, specificity, negative predictive value, and positive predictive value of the computable phenotype were all above 94% in the full cohort. The excellent positive and negative predictive values in this multi-center external validation study establish a useful tool to identify asthma cases in the electronic health record for research and care. This computable phenotype could be used in large-scale comparative-effectiveness trials.
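The test characteristics and the κ-coefficient named in this abstract follow from 2x2 tables. A minimal sketch is given below; the counts are invented for illustration and are not the study's data, and the function names are chosen for the example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard test characteristics from a 2x2 table of phenotype result
    (positive/negative) against chart-review status (case/control)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

def cohen_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa for two reviewers making a yes/no call on each chart."""
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)   # chance agreement
    return (observed - expected) / (1 - expected)

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=90, fp=5, fn=10, tn=95)
kappa = cohen_kappa(both_yes=45, a_only=5, b_only=5, both_no=45)
```

Kappa corrects raw agreement for the agreement expected by chance, so two reviewers who agree on 90% of charts with balanced yes/no rates obtain a kappa of 0.8 rather than 0.9.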
Self-Other Knowledge Asymmetries in Personality Pathology
Carlson, Erika N.; Vazire, Simine; Oltmanns, Thomas F.
2012-01-01
Objective Self-reports of personality provide valid information about personality disorders (PDs). However, informant-reports provide information about PDs that self-reports alone do not provide. The current paper examines if and when one perspective is more valid than the other in identifying PDs. Method Using a representative sample of adults 55 to 65 years of age (N = 991; 45% males), we compared the validity of self- and informant- (e.g., spouse, family, or friend) reports of the FFM traits in predicting PD scores (i.e., composite of interviewer, self-, and informant-reports of PDs). Results Self-reports (particularly of neuroticism) were more valid than informant-reports for most internalizing PDs (i.e., PDs defined by high neuroticism). Informant-reports (particularly of agreeableness and conscientiousness) were more valid than self-reports for externalizing and/or antagonistic PDs (i.e., PDs defined by low agreeableness, conscientiousness). Neither report was consistently more valid for thought disorder PDs (i.e., PDs defined by low extraversion). However, informant-reports (particularly of agreeableness) were more valid than self-reports for PDs that were both internalizing and externalizing (i.e., PDs defined by high neuroticism and low agreeableness). Conclusions The intrapersonal and interpersonal manifestations of PDs differ, and these differences influence who knows more about pathology. PMID:22583054
Evaluation of the ADHD rating scale in youth with autism
Yerys, Benjamin E.; Nissley-Tsiopinis, Jenelle; de Marchena, Ashley; Watkins, Marley W.; Antezana, Ligia; Power, Thomas J.; Schultz, Robert T.
2016-01-01
Scientists and clinicians regularly use clinical screening tools for attention deficit/hyperactivity disorder (ADHD) to assess comorbidity without empirical evidence that these measures are valid in youth with autism spectrum disorder (ASD). We examined the prevalence of youth meeting ADHD criteria on the ADHD rating scale fourth edition (ADHD-RS-IV), the relationship of ADHD-RS-IV ratings with participant characteristics and behaviors, and its underlying factor structure in 386 7-17 year olds with ASD without intellectual disability. Expected parent prevalence rates, relationships with age and externalizing behaviors were observed, but confirmatory factor analyses revealed unsatisfactory fits for one-, two-, three-factor models. Exploratory analyses revealed several items cross-loading on multiple factors. Implications of screening ADHD in youth with ASD using current diagnostic criteria are discussed. PMID:27738853
Nelson, Lindsay D.; Patrick, Christopher J.; Bernat, Edward M.
2010-01-01
The externalizing dimension is viewed as a broad dispositional factor underlying risk for numerous disinhibitory disorders. Prior work has documented deficits in event-related brain potential (ERP) responses in individuals prone to externalizing problems. Here, we constructed a direct physiological index of externalizing vulnerability from three ERP indicators and evaluated its validity in relation to criterion measures in two distinct domains: psychometric and physiological. The index was derived from three ERP measures that covaried in their relations with externalizing proneness: the error-related negativity and two variants of the P3. Scores on this ERP composite predicted psychometric criterion variables and accounted for externalizing-related variance in P3 response from a separate task. These findings illustrate how a diagnostic construct can be operationalized as a composite (multivariate) psychophysiological variable (phenotype). PMID:20573054
Dakanalis, Antonios; Bartoli, Francesco; Caslini, Manuela; Crocamo, Cristina; Zanetti, Maria Assunta; Riva, Giuseppe; Clerici, Massimo; Carrà, Giuseppe
2017-12-01
A new "severity specifier" for bulimia nervosa (BN), based on the frequency of inappropriate weight compensatory behaviours (IWCBs), was added to the DSM-5 as a means of documenting heterogeneity and variability in the severity of the disorder. Yet, evidence for its validity in clinical populations, including prognostic significance for treatment outcome, is currently lacking. Existing data from 281 treatment-seeking patients with DSM-5 BN, who received the best available treatment for their disorder (manual-based cognitive behavioural therapy; CBT) in an outpatient setting, were re-analysed to examine whether these patients subgrouped based on the DSM-5 severity levels would show meaningful and consistent differences on (a) a range of clinical variables assessed at pre-treatment and (b) post-treatment abstinence from IWCBs. Results highlight that the mild, moderate, severe, and extreme severity groups were statistically distinguishable on 22 variables assessed at pre-treatment regarding eating disorder pathological features, maintenance factors of BN, associated (current) and lifetime psychopathology, social maladjustment and illness-specific functional impairment, and abstinence outcome. Mood intolerance, a maintenance factor of BN but external to eating disorder pathological features (typically addressed within CBT), emerged as the primary clinical variable distinguishing the severity groups showing a differential treatment response. Overall, the findings speak to the concurrent and predictive validity of the new DSM-5 severity criterion for BN and are important because a common benchmark informing patients, clinicians, and researchers about severity of the disorder and allowing severity fluctuation and patient's progress to be tracked does not exist so far. Implications for future research are outlined.
The Practice and Products of Communication Inquiry and Education.
ERIC Educational Resources Information Center
Warren, Clay
1982-01-01
The ability to communicate effectively is fundamental to communication education. For internal validity, communication educators need to concentrate on knowledge-building (competence) and skills training (performance). For external validity, the speech communication discipline must establish a common understanding of its work and send clear…
Berry, Anne S.; Demeter, Elise; Sabhapathy, Surya; English, Brett A.; Blakely, Randy D.; Sarter, Martin; Lustig, Cindy
2015-01-01
Both the passage of time and external distraction make it difficult to keep attention on the task at hand. We tested the hypothesis that time-on-task and external distraction pose independent challenges to attention, and that the brain’s cholinergic system selectively modulates our ability to resist distraction. Participants with a polymorphism limiting cholinergic capacity (Ile89Val variant (rs1013940) of the choline transporter gene SLC5A7) and matched controls completed self-report measures of attention and a laboratory task that measured decrements in sustained attention with and without distraction. We found evidence that distraction and time-on-task effects are independent and that the cholinergic system is strongly linked to greater vulnerability to distraction. Ile89Val participants reported more distraction during everyday life than controls, and their task performance was more severely impacted by the presence of an ecologically valid video distractor (similar to a television playing in the background). These results are the first to demonstrate a specific impairment in cognitive control associated with the Ile89Val polymorphism, and add to behavioral and cognitive neuroscience studies indicating the cholinergic system’s critical role in overcoming distraction. PMID:24666128
Development and validation of a piloted simulation of a helicopter and external sling load
NASA Technical Reports Server (NTRS)
Shaughnessy, J. D.; Deaux, T. N.; Yenni, K. R.
1979-01-01
A generalized, real-time, piloted, visual simulation of a single-rotor helicopter, suspension system, and external load is described and validated for the full flight envelope of the U.S. Army CH-54 helicopter and cargo container as an example. The mathematical model described uses modified nonlinear classical rotor theory for both the main rotor and tail rotor, nonlinear fuselage aerodynamics, an elastic suspension system, nonlinear load aerodynamics, and a load-ground contact model. The implementation of the mathematical model on a large digital computing system is described, and validation of the simulation is discussed. The mathematical model is validated by comparing measured flight data with simulated data, by comparing linearized system matrices, eigenvalues, and eigenvectors with manufacturers' data, and by the subjective comparison of handling characteristics by experienced pilots. A visual landing display system for use in simulation, which generates the pilot's forward-looking real-world display, was examined, and a special head-up, down-looking load/landing zone display is described.
Chambers, David W
2010-01-01
Both panegyric and criticism of evidence-based dentistry tend to be clumsy because the concept is poorly defined. This analysis identifies several contributions to the profession that have been made under the EBD banner. Although the concept of clinicians integrating clinical epidemiology, the wisdom of their practices, and patients' values is powerful, its implementation has been distorted by too heavy an emphasis on computerized searches for research findings that meet the standards of academics. Although EBD advocates enjoy sharing anecdotal accounts of mistakes others have made, faulting others is not proof that one's own position is correct. There is no systematic, high-quality evidence that EBD is effective. The metaphor of a three-legged stool (evidence, experience, values, and integration) is used as an organizing principle. "Best evidence" has become a preoccupation among EBD enthusiasts. That overlong but thinly developed leg of the stool is critiqued from the perspectives of the criteria for evidence, the difference between internal and external validity, the relationship between evidence and decision making, the ambiguous meaning of "best," and the role of reasonable doubt. The strongest leg of the stool is clinical experience. Although bias exists in all observations (including searches for evidence), there are simple procedures that can be employed in practice to increase useful and objective evidence there, and there are dangers in delegating policy regarding allowable treatments to external groups. Patient and practitioner values are the shortest leg of the stool. As they are so little recognized, their integration in EBD is problematic and ethical tensions exist where paternalism privileges science over patients' self-determined best interests. Four potential approaches to integration are suggested, recognizing that there is virtually no literature on how the "seat" of the three-legged stool works or should work. 
It is likely that most dentists choose to wait for collective professional standards to reveal acceptable practice or follow a strategy of punctuated equilibrium, only switching out established practice habits when very conspicuous advantages are identified. Integration in medicine appears to follow the statistically sophisticated practice of updating estimates of clinical parameters (probabilities) for diagnoses, treatments, prognoses, and side-effects. This approach is likely beyond the skill or interest of clinical dentists and it fails to incorporate values in the integration. The use of decision trees to integrate both research and experiential parameters and values is illustrated and it is shown that such a technique identifies why there are very few cases in dentistry where evidence needs to be consulted and indicates what such cases are.
Validity evidence as a key marker of quality of technical skill assessment in OTL-HNS.
Labbé, Mathilde; Young, Meredith; Nguyen, Lily H P
2018-01-13
Quality monitoring of assessment practices should be a priority in all residency programs. Validity evidence is one of the main hallmarks of assessment quality and should be collected to support the interpretation and use of assessment data. Our objective was to identify, synthesize, and present the validity evidence reported supporting different technical skill assessment tools in otolaryngology-head and neck surgery (OTL-HNS). We performed a secondary analysis of data generated through a systematic review of all published tools for assessing technical skills in OTL-HNS (n = 16). For each tool, we coded validity evidence according to the five types of evidence described by the American Educational Research Association's interpretation of Messick's validity framework. Descriptive statistical analyses were conducted. All 16 tools included in our analysis were supported by internal structure and relationship to variables validity evidence. Eleven articles presented evidence supporting content. Response process was discussed only in one article, and no study reported on evidence exploring consequences. We present the validity evidence reported for 16 rater-based tools that could be used for work-based assessment of OTL-HNS residents in the operating room. The articles included in our review were consistently deficient in evidence for response process and consequences. Rater-based assessment tools that support high-stakes decisions that impact the learner and programs should include several sources of validity evidence. Thus, use of any assessment should be done with careful consideration of the context-specific validity evidence supporting score interpretation, and we encourage deliberate continual assessment quality-monitoring. NA. Laryngoscope, 2018. © 2018 The American Laryngological, Rhinological and Otological Society, Inc.
Zendejas, Benjamin; Ruparel, Raaj K; Cook, David A
2016-02-01
The Fundamentals of Laparoscopic Surgery (FLS) program uses five simulation stations (peg transfer, precision cutting, loop ligation, and suturing with extracorporeal and intracorporeal knot tying) to teach and assess laparoscopic surgery skills. We sought to summarize evidence regarding the validity of scores from the FLS assessment. We systematically searched for studies evaluating the FLS as an assessment tool (last search update February 26, 2013). We classified validity evidence using the currently standard validity framework (content, response process, internal structure, relations with other variables, and consequences). From a pool of 11,628 studies, we identified 23 studies reporting validity evidence for FLS scores. Studies involved residents (n = 19), practicing physicians (n = 17), and medical students (n = 8), in specialties of general (n = 17), gynecologic (n = 4), urologic (n = 1), and veterinary (n = 1) surgery. Evidence was most common in the form of relations with other variables (n = 22, most often expert-novice differences). Only three studies reported internal structure evidence (inter-rater or inter-station reliability), two studies reported content evidence (i.e., derivation of assessment elements), and three studies reported consequences evidence (definition of pass/fail thresholds). Evidence nearly always supported the validity of FLS total scores. However, the loop ligation task lacks discriminatory ability. Validity evidence confirms expected relations with other variables and acceptable inter-rater reliability, but other validity evidence is sparse. Given the high-stakes use of this assessment (required for board eligibility), we suggest that more validity evidence is required, especially to support its content (selection of tasks and scoring rubric) and the consequences (favorable and unfavorable impact) of assessment.
Wei, Feng; Hunley, Stanley C; Powell, John W; Haut, Roger C
2011-02-01
Recent studies, using two different manners of foot constraint, potted and taped, document altered failure characteristics in the human cadaver ankle under controlled external rotation of the foot. The posterior talofibular ligament (PTaFL) was commonly injured when the foot was constrained in potting material, while the frequency of deltoid ligament injury was higher for the taped foot. In this study an existing multibody computational modeling approach was validated to include the influence of foot constraint, determine the kinematics of the joint under external foot rotation, and consequently obtain strains in various ligaments. It was hypothesized that the location of ankle injury due to excessive levels of external foot rotation is a function of foot constraint. The results from this model simulation supported this hypothesis and helped to explain the mechanisms of injury in the cadaver experiments. An excessive external foot rotation might generate a PTaFL injury for a rigid foot constraint, and an anterior deltoid ligament injury for a pliant foot constraint. The computational models may be further developed and modified to simulate the human response for different shoe designs, as well as on various athletic shoe-surface interfaces, so as to provide a computational basis for optimizing athletic performance with minimal injury risk.
ERIC Educational Resources Information Center
Sefcik, Lesley; Bedford, Simon; Czech, Peter; Smith, Judith; Yorke, Jonathan
2018-01-01
External referencing of assessment and students' achievement standards is a growing priority area within higher education, which is being pressured by government requirements to evidence outcome attainment. External referencing benefits stakeholders connected to higher education by helping to assure that assessments and standards within courses…
Pannu, Neesh; Hemmelgarn, Brenda R.; Austin, Peter C.; Tan, Zhi; McArthur, Eric; Manns, Braden J.; Tonelli, Marcello; Wald, Ron; Quinn, Robert R.; Ravani, Pietro; Garg, Amit X.
2017-01-01
Importance: Some patients will develop chronic kidney disease after a hospitalization with acute kidney injury; however, no risk-prediction tools have been developed to identify high-risk patients requiring follow-up. Objective: To derive and validate predictive models for progression of acute kidney injury to advanced chronic kidney disease. Design, Setting, and Participants: Data from 2 population-based cohorts of patients with a prehospitalization estimated glomerular filtration rate (eGFR) of more than 45 mL/min/1.73 m2 and who had survived hospitalization with acute kidney injury (defined by a serum creatinine increase during hospitalization > 0.3 mg/dL or > 50% of their prehospitalization baseline), were used to derive and validate multivariable prediction models. The risk models were derived from 9973 patients hospitalized in Alberta, Canada (April 2004-March 2014, with follow-up to March 2015). The risk models were externally validated with data from a cohort of 2761 patients hospitalized in Ontario, Canada (June 2004-March 2012, with follow-up to March 2013). Exposures: Demographic, laboratory, and comorbidity variables measured prior to discharge. Main Outcomes and Measures: Advanced chronic kidney disease was defined by a sustained reduction in eGFR less than 30 mL/min/1.73 m2 for at least 3 months during the year after discharge. All participants were followed up for up to 1 year. Results: The participants (mean [SD] age, 66 [15] years in the derivation and internal validation cohorts and 69 [11] years in the external validation cohort; 40%-43% women per cohort) had a mean (SD) baseline serum creatinine level of 1.0 (0.2) mg/dL and more than 20% had stage 2 or 3 acute kidney injury. Advanced chronic kidney disease developed in 408 (2.7%) of 9973 patients in the derivation cohort and 62 (2.2%) of 2761 patients in the external validation cohort. 
In the derivation cohort, 6 variables were independently associated with the outcome: older age, female sex, higher baseline serum creatinine value, albuminuria, greater severity of acute kidney injury, and higher serum creatinine value at discharge. In the external validation cohort, a multivariable model including these 6 variables had a C statistic of 0.81 (95% CI, 0.75-0.86) and improved discrimination and reclassification compared with reduced models that included age, sex, and discharge serum creatinine value alone (integrated discrimination improvement, 2.6%; 95% CI, 1.1%-4.0%; categorical net reclassification index, 13.5%; 95% CI, 1.9%-25.1%) or included age, sex, and acute kidney injury stage alone (integrated discrimination improvement, 8.0%; 95% CI, 5.1%-11.0%; categorical net reclassification index, 79.9%; 95% CI, 60.9%-98.9%). Conclusions and Relevance: A multivariable model using routine laboratory data was able to predict advanced chronic kidney disease following hospitalization with acute kidney injury. The utility of this model in clinical care requires further research. PMID:29136443
James, Matthew T; Pannu, Neesh; Hemmelgarn, Brenda R; Austin, Peter C; Tan, Zhi; McArthur, Eric; Manns, Braden J; Tonelli, Marcello; Wald, Ron; Quinn, Robert R; Ravani, Pietro; Garg, Amit X
2017-11-14
Some patients will develop chronic kidney disease after a hospitalization with acute kidney injury; however, no risk-prediction tools have been developed to identify high-risk patients requiring follow-up. To derive and validate predictive models for progression of acute kidney injury to advanced chronic kidney disease. Data from 2 population-based cohorts of patients with a prehospitalization estimated glomerular filtration rate (eGFR) of more than 45 mL/min/1.73 m2 and who had survived hospitalization with acute kidney injury (defined by a serum creatinine increase during hospitalization > 0.3 mg/dL or > 50% of their prehospitalization baseline), were used to derive and validate multivariable prediction models. The risk models were derived from 9973 patients hospitalized in Alberta, Canada (April 2004-March 2014, with follow-up to March 2015). The risk models were externally validated with data from a cohort of 2761 patients hospitalized in Ontario, Canada (June 2004-March 2012, with follow-up to March 2013). Demographic, laboratory, and comorbidity variables measured prior to discharge. Advanced chronic kidney disease was defined by a sustained reduction in eGFR less than 30 mL/min/1.73 m2 for at least 3 months during the year after discharge. All participants were followed up for up to 1 year. The participants (mean [SD] age, 66 [15] years in the derivation and internal validation cohorts and 69 [11] years in the external validation cohort; 40%-43% women per cohort) had a mean (SD) baseline serum creatinine level of 1.0 (0.2) mg/dL and more than 20% had stage 2 or 3 acute kidney injury. Advanced chronic kidney disease developed in 408 (2.7%) of 9973 patients in the derivation cohort and 62 (2.2%) of 2761 patients in the external validation cohort. 
In the derivation cohort, 6 variables were independently associated with the outcome: older age, female sex, higher baseline serum creatinine value, albuminuria, greater severity of acute kidney injury, and higher serum creatinine value at discharge. In the external validation cohort, a multivariable model including these 6 variables had a C statistic of 0.81 (95% CI, 0.75-0.86) and improved discrimination and reclassification compared with reduced models that included age, sex, and discharge serum creatinine value alone (integrated discrimination improvement, 2.6%; 95% CI, 1.1%-4.0%; categorical net reclassification index, 13.5%; 95% CI, 1.9%-25.1%) or included age, sex, and acute kidney injury stage alone (integrated discrimination improvement, 8.0%; 95% CI, 5.1%-11.0%; categorical net reclassification index, 79.9%; 95% CI, 60.9%-98.9%). A multivariable model using routine laboratory data was able to predict advanced chronic kidney disease following hospitalization with acute kidney injury. The utility of this model in clinical care requires further research.
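The C statistic quoted in this abstract can be read as the probability that a randomly chosen patient who developed the outcome was assigned a higher predicted risk than a randomly chosen patient who did not. The sketch below implements that pairwise definition directly. The function name and the toy risks and outcomes are illustrative assumptions; they are not the authors' code or data.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance (C) statistic for a binary outcome.

    Counts every (event, non-event) pair; a pair is concordant when the
    event case has the higher predicted risk. Ties count as half.
    """
    events = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: four patients, two of whom had the outcome.
toy_risks = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
print(c_statistic(toy_risks, toy_outcomes))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking. The pairwise loop is quadratic in sample size, which is fine for illustration but slow for cohorts of thousands.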
Visentin, G; McDermott, A; McParland, S; Berry, D P; Kenny, O A; Brodkorb, A; Fenelon, M A; De Marchi, M
2015-09-01
Rapid, cost-effective monitoring of milk technological traits is a significant challenge for dairy industries specialized in cheese manufacturing. The objective of the present study was to investigate the ability of mid-infrared spectroscopy to predict rennet coagulation time, curd-firming time, curd firmness at 30 and 60 min after rennet addition, heat coagulation time, casein micelle size, and pH in cow milk samples, and to quantify associations between these milk technological traits and conventional milk quality traits. Samples (n=713) were collected from 605 cows from multiple herds; the samples represented multiple breeds, stages of lactation, parities, and milking times. Reference analyses were undertaken in accordance with standardized methods, and mid-infrared spectra in the range of 900 to 5,000 cm⁻¹ were available for all samples. Prediction models were developed using partial least squares regression, and prediction accuracy was based on both cross and external validation. The proportion of variance explained by the prediction models in external validation was greatest for pH (71%), followed by rennet coagulation time (55%) and milk heat coagulation time (46%). Models to predict curd firmness 60 min from rennet addition and casein micelle size, however, were poor, explaining only 25 and 13%, respectively, of the total variance in each trait within external validation. On average, all prediction models tended to be unbiased. The linear regression coefficient of the reference value on the predicted value varied from 0.17 (casein micelle size regression model) to 0.83 (pH regression model) but all differed from 1. The ratio performance deviation, which ranged from 1.07 (casein micelle size prediction model) to 1.79 (pH prediction model), was <2 for all prediction models in the external validation, suggesting that none of the prediction models could be used for analytical purposes. 
With the exception of casein micelle size and curd firmness at 60 min after rennet addition, the developed prediction models may be useful as a screening method, because the concordance correlation coefficient ranged from 0.63 (heat coagulation time prediction model) to 0.84 (pH prediction model) in the external validation. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
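Two of the accuracy measures named in this abstract, the ratio performance deviation (RPD) and the concordance correlation coefficient, can be computed from paired reference and predicted values. The sketch below assumes one common convention: RPD as the standard deviation of the reference values divided by the root mean square error of prediction, and Lin's concordance correlation coefficient. The abstract does not state which exact variants the authors used, and the numbers in the usage lines are made up.

```python
import math
import statistics


def ratio_performance_deviation(reference, predicted):
    """RPD = SD of reference values / RMSE of prediction (assumed convention)."""
    n = len(reference)
    rmse = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)
    return statistics.stdev(reference) / rmse


def concordance_correlation(reference, predicted):
    """Lin's concordance correlation coefficient.

    Penalises both scatter around the fitted line and systematic shift
    away from the 45-degree line of perfect agreement.
    """
    n = len(reference)
    mean_r = statistics.mean(reference)
    mean_p = statistics.mean(predicted)
    var_r = statistics.pvariance(reference)
    var_p = statistics.pvariance(predicted)
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted)) / n
    return 2 * cov / (var_r + var_p + (mean_r - mean_p) ** 2)


# Hypothetical values: predictions track the reference but sit 0.5 units high.
ref = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.5, 2.5, 3.5, 4.5, 5.5]
print(ratio_performance_deviation(ref, pred))
print(concordance_correlation(ref, pred))
```

In this toy case the ordinary correlation would be exactly 1, but the concordance coefficient falls below 1 because of the constant offset, which is why it suits agreement checks between predicted and reference values.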
Manterola, Carlos; Torres, Rodrigo; Burgos, Luis; Vial, Manuel; Pineda, Viviana
2006-07-01
Surgery is a curative treatment for gastric cancer (GC). As relapse is frequent, adjuvant therapies such as postoperative chemoradiotherapy have been tried. In Chile, some hospitals adopted Macdonald's study as a protocol for the treatment of GC. The aim was to determine the methodological quality and the internal and external validity of the Macdonald study. Three instruments that assess methodological quality were applied. A critical appraisal was done with a reader's guide, and internal and external validity were analyzed with two scales: MINCIR (Methodology and Research in Surgery), valid for therapy studies, and CONSORT (Consolidated Standards of Reporting Trials), valid for randomized controlled trials (RCT). Guides and scales were applied by 5 researchers with training in clinical epidemiology. The reader's guide verified that the Macdonald study was not directed to answer a clearly defined question. There was random assignment, but the method used is not described, and not all patients were accounted for at the end of the study (36% of the group with surgery plus chemoradiotherapy did not complete treatment). The MINCIR scale confirmed a multicentric RCT, not blinded, with an unclear randomization sequence, erroneous sample size estimation, vague objectives and no exclusion criteria. The CONSORT system showed the lack of a working hypothesis and specific objectives, an absence of exclusion criteria and of identification of the primary variable, an imprecise estimation of sample size, ambiguities in the randomization process, no blinding, an absence of statistical adjustment and the omission of a subgroup analysis. The instruments applied demonstrated methodological shortcomings that compromise the internal and external validity of the Macdonald study.
Piccioli, Andrea; Spinelli, M Silvia; Forsberg, Jonathan A; Wedin, Rikard; Healey, John H; Ippolito, Vincenzo; Daolio, Primo Andrea; Ruggieri, Pietro; Maccauro, Giulio; Gasbarrini, Alessandro; Biagini, Roberto; Piana, Raimondo; Fazioli, Flavio; Luzzati, Alessandro; Di Martino, Alberto; Nicolosi, Francesco; Camnasio, Francesco; Rosa, Michele Attilio; Campanacci, Domenico Andrea; Denaro, Vincenzo; Capanna, Rodolfo
2015-05-22
We recently developed a clinical decision support tool capable of estimating the likelihood of survival at 3 and 12 months following surgery for patients with operable skeletal metastases. After making it publicly available on www.PATHFx.org, we attempted to externally validate it using independent, international data. We collected data from patients treated at 13 Italian orthopaedic oncology referral centers between 2010 and 2013, then applied them to PATHFx, which generated a probability of survival at 3 and 12 months for each patient. We assessed accuracy using the area under the receiver-operating characteristic curve (AUC) and clinical utility using Decision Curve Analysis (DCA), and compared the Italian patient data to the training set (United States) and first external validation set (Scandinavia). The Italian dataset contained 287 records with at least 12 months of follow-up information. The AUCs for the 3-month and 12-month estimates were 0.80 and 0.77, respectively. There were missing data, including the surgeon's estimate of survival, which was missing in the majority of records. Physiologically, Italian patients were similar to patients in the training and first validation sets. However, notable differences were observed in the proportion of those surviving 3 and 12 months, suggesting differences in referral patterns and perhaps indications for surgery. PATHFx was successfully validated in an Italian dataset containing missing data. This study demonstrates its broad applicability to European patients, even in centers with differing treatment philosophies from those previously studied.
AXIN2 expression predicts prostate cancer recurrence and regulates invasion and tumor growth.
Hu, Brian R; Fairey, Adrian S; Madhav, Anisha; Yang, Dongyun; Li, Meng; Groshen, Susan; Stephens, Craig; Kim, Philip H; Virk, Navneet; Wang, Lina; Martin, Sue Ellen; Erho, Nicholas; Davicioni, Elai; Jenkins, Robert B; Den, Robert B; Xu, Tong; Xu, Yucheng; Gill, Inderbir S; Quinn, David I; Goldkorn, Amir
2016-05-01
Treatment of prostate cancer (PCa) may be improved by identifying biological mechanisms of tumor growth that directly impact clinical disease progression. We investigated whether genes associated with a highly tumorigenic, drug resistant, progenitor phenotype impact PCa biology and recurrence. Radical prostatectomy (RP) specimens (±disease recurrence, N = 276) were analyzed by qRT-PCR to quantify expression of genes associated with self-renewal, drug resistance, and tumorigenicity in prior studies. Associations between gene expression and PCa recurrence were confirmed by bootstrap internal validation and by external validation in independent cohorts (total N = 675) and in silico. siRNA knockdown and lentiviral overexpression were used to determine the effect of gene expression on PCa invasion, proliferation, and tumor growth. Four candidate genes were differentially expressed in PCa recurrence. Of these, low AXIN2 expression was internally validated in the discovery cohort. Validation in external cohorts and in silico demonstrated that low AXIN2 was independently associated with more aggressive PCa, biochemical recurrence, and metastasis-free survival after RP. Functionally, siRNA-mediated depletion of AXIN2 significantly increased invasiveness, proliferation, and tumor growth. Conversely, ectopic overexpression of AXIN2 significantly reduced invasiveness, proliferation, and tumor growth. Low AXIN2 expression was associated with PCa recurrence after RP in our test population as well as in external validation cohorts, and its expression levels in PCa cells significantly impacted invasiveness, proliferation, and tumor growth. Given these novel roles, further study of AXIN2 in PCa may yield promising new predictive and therapeutic strategies. © 2016 Wiley Periodicals, Inc.
Modeling Liver-Related Adverse Effects of Drugs Using kNN QSAR Method
Rodgers, Amie D.; Zhu, Hao; Fourches, Dennis; Rusyn, Ivan; Tropsha, Alexander
2010-01-01
Adverse effects of drugs (AEDs) continue to be a major cause of drug withdrawals both in development and post-marketing. While liver-related AEDs are a major concern for drug safety, there are few in silico models for predicting human liver toxicity for drug candidates. We have applied the Quantitative Structure Activity Relationship (QSAR) approach to model liver AEDs. In this study, we aimed to construct a QSAR model capable of binary classification (active vs. inactive) of drugs for liver AEDs based on chemical structure. To build QSAR models, we have employed an FDA spontaneous reporting database of human liver AEDs (elevations in activity of serum liver enzymes), which contains data on approximately 500 approved drugs. Approximately 200 compounds with wide clinical data coverage, structural similarity and balanced (40/60) active/inactive ratio were selected for modeling and divided into multiple training/test and external validation sets. QSAR models were developed using the k nearest neighbor method and validated using external datasets. Models with high sensitivity (>73%) and specificity (>94%) for prediction of liver AEDs in external validation sets were developed. To test applicability of the models, three chemical databases (World Drug Index, Prestwick Chemical Library, and Biowisdom Liver Intelligence Module) were screened in silico and the validity of predictions was determined, where possible, by comparing model-based classification with assertions in publicly available literature. Validated QSAR models of liver AEDs based on the data from the FDA spontaneous reporting system can be employed as sensitive and specific predictors of AEDs in pre-clinical screening of drug candidates for potential hepatotoxicity in humans. PMID:20192250
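The sensitivity and specificity figures quoted for the external validation sets follow from the counts of correct and incorrect binary calls. The sketch below shows that arithmetic for an active/inactive classifier. The function name and the example labels are hypothetical and do not reproduce the study's compounds, data, or kNN model.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels, 1 = active, 0 = inactive.

    sensitivity = true positives / all truly active compounds
    specificity = true negatives / all truly inactive compounds
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one active and one inactive compound")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical external set: 3 active and 4 inactive compounds.
observed = [1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(observed, predicted)
print(sens, spec)
```

Reporting both numbers matters with an unbalanced active/inactive ratio, because overall accuracy alone can look high while one class is predicted poorly.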
Kutlay, Sehim; Kuçukdeveci, Ayse A; Elhan, Atilla H; Yavuzer, Gunes; Tennant, Alan
2007-02-28
Assessment of cognitive impairment with a valid cognitive screening tool is essential in neurorehabilitation. The aim of this study was to test the reliability and validity of the Turkish-adapted version of the Middlesex Elderly Assessment of Mental State (MEAMS) among acquired brain injury patients in Turkey. Some 155 patients with acquired brain injury admitted for rehabilitation were assessed by the adapted version of MEAMS at admission and discharge. Reliability was tested by internal consistency, intra-class correlation coefficient (ICC) and person separation index; internal construct validity by Rasch analysis; external construct validity by associations with physical and cognitive disability (FIM); and responsiveness by Effect Size. Reliability was found to be good with Cronbach's alpha of 0.82 at both admission and discharge; and likewise an ICC of 0.80. Person separation index was 0.813. Internal construct validity was good by fit of the data to the Rasch model (mean item fit -0.178; SD 1.019). Items were substantially free of differential item functioning. External construct validity was confirmed by expected associations with physical and cognitive disability. Effect size was 0.42 compared with 0.22 for cognitive FIM. The reliability and validity of the Turkish version of MEAMS as a cognitive impairment screening tool in acquired brain injury has been demonstrated.
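Internal consistency in this abstract is summarised with Cronbach's alpha, which compares the sum of the item variances with the variance of the total score. The sketch below shows the standard formula. The function name and the small score matrix are illustrative assumptions and are unrelated to the MEAMS data.

```python
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent, each holding k item scores.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [
        statistics.variance([row[i] for row in scores]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three respondents answering two items.
toy_scores = [[1, 1], [2, 3], [3, 2]]
print(cronbach_alpha(toy_scores))
```

Alpha rises when items covary strongly relative to their individual variances, so it reflects how consistently the items rank respondents rather than whether they measure the intended construct.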
Reliability and Validity of Composite Scores from the NIH Toolbox Cognition Battery in Adults
Heaton, Robert K.; Akshoomoff, Natacha; Tulsky, David; Mungas, Dan; Weintraub, Sandra; Dikmen, Sureyya; Beaumont, Jennifer; Casaletto, Kaitlin B.; Conway, Kevin; Slotkin, Jerry; Gershon, Richard
2014-01-01
This study describes psychometric properties of the NIH Toolbox Cognition Battery (NIHTB-CB) Composite Scores in an adult sample. The NIHTB-CB was designed for use in epidemiologic studies and clinical trials for ages 3 to 85. A total of 268 self-described healthy adults were recruited at four university-based sites, using stratified sampling guidelines to target demographic variability for age (20–85 years), gender, education, and ethnicity. The NIHTB-CB contains seven computer-based instruments assessing five cognitive sub-domains: Language, Executive Function, Episodic Memory, Processing Speed, and Working Memory. Participants completed the NIHTB-CB, corresponding gold standard validation measures selected to tap the same cognitive abilities, and sociodemographic questionnaires. Three Composite Scores were derived for both the NIHTB-CB and gold standard batteries: “Crystallized Cognition Composite,” “Fluid Cognition Composite,” and “Total Cognition Composite” scores. NIHTB Composite Scores showed acceptable internal consistency (Cronbach’s alphas = 0.84 Crystallized, 0.83 Fluid, 0.77 Total), excellent test–retest reliability (r: 0.86–0.92), strong convergent (r: 0.78–0.90) and discriminant (r: 0.19–0.39) validities versus gold standard composites, and expected age effects (r = 0.18 crystallized, r = − 0.68 fluid, r = − 0.26 total). Significant relationships with self-reported prior school difficulties and current health status, employment, and presence of a disability provided evidence of external validity. The NIH Toolbox Cognition Battery Composite Scores have excellent reliability and validity, suggesting they can be used effectively in epidemiologic and clinical studies. PMID:24960398
Implementing the undergraduate mini-CEX: a tailored approach at Southampton University.
Hill, Faith; Kendall, Kathleen; Galbraith, Kevin; Crossley, Jim
2009-04-01
The mini-clinical evaluation exercise (mini-CEX) is widely used in the UK to assess clinical competence, but there is little evidence regarding its implementation in the undergraduate setting. This study aimed to estimate the validity and reliability of the undergraduate mini-CEX and discuss the challenges involved in its implementation. A total of 3499 mini-CEX forms were completed. Validity was assessed by estimating associations between mini-CEX score and a number of external variables, examining the internal structure of the instrument, checking competency domain response rates and profiles against expectations, and by qualitative evaluation of stakeholder interviews. Reliability was evaluated by overall reliability coefficient (R), estimation of the standard error of measurement (SEM), and from stakeholders' perceptions. Variance component analysis examined the contribution of relevant factors to students' scores. Validity was threatened by various confounding variables, including: examiner status; case complexity; attachment specialty; patient gender, and case focus. Factor analysis suggested that competency domains reflect a single latent variable. Maximum reliability can be achieved by aggregating scores over 15 encounters (R = 0.73; 95% confidence interval [CI] +/- 0.28 based on a 6-point assessment scale). Examiner stringency contributed 29% of score variation and student attachment aptitude 13%. Stakeholder interviews revealed staff development needs but the majority perceived the mini-CEX as more reliable and valid than the previous long case. The mini-CEX has good overall utility for assessing aspects of the clinical encounter in an undergraduate setting. Strengths include fidelity, wide sampling, perceived validity, and formative observation and feedback. Reliability is limited by variable examiner stringency, and validity by confounding variables, but these should be viewed within the context of overall assessment strategies.
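The statement that reliability improves when scores are aggregated over more encounters can be illustrated with the classical Spearman-Brown formula. This is a classical-test-theory analogue chosen for illustration; the abstract describes a variance component analysis, so the sketch is not the authors' method, and the single-encounter reliability it back-calculates is an assumption derived only from the quoted R = 0.73 at 15 encounters.

```python
def spearman_brown(single_reliability, n_encounters):
    """Predicted reliability of the mean of n parallel encounters."""
    r = single_reliability
    return n_encounters * r / (1 + (n_encounters - 1) * r)


def single_from_aggregate(aggregate_reliability, n_encounters):
    """Invert Spearman-Brown: reliability of one encounter given the aggregate."""
    big_r = aggregate_reliability
    return big_r / (n_encounters - (n_encounters - 1) * big_r)


# Back-calculate an implied single-encounter reliability from the quoted figure,
# then project how reliability would grow with the number of encounters.
r_one = single_from_aggregate(0.73, 15)
for n in (1, 5, 10, 15):
    print(n, round(spearman_brown(r_one, n), 2))
```

Under these assumptions a single encounter is only weakly reliable, which is consistent with the abstract's point that examiner stringency contributes a large share of score variation.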
Kahraman, Turhan; Özdoğar, Asiye Tuba; Honan, Cynthia Alison; Ertekin, Özge; Özakbaş, Serkan
2018-05-09
To linguistically and culturally adapt the Multiple Sclerosis Work Difficulties Questionnaire-23 (MSWDQ-23) for use in Turkey, and to examine its reliability and validity. Following standard forward-back translation of the MSWDQ-23, it was administered to 124 people with multiple sclerosis (MS). Validity was evaluated using related outcome measures including those related to employment status and expectations, disability level, fatigue, walking, and quality of life. Randomly selected participants were asked to complete the MSWDQ-23 again to assess test-retest reliability. Confirmatory factor analysis on the MSWDQ-23 demonstrated a good fit for the data, and the internal consistency of each subscale was excellent. The test-retest reliability for the total score, psychological/cognitive barriers, physical barriers, and external barriers subscales was high. The MSWDQ-23 and its subscales were positively correlated with the employment, disability level, walking, and fatigue outcome measures. This study suggests that the Turkish version of the MSWDQ-23 has high reliability and adequate validity, and it can be used to determine the difficulties faced by people with multiple sclerosis in the workplace. Moreover, the study provides evidence about the test-retest reliability of the questionnaire. Implications for rehabilitation: Multiple sclerosis affects young people of working age. Understanding work-related problems is crucial to improving the likelihood that people with multiple sclerosis keep their jobs. The Multiple Sclerosis Work Difficulties Questionnaire-23 (MSWDQ-23) is a valid and reliable measure of perceived workplace difficulties in people with multiple sclerosis; this study presents its validation in Turkish. Professionals working in the field of vocational rehabilitation may benefit from using the MSWDQ-23 to predict current work outcomes and future employment expectations.
Predictive and Incremental Validity of Global and Domain-Based Adolescent Life Satisfaction Reports
ERIC Educational Resources Information Center
Haranin, Emily C.; Huebner, E. Scott; Suldo, Shannon M.
2007-01-01
Concurrent, predictive, and incremental validity of global and domain-based adolescent life satisfaction reports are examined with respect to internalizing and externalizing behavior problems. The Students' Life Satisfaction Scale (SLSS), Multidimensional Students' Life Satisfaction Scale (MSLSS), and measures of internalizing and externalizing…
Theory of Self- vs. Externally-Regulated Learning™: Fundamentals, Evidence, and Applicability.
de la Fuente-Arias, Jesús
2017-01-01
The Theory of Self- vs. Externally-Regulated Learning™ has integrated the variables of SRL theory, the DEDEPRO model, and the 3P model. This new Theory has proposed: (a) in general, the importance of the cyclical model of individual self-regulation (SR) and of external regulation stemming from the context (ER), as two different and complementary variables, both in combination and in interaction; (b) specifically, in the teaching-learning context, the relevance of different types of combinations between levels of self-regulation (SR) and of external regulation (ER) in the prediction of self-regulated learning (SRL), and of cognitive-emotional achievement. This review analyzes the assumptions, conceptual elements, empirical evidence, benefits and limitations of SRL vs. ERL Theory. Finally, professional fields of application and future lines of research are suggested.
Validity and reliability of naturalistic driving scene categorization Judgments from crowdsourcing.
Cabrall, Christopher D D; Lu, Zhenji; Kyriakidis, Miltos; Manca, Laura; Dijksterhuis, Chris; Happee, Riender; de Winter, Joost
2018-05-01
A common challenge with processing naturalistic driving data is that humans may need to categorize great volumes of recorded visual information. By means of the online platform CrowdFlower, we investigated the potential of crowdsourcing to categorize driving scene features (e.g., presence of other road users, straight road segments) at greater scale than a single person or a small team of researchers would be capable of. In total, 200 workers from 46 different countries participated in 1.5 days. Validity and reliability were examined, both with and without embedding researcher-generated control questions via the CrowdFlower mechanism known as Gold Test Questions (GTQs). By employing GTQs, we found significantly more valid (accurate) and reliable (consistent) identification of driving scene items from external workers. Specifically, at a small scale CrowdFlower Job of 48 three-second video segments, an accuracy (i.e., relative to the ratings of a confederate researcher) of 91% on items was found with GTQs compared to 78% without. A difference in bias was found, where without GTQs, external workers returned more false positives than with GTQs. At a larger scale CrowdFlower Job making exclusive use of GTQs, 12,862 three-second video segments were released for annotation. Because checking the accuracy of each categorization at this scale would be infeasible (and self-defeating), a random subset of 1012 categorizations was validated and returned similar levels of accuracy (95%). In the small scale Job, where full video segments were repeated in triplicate, the percentage of unanimous agreement on the items was significantly higher when using GTQs (90%) than without them (65%). Additionally, in the larger scale Job (where a single second of a video segment was overlapped by ratings of three sequentially neighboring segments), a mean unanimity of 94% was obtained with validated-as-correct ratings and 91% with non-validated ratings.
Because the video segments overlapped in full for the small scale Job, and in part for the larger scale Job, it should be noted that such reliability reported here may not be directly comparable. Nonetheless, such results are both indicative of high levels of obtained rating reliability. Overall, our results provide compelling evidence for CrowdFlower, via use of GTQs, being able to yield more accurate and consistent crowdsourced categorizations of naturalistic driving scene contents than when used without such a control mechanism. Such annotations in such short periods of time present a potentially powerful resource in driving research and driving automation development. Copyright © 2017 Elsevier Ltd. All rights reserved.
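The two quantities reported above, accuracy against a reference rater and unanimous agreement across repeated ratings, can be sketched as follows. The function names and the toy ratings are made up for illustration; they are assumptions and not the study's data or the CrowdFlower API.

```python
def accuracy(ratings, reference):
    """Fraction of ratings that match the reference rater's ratings."""
    matches = sum(r == ref for r, ref in zip(ratings, reference))
    return matches / len(reference)


def unanimity(repeated_ratings):
    """Fraction of items on which all repeated ratings agree."""
    unanimous = sum(len(set(item)) == 1 for item in repeated_ratings)
    return unanimous / len(repeated_ratings)


# Made-up toy data: 1 = feature present, 0 = feature absent.
reference = [1, 0, 1, 1, 0]          # hypothetical confederate researcher
crowd = [1, 0, 0, 1, 0]              # hypothetical crowd ratings
triplicates = [(1, 1, 1), (0, 0, 0), (0, 1, 0), (1, 1, 1), (0, 0, 0)]

print(accuracy(crowd, reference))    # 4 of 5 items match the reference
print(unanimity(triplicates))        # 4 of 5 items are rated unanimously
```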
Simon, Steven L; Baverstock, Keith F; Lindholm, Carita
2003-06-01
The presently available evidence about the magnitude of doses received by members of the public living in villages in the vicinity of Semipalatinsk nuclear test in Kazakhstan, particularly with respect to external radiation, while preliminary, is conflicting. The village of Dolon, in particular, has been identified for many years as the most highly exposed location in the vicinity of the test site. Previous publications cited external doses of more than 2 Gy to residents of Dolon while an expert group assembled by the WHO in 1997 estimated that external doses were likely to have been less than 0.5 Gy. In 2001, a larger expert group workshop was held in Helsinki jointly by the WHO, the National Cancer Institute of the United States, and the Radiation and Nuclear Safety Authority of Finland, with the expressed purpose to acquire data to evaluate the state of knowledge concerning doses received in Kazakhstan. This paper summarizes evidence presented at that workshop. External dose estimates from calculations based on sparse physical measurements and bio-dosimetric estimates based on chromosome abnormalities and electron paramagnetic resonance from a relatively small sample of teeth do not agree well. The physical dose estimates are generally higher than the biodosimetric estimates (1 Gy or more compared to 0.5 Gy or less). When viewed in its entirety, the present body of evidence does not appear to support external doses greater than 0.5 Gy; however, research is continuing to try and resolve the difference in dose estimates from the different methods. Thyroid doses from internal irradiation, which can only be estimated via calculation, are expected to have been several times greater than the doses from external irradiation, especially where received by small children.
USDA-ARS's Scientific Manuscript database
External morphological criteria that enable the rapid determination of gender have been developed for yellow perch (Perca flavescens). Criteria are based upon 1) shape of the urogenital papilla (UGP), 2) relative size of the UGP to the anal (AN) opening, and 3) coloration of the UGP. In females, t...
ERIC Educational Resources Information Center
Finnila, Katarina; Mahlberg, Nina; Santtila, Pekka; Sandnabba, Kenneth; Niemi, Pekka
2003-01-01
Examined the relative contributions of internal and external sources of variation in children's suggestibility in interrogative situations. Found that internal sources of individual differences in suggestibility measured on a suggestibility test did influence children's answers during an interview, but that external sources or interview styles had…
Wang, Shukui; Liu, Xiangxiang; Pan, Bei; Sun, Li; Chen, Xiaoxiang; Zeng, Kaixuan; Hu, Xiuxiu; Xu, Tao; Xu, Mu
2018-05-08
Colorectal cancer (CRC) is one of the most common cancers worldwide, usually with poor prognosis due to the advanced stage at diagnosis. This study aimed to investigate whether specific circulating exosomal miRNAs could act as biomarkers for early diagnosis of CRC. A total of 369 peripheral blood samples were included in this study. In the discovery phase, circulating exosomal miR-27a and miR-130a were selected after combined analysis of two GEO datasets and the TCGA database. The differential expression and diagnostic utility of the miR-27a and miR-130a panel were validated using quantitative reverse-transcriptase PCR (qRT-PCR) and receiver operating characteristic (ROC) curve analysis in the subsequent training phase, validation phase and external validation phase. The prognostic value of circulating exosomal miR-27a and miR-130a was investigated using the Kaplan-Meier method. The expression of exosomal miR-27a and miR-130a in plasma was significantly increased in CRC. The areas under the ROC curves (AUCs) of miR-27a (miR-130a) were 0.773 (0.742) in the training phase, 0.82 (0.787) in the validation phase, and 0.746 (0.697) in the external validation phase. The combination of the two miRNAs presented higher diagnostic utility for CRC (AUCs = 0.846, 0.898 and 0.801 for the training, validation, and external validation phases, respectively). CRC patients with high expression of circulating exosomal miR-27a or miR-130a had a poorer prognosis. We identified a circulating exosomal miRNA panel for the detection of CRC. The exosomal miR-27a and miR-130a panel in plasma may act as a non-invasive biomarker for early detection and predicting prognosis of CRC. Copyright ©2018, American Association for Cancer Research.
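As a reminder of what the AUC values above measure, the sketch below computes the area under the ROC curve as the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The expression values are invented for illustration and are not taken from the study.

```python
def auc(case_scores, control_scores):
    """AUC as P(case > control), counting ties as 0.5 (Mann-Whitney form)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented relative expression values (assumption, illustration only).
cases = [2.1, 3.4, 1.8, 2.9]
controls = [1.2, 2.0, 1.8, 0.9]

print(auc(cases, controls))  # 14.5 favourable pairs out of 16 -> 0.90625
```

An AUC of 0.5 corresponds to a marker that does not separate cases from controls, and 1.0 to perfect separation.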
NASA Astrophysics Data System (ADS)
Bertholet, Jenny; Toftegaard, Jakob; Hansen, Rune; Worm, Esben S.; Wan, Hanlin; Parikh, Parag J.; Weber, Britta; Høyer, Morten; Poulsen, Per R.
2018-03-01
The purpose of this study was to develop, validate and clinically demonstrate fully automatic tumour motion monitoring on a conventional linear accelerator by combined optical and sparse monoscopic imaging with kilovoltage x-rays (COSMIK). COSMIK combines auto-segmentation of implanted fiducial markers in cone-beam computed tomography (CBCT) projections and intra-treatment kV images with simultaneous streaming of an external motion signal. A pre-treatment CBCT is acquired with simultaneous recording of the motion of an external marker block on the abdomen. The 3-dimensional (3D) marker motion during the CBCT is estimated from the auto-segmented positions in the projections and used to optimize an external correlation model (ECM) of internal motion as a function of external motion. During treatment, the ECM estimates the internal motion from the external motion at 20 Hz. KV images are acquired every 3 s, auto-segmented, and used to update the ECM for baseline shifts between internal and external motion. The COSMIK method was validated using Calypso-recorded internal tumour motion with simultaneous camera-recorded external motion for 15 liver stereotactic body radiotherapy (SBRT) patients. The validation included phantom experiments and simulations hereof for 12 fractions and further simulations for 42 fractions. The simulations compared the accuracy of COSMIK with ECM-based monitoring without model updates and with model updates based on stereoscopic imaging as well as continuous kilovoltage intrafraction monitoring (KIM) at 10 Hz without an external signal. Clinical real-time tumour motion monitoring with COSMIK was performed offline for 14 liver SBRT patients (41 fractions) and online for one patient (two fractions). The mean 3D root-mean-square error for the four monitoring methods was 1.61 mm (COSMIK), 2.31 mm (ECM without updates), 1.49 mm (ECM with stereoscopic updates) and 0.75 mm (KIM). 
COSMIK is the first combined kV/optical real-time motion monitoring method used clinically online on a conventional accelerator. COSMIK gives less imaging dose than KIM and is in addition applicable when the kV imager cannot be deployed such as during non-coplanar fields.
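The idea of an external correlation model with baseline-shift updates can be sketched in one dimension. The abstract does not state the functional form of the ECM, so the linear model, the single-image offset update, and all numbers below are assumptions made purely for illustration.

```python
import math


def fit_linear(external, internal):
    """Least-squares fit of internal ~ a * external + b."""
    n = len(external)
    mean_x = sum(external) / n
    mean_y = sum(internal) / n
    sxx = sum((x - mean_x) ** 2 for x in external)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(external, internal))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed positions."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


# Invented 1D pre-treatment data in mm (assumption).
external = [0.0, 1.0, 2.0, 3.0]
internal = [1.0, 3.0, 5.0, 7.0]
a, b = fit_linear(external, internal)

# Simulated baseline shift: the internal target drifts by 0.5 mm.
shifted = [y + 0.5 for y in internal]
error_without_update = rmse([a * x + b for x in external], shifted)

# Hypothetical update: one imaged internal position corrects the offset.
b_updated = b + (shifted[0] - (a * external[0] + b))
error_with_update = rmse([a * x + b_updated for x in external], shifted)

print(error_without_update, error_with_update)
```

In this toy case the uncorrected model carries the full 0.5 mm drift as error, while a single offset update removes it; real motion is three-dimensional and noisier.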
Vagos, Paula; Ribeiro da Silva, Diana; Brazão, Nélio; Rijo, Daniel; Gilbert, Paul
2017-05-01
This work presents psychometric analyses on the Early Memories of Warmth and Safeness Scale, which intends to evaluate the subjective perception of one's early rearing experiences. Factor structure, measurement invariance, latent mean comparisons and validity in relation to external variables (i.e., forms of self-criticism/self-assurance, experiential avoidance and depressive, anxious and stress symptoms) were investigated. A sample of 1464 adolescents (52.3% male adolescents, mean age = 16.16, standard deviation = 1.51) was used, including 1064 participants recruited from schools, 192 participants recruited from foster care facilities and 208 boys recruited from juvenile justice facilities. A shortened version of the scale was also developed and subjected to the same psychometric analyses. A one-factor measurement model was a good fit for the data taken from both the complete and brief versions of the instrument. Both measures proved internally consistent, with alpha values higher than 0.89. Evidence for their construct validity in relation to external variables was also found, with correlation values ranging from 0.19 to 0.45 for the complete version and from 0.18 to 0.44 for the brief version of the instrument. The brief version was the only one proving to be gender and sample invariant. Boys and girls scored similarly in their account of early memories, whereas community boys presented significantly higher scores when compared with referred and detained boys. Thus, the brief version of the instrument may be an appropriate alternative for use with diverse adolescent samples and may serve to advance knowledge on how early experiences impact on psychopathological outcomes. Copyright © 2016 John Wiley & Sons, Ltd. The Early Memories of Warmth and Safeness Scale (EMWSS), assessing early memories of warmth and safeness, was studied across community, referred for behavioural problems and detained Portuguese adolescent samples.
A brief version of this instrument was also developed and studied in these same samples. Both versions of the EMWSS revealed a one-factor structure, good internal consistency and construct validity in relation to external variables; the brief version was also found to be invariant across gender and groups. Boys and girls were found to report similar levels of experienced warmth and safeness, whereas community boys reported significantly more of those experiences, followed by detained boys, and, lastly, referred boys. The brief version of the EMWSS represents a quick and valid measure to assess early memories of warmth and safeness in youth, providing for insights into the subjective experience of adolescents with diverse rearing experiences. Early memories of warmth and safeness, as assessed by the brief version of the EMWSS, may serve to advance knowledge on how early experiences impact on psychopathological outcomes in diverse youth samples. Copyright © 2016 John Wiley & Sons, Ltd.
The role of external evidence in data monitoring of a clinical trial.
Pocock, S J
1996-06-30
Data monitoring of interim results from a randomized clinical trial should take into consideration evidence from other trials. This article presents both scientific and practical issues regarding the pros and cons of formally incorporating such external evidence into the decision making process for the current trial. Guidelines on how to use other trials' data are presented, along with cautiously sceptical comments on the impracticality of using formal meta-analyses in data monitoring. The arguments are illustrated by recent examples from specific trials, and the article concludes with some general recommendations.
Lowe, Mary Martin; Bennett, Nancy; Aparicio, Alejandro
2009-03-01
The Agency for Healthcare Research and Quality (AHRQ) Evidence Report identified and assessed audience characteristics (internal factors) and external factors that influence the effectiveness of continuing medical education (CME) in changing physician behavior. Thirteen studies examined a series of CME audience characteristics (internal factors), and six studies looked at external factors to reinforce the effects of CME in changing behavior. With regard to CME audience characteristics, the 13 studies examined age, gender, practice setting, years in practice, specialty, foreign vs US medical graduate, country of practice, personal motivation, nonmonetary rewards and motivations, learning satisfaction, and knowledge enhancement. With regard to the external characteristics, the six studies looked at the role of regulation, state licensing boards, professional boards, hospital credentialing, external audits, monetary and financial rewards, academic advancement, provision of tools, public demand and expectations, and CME credit. No consistent findings were identified. The AHRQ Evidence Report provides no conclusions about the ways that internal or external factors influence CME effectiveness in changing physician behavior. However, given what is known about how individuals approach learning, it is likely that internal factors play an important role in the design of effective CME. Regulatory and professional organizations are providing new structures, mandates, and recommendations for CME activities that influence the way CME providers design and present activities, supporting a role that is not yet clear for external factors. More research is needed to understand the impact of these factors in enhancing the effectiveness of CME.
Ando, Yukako; Kataoka, Tsuyoshi; Okamura, Hitoshi; Tanaka, Katsutoshi; Kobayashi, Toshio
2013-12-01
The purpose of this research is to verify the reliability and validity of a job stressor scale for nurses caring for patients with intractable neurological diseases. A mail survey was conducted using a self-report questionnaire. The subjects were 263 nurses and assistant nurses working in wards specializing in intractable neurological diseases. The response rate was 71.9% (valid response rate, 66.2%). With regard to reliability, internal consistency and stability were assessed. Internal consistency was examined via Cronbach's alpha. For stability, the test-retest method was performed and stability was examined via intraclass correlation coefficients. With regard to validity, factor validity, criterion-related validity, and content validity were assessed. Exploratory factor analysis was used for factor validity. For criterion-related validity, an existing scale was used as an external criterion; concurrent validity was examined via Spearman's rank correlation coefficients. As a result of analysis, there were 26 items in the scale created with an eight-factor structure. Cronbach's alpha for the 26 items was 0.90; with the exception of two factors, alpha for all of the individual sub-factors was high at 0.7 or higher. The intraclass correlation coefficient for the 26 items was 0.89 (p < 0.001). With regard to criterion-related validity, concurrent validity was confirmed and the correlation coefficient with an external criterion was 0.73 (p < 0.001). For content validity, subjects who responded that "The questionnaire represents a stressor well or to a degree" accounted for 81% of the total responses. Reliability and validity were confirmed, so the scale created in the current research is a usable scale.
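For readers unfamiliar with the internal consistency statistic used above, the following sketch computes Cronbach's alpha from item scores using the standard formula (k / (k - 1)) × (1 − sum of item variances / variance of total scores). The three-item toy data set is invented and has no connection to the study's 26-item scale.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented scores: 3 items answered by 4 respondents (assumption).
toy_items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]

print(round(cronbach_alpha(toy_items), 2))
```

Sample variances are used for both the items and the totals; using population variances throughout would give the same alpha, because the ratio is unchanged.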
Sandercock, Peter; Lindley, Richard; Wardlaw, Joanna; Dennis, Martin; Innes, Karen; Cohen, Geoff; Whiteley, Will; Perry, David; Soosay, Vera; Buchanan, David; Venables, Graham; Czlonkowska, Anna; Kobayashi, Adam; Berge, Eivind; Slot, Karsten Bruins; Murray, Veronica; Peeters, Andre; Hankey, Graeme J; Matz, Karl; Brainin, Michael; Ricci, Stefano; Cantisani, Teresa A; Gubitz, Gordon; Phillips, Stephen J; Antonio, Arauz; Correia, Manuel; Lyrer, Phillippe; Kane, Ingrid; Lundstrom, Erik
2011-11-30
Intravenous recombinant tissue plasminogen activator (rtPA) is approved in Europe for use in patients with acute ischaemic stroke who meet strictly defined criteria. IST-3 sought to improve the external validity and precision of the estimates of the overall treatment effects (efficacy and safety) of rtPA in acute ischaemic stroke, and to determine whether a wider range of patients might benefit. International, multi-centre, prospective, randomized, open, blinded endpoint (PROBE) trial of intravenous rtPA in acute ischaemic stroke. Suitable patients had to be assessed and able to start treatment within 6 hours of developing symptoms, and brain imaging must have excluded intracranial haemorrhage and stroke mimics. The initial pilot phase was double blind and then, on 01/08/2003, changed to an open design. Recruitment began on 05/05/2000 and closed on 31/07/2011, by which time 3035 patients had been included, only 61 (2%) of whom met the criteria for the 2003 European approval for thrombolysis. 1617 patients were aged over 80 years at trial entry. The analysis plan will be finalised, without reference to the unblinded data, and published before the trial data are unblinded in early 2012. The main trial results will be presented at the European Stroke Conference in Lisbon in May 2012 with the aim to publish simultaneously in a peer-reviewed journal. The trial result will be presented in the context of an updated Cochrane systematic review. We also intend to include the trial data in an individual patient data meta-analysis of all the relevant randomised trials. 
The data from the trial will: improve the external validity and precision of the estimates of the overall treatment effects (efficacy and safety) of intravenous rtPA in acute ischaemic stroke; provide new evidence on the balance of risk and benefit of intravenous rtPA among types of patients who do not clearly meet the terms of the current EU approval; and provide the first large-scale randomised evidence on effects in patients over 80, an age group which had largely been excluded from previous acute stroke trials. ISRCTN25765518.
Makady, Amr; van Veelen, Ard; Jonsson, Páll; Moseley, Owen; D'Andon, Anne; de Boer, Anthonius; Hillege, Hans; Klungel, Olaf; Goettsch, Wim
2018-03-01
Reimbursement decisions are conventionally based on evidence from randomised controlled trials (RCTs), which often have high internal validity but low external validity. Real-world data (RWD) may provide complementary evidence for relative effectiveness assessments (REAs) and cost-effectiveness assessments (CEAs). This study examines whether RWD is incorporated in health technology assessment (HTA) of melanoma drugs by European HTA agencies, as well as differences in RWD use between agencies and across time. HTA reports published between 1 January 2011 and 31 December 2016 were retrieved from websites of agencies representing five jurisdictions: England [National Institute for Health and Care Excellence (NICE)], Scotland [Scottish Medicines Consortium (SMC)], France [Haute Autorité de santé (HAS)], Germany [Institute for Quality and Efficacy in Healthcare (IQWiG)] and The Netherlands [Zorginstituut Nederland (ZIN)]. A standardized data extraction form was used to extract information on RWD inclusion for both REAs and CEAs. Overall, 52 reports were retrieved, all of which contained REAs; CEAs were present in 25 of the reports. RWD was included in 28 of the 52 REAs (54%), mainly to estimate melanoma prevalence, and in 22 of the 25 (88%) CEAs, mainly to extrapolate long-term effectiveness and/or identify drug-related costs. Differences emerged between agencies regarding RWD use in REAs; the ZIN and IQWiG cited RWD for evidence on prevalence, whereas the NICE, SMC and HAS additionally cited RWD use for drug effectiveness. No visible trend for RWD use in REAs and CEAs over time was observed. In general, RWD inclusion was higher in CEAs than REAs, and was mostly used to estimate melanoma prevalence in REAs or to predict long-term effectiveness in CEAs. Differences emerged between agencies' use of RWD; however, no visible trends for RWD use over time were observed.
Examining the Assessment Literacy of External Examiners
ERIC Educational Resources Information Center
Medland, Emma
2015-01-01
External scrutiny of higher education courses is evident globally, but the use of an external examiner from another institution for the purposes of quality assurance has been a distinguishing feature of UK higher education since the 1830s. However, the changing higher education context has led to mounting criticism of the system and the…
Temporal Stability and Convergent Validity of the Behavior Assessment System for Children.
ERIC Educational Resources Information Center
Merydith, Scott P.
2001-01-01
Assesses the temporal stability and convergent validity of the Behavioral Assessment System for Children (BASC). Teachers and parents rated kindergarten and first-grade students using BASC. Teachers were more stable in rating children's externalizing behaviors and attention problems. Discusses results in terms of the accuracy of information…
ERIC Educational Resources Information Center
Pekrun, Reinhard; Goetz, Thomas; Frenzel, Anne C.; Barchfeld, Petra; Perry, Raymond P.
2011-01-01
Aside from test anxiety scales, measurement instruments assessing students' achievement emotions are largely lacking. This article reports on the construction, reliability, internal validity, and external validity of the Achievement Emotions Questionnaire (AEQ) which is designed to assess various achievement emotions experienced by students in…
The Modified Cognitive Constructions Coding System: Reliability and Validity Assessments
ERIC Educational Resources Information Center
Moran, Galia S.; Diamond, Gary M.
2006-01-01
The cognitive constructions coding system (CCCS) was designed for coding client's expressed problem constructions on four dimensions: intrapersonal-interpersonal, internal-external, responsible-not responsible, and linear-circular. This study introduces, and examines the reliability and validity of, a modified version of the CCCS--a version that…
Integrating Validity Theory with Use of Measurement Instruments in Clinical Settings
Kelly, P Adam; O'Malley, Kimberly J; Kallen, Michael A; Ford, Marvella E
2005-01-01
Objective To present validity concepts in a conceptual framework useful for research in clinical settings. Principal Findings We present a three-level decision rubric for validating measurement instruments, to guide health services researchers step-by-step in gathering and evaluating validity evidence within their specific situation. We address construct precision, the capacity of an instrument to measure constructs it purports to measure and differentiate from other, unrelated constructs; quantification precision, the reliability of the instrument; and translation precision, the ability to generalize scores from an instrument across subjects from the same or similar populations. We illustrate with specific examples, such as an approach to validating a measurement instrument for veterans when prior evidence of instrument validity for this population does not exist. Conclusions Validity should be viewed as a property of the interpretations and uses of scores from an instrument, not of the instrument itself: how scores are used and the consequences of this use are integral to validity. Our advice is to liken validation to building a court case, including discovering evidence, weighing the evidence, and recognizing when the evidence is weak and more evidence is needed. PMID:16178998
Development and validation of the Stirling Eating Disorder Scales.
Williams, G J; Power, K G; Miller, H R; Freeman, C P; Yellowlees, A; Dowds, T; Walker, M; Parry-Jones, W L
1994-07-01
The development and reliability/validity check of an 80-item, 8-scale measure for use with eating disorder patients is presented. The Stirling Eating Disorder Scales (SEDS) assess anorexic dietary behavior, anorexic dietary cognitions, bulimic dietary behavior, bulimic dietary cognitions, high perceived external control, low assertiveness, low self-esteem, and self-directed hostility. The SEDS were administered to 82 eating disorder patients and 85 controls. Results indicate that the SEDS are acceptable in terms of internal consistency, reliability, group validity, and concurrent validity.
Grigoryeva, Evgeniya S; Haylock, Richard G E; Pikulina, Maria V; Moseeva, Maria B
2015-01-01
Objective: Incidence and mortality from ischaemic heart disease (IHD) was studied in an extended cohort of 22,377 workers first employed at the Mayak Production Association during 1948–82 and followed up to the end of 2008. Methods: Relative risks and excess relative risks per unit dose (ERR/Gy) were calculated based on the maximum likelihood using Epicure software (Hirosoft International Corporation, Seattle, WA). Dose estimates used in analyses were provided by an updated “Mayak Worker Dosimetry System—2008”. Results: A significant increasing linear trend in IHD incidence with total dose from external γ-rays was observed after having adjusted for non-radiation factors and dose from internal radiation {ERR/Gy = 0.10 [95% confidence interval (CI): 0.04 to 0.17]}. The pure quadratic model provided a better fit of the data than did the linear one. No significant association of IHD mortality with total dose from external γ-rays after having adjusted for non-radiation factors and dose from internal alpha radiation was observed in the study cohort [ERR/Gy = 0.06 (95% CI: <0 to 0.15)]. A significant increasing linear trend was observed in IHD mortality with total absorbed dose from internal alpha radiation to the liver after having adjusted for non-radiation factors and dose from external γ-rays in both the whole cohort [ERR/Gy = 0.21 (95% CI: 0.01 to 0.58)] and the subcohort of workers exposed at alpha dose <1.00 Gy [ERR/Gy = 1.08 (95% CI: 0.34 to 2.15)]. No association of IHD incidence with total dose from internal alpha radiation to the liver was found in the whole cohort after having adjusted for non-radiation factors and external gamma dose [ERR/Gy = 0.02 (95% CI: not available to 0.10)]. Statistically significant dose effect was revealed in the subcohort of workers exposed to internal alpha radiation at dose to the liver <1.00 Gy [ERR/Gy = 0.44 (95% CI: 0.09 to 0.85)]. 
Conclusion: This study provides strong evidence of IHD incidence and mortality association with external γ-ray exposure and some evidence of IHD incidence and mortality association with internal alpha-radiation exposure. Advances in knowledge: It is the first time the validity of internal radiation dose estimates has been shown to affect the risk of IHD incidence. PMID:26224431
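To make the excess relative risk figures above concrete, the sketch below evaluates the linear ERR model, in which relative risk is 1 plus the ERR per Gy multiplied by dose. It uses the reported central estimate and confidence limits for IHD incidence and external γ-ray dose; the dose of 2 Gy is an arbitrary illustrative value, and the abstract notes that a pure quadratic model fitted these data better than the linear one.

```python
def relative_risk(err_per_gy: float, dose_gy: float) -> float:
    """Relative risk under a linear excess relative risk (ERR) model."""
    return 1.0 + err_per_gy * dose_gy


# Reported ERR/Gy for IHD incidence and external gamma dose, with 95% CI.
err_estimate, err_lower, err_upper = 0.10, 0.04, 0.17

# Arbitrary illustrative dose in Gy (assumption, not from the study).
dose = 2.0

for label, err in (("lower", err_lower), ("estimate", err_estimate), ("upper", err_upper)):
    print(label, round(relative_risk(err, dose), 2))
```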
Criterion Validity of the Child's Challenging Behavior Scale, Version 2 (CCBS-2).
Bourke-Taylor, Helen M; Cordier, Reinie; Pallant, Julie F
The Child's Challenging Behavior Scale, Version 2 (CCBS-2), measures maternal rating of a child's challenging behaviors that compromise maternal mental health. The CCBS-2, the Child Behavior Checklist (CBCL), and the Strengths and Difficulties Questionnaire (SDQ) were compared in a sample of typically developing young Australian children. Criterion validity was investigated by correlating the CCBS-2 with "gold standard" measures (CBCL and SDQ subscales). Data were collected in a cross-sectional survey of mothers (N = 336) of children ages 3-9 yr. Correlations with the CBCL externalizing subscales were moderate (ρ = .46) to strong (ρ = .66). Correlations with the SDQ externalizing behaviors subscales were moderate (ρ = .35) to strong (ρ = .60). The criterion validity established in this study strengthens the psychometric properties that support ongoing development of the CCBS-2 as an efficient tool that may identify children in need of further evaluation. Copyright © 2018 by the American Occupational Therapy Association, Inc.
Outsourcing bioanalytical services at Janssen Research and Development: the sequel anno 2017.
Dillen, Lieve; Verhaeghe, Tom
2017-08-01
The strategy of outsourcing bioanalytical services at Janssen has been evolving over the last years and an update will be given on the recent changes in our processes. In 2016, all internal GLP-related activities were phased out and this decision led to the re-orientation of the in-house bioanalytical activities. As a consequence, in-depth experience with the validated bioanalytical assays for new drug candidates is currently gained together with the external partner, since development and validation of the assay and execution of GLP preclinical studies are now transferred to the CRO. The evolution to externalize more bioanalytical support has created opportunities to build even stronger partnerships with the CROs and to refocus internal resources. Case studies are presented illustrating challenges encountered during method development and validation at preferred partners when limited internal experience is obtained or with introduction of new technology.
Pupillometric evidence for the decoupling of attention from perceptual input during offline thought.
Smallwood, Jonathan; Brown, Kevin S; Tipper, Christine; Giesbrecht, Barry; Franklin, Michael S; Mrazek, Michael D; Carlson, Jean M; Schooler, Jonathan W
2011-03-25
Accumulating evidence suggests that the brain can efficiently process both external and internal information. The processing of internal information is a distinct "offline" cognitive mode that requires not only spontaneously generated mental activity; it has also been hypothesized to require a decoupling of attention from perception in order to separate competing streams of internal and external information. This process of decoupling is potentially adaptive because it could prevent unimportant external events from disrupting an internal train of thought. Here, we use measurements of pupil diameter (PD) to provide concrete evidence for the role of decoupling during spontaneous cognitive activity. First, during periods conducive to offline thought but not during periods of task focus, PD exhibited spontaneous activity decoupled from task events. Second, periods requiring external task focus were characterized by large task evoked changes in PD; in contrast, encoding failures were preceded by episodes of high spontaneous baseline PD activity. Finally, high spontaneous PD activity also occurred prior to only the slowest 20% of correct responses, suggesting high baseline PD indexes a distinct mode of cognitive functioning. Together, these data are consistent with the decoupling hypothesis, which suggests that the capacity for spontaneous cognitive activity depends upon minimizing disruptions from the external world.
An observational examination of the literature in diagnostic anatomic pathology.
Foucar, Elliott; Wick, Mark R
2005-05-01
Original research published in the medical literature confronts the reader with three very basic and closely linked questions--are the authors' conclusions true in the contextual setting in which the work was performed (internally valid); if so, are the conclusions also applicable in other practice settings (externally valid); and, if the conclusions of the study are bona fide, do they represent an important contribution to medical practice or are they true-but-insignificant? Most publications attempt to convince readers that the researchers' conclusions are both internally valid and important, and occasionally papers also directly address external validity. Developing standardized methods to facilitate the prospective determination of research importance would be useful to both journals and their readers, but has proven difficult. In contrast, the evidence-based medicine (EBM) movement has had more success with understanding and codifying factors thought to promote research validity. Of the many variables that can influence research validity, research design is the one that has received the most attention. The present paper reviews the contributions of EBM to understanding research validity, looking for areas where EBM's body of knowledge is applicable to the anatomic pathology (AP) literature. As part of this project, the authors performed a pilot observational analysis of a representative sample of the current pertinent literature on diagnostic tissue pathology. The results of that review showed that most of the latter publications employ one of the four categories of "observational" research design that have been delineated by the EBM movement, and that the most common of these observational designs is a "cross-sectional" comparison. Pathologists do not presently use the "experimental" research designs so admired by advocates of EBM. Slightly > 50% of AP observational studies employed statistical evaluations to support their final conclusions. 
Comparison of the current AP literature with a selected group of papers published in 1977 shows a discernible change over that period that has affected not just technological procedures, but also research design and use of statistics. Although we feel that advocates of EBM deserve credit for bringing attention to the close link between research design and research validity, much of the EBM effort has centered on refining "experimental" methodology, and the complexities of observational research have often been treated in an inappropriately dismissive manner. For advocates of EBM, an observational study is what you are relegated to as a second choice when you are unable to do an experimental study. The latter viewpoint may be true for evaluating new chemotherapeutic agents, but is unacceptable to pathologists, whose research advances are currently completely dependent on well-conducted observational research. Rather than succumb to randomization envy and accept EBM's assertion that observational research is second best, the challenge to AP is to develop and adhere to standards for observational research that will allow our patients to benefit from the full potential of this time tested approach to developing valid insights into disease.
2014-01-01
Background: Health impairments can result in disability and changed work productivity imposing considerable costs for the employee, employer and society as a whole. A large number of instruments exist to measure health-related productivity changes; however their methodological quality remains unclear. This systematic review critically appraised the measurement properties in generic self-reported instruments that measure health-related productivity changes to recommend appropriate instruments for use in occupational and economic health practice. Methods: PubMed, PsycINFO, Econlit and Embase were systematically searched for studies for which: (i) instruments measured health-related productivity changes; (ii) the aim was to evaluate instrument measurement properties; (iii) instruments were generic; (iv) ratings were self-reported; (v) full-texts were available. Next, methodological quality appraisal was based on COSMIN elements: (i) internal consistency; (ii) reliability; (iii) measurement error; (iv) content validity; (v) structural validity; (vi) hypotheses testing; (vii) cross-cultural validity; (viii) criterion validity; and (ix) responsiveness. Recommendations are based on evidence syntheses. Results: This review included 25 articles assessing the reliability, validity and responsiveness of 15 different generic self-reported instruments measuring health-related productivity changes. Most studies evaluated criterion validity, none evaluated cross-cultural validity and information on measurement error is lacking. The Work Limitation Questionnaire (WLQ) was most frequently evaluated, with moderate and strong positive evidence for content and structural validity, respectively, and negative evidence for reliability, hypothesis testing and responsiveness. Less frequently evaluated, the Stanford Presenteeism Scale (SPS) showed strong positive evidence for internal consistency and structural validity, and moderate positive evidence for hypotheses testing and criterion validity.
The Productivity and Disease Questionnaire (PRODISQ) yielded strong positive evidence for content validity; evidence for other properties is lacking. The other instruments resulted in mostly fair-to-poor quality ratings with limited evidence. Conclusions: Decisions based on the content of the instrument, usage purpose, target country and population, and available evidence are recommended. Until high-quality studies are in place to accurately assess the measurement properties of the currently available instruments, the WLQ and, in a Dutch context, the PRODISQ are cautiously preferred based on their strong positive evidence for content validity. Based on its strong positive evidence for internal consistency and structural validity, the SPS is cautiously recommended. PMID:24495301
Zendjidjian, X Y; Auquier, P; Lançon, C; Loundou, A; Parola, N; Faugère, M; Boyer, L
2015-01-01
The aim of our study was to develop a specific French self-administered instrument for measuring hospitalized patients' satisfaction in psychiatry based exclusively on the patient's point of view: the SATISPSY-22. The development of the SATISPSY was undertaken in three steps: item generation, item reduction, and validation. The content of the SATISPSY was derived from 80 interviews with patients hospitalized in psychiatry. Using item response and classical test theories, item reduction was performed in 2 hospitals on 270 responders. The validation was based on construct validity, reliability, and some aspects of external validity. The SATISPSY contains 22 items describing 6 dimensions (staff, quality of care, personal experience, information, activity, and food). The six-factor structure accounted for 78.0% of the total variance. Each item achieved the 0.40 standard for item-internal consistency, and the Cronbach's alpha coefficients were > 0.70. Scores of dimensions were strongly positively correlated with Visual Analogue Scale scores. Significant associations with socioeconomic and clinical indicators showed good discriminant and external validity. INFIT statistics ranged from 0.71 to 1.25. The SATISPSY-22 presents satisfactory psychometric properties, enabling patient feedback to be incorporated in a continuous quality health care improvement strategy. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
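The internal-consistency criterion mentioned above (Cronbach's alpha > 0.70) can be illustrated with a short sketch of the standard formula, alpha = k/(k-1) × (1 - Σ item variances / variance of total score). The item scores below are invented for illustration and are not SATISPSY data; the function name is likewise an assumption.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns.

    `items` holds one list per item; each list has one score per respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items need the same number of respondents")
    # Total score per respondent, summed across items.
    totals = [sum(col[r] for col in items) for r in range(n)]
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1.0 - item_var_sum / variance(totals))


# Hypothetical 5-point ratings from six respondents on three items.
example_items = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 2, 5, 5],
    [3, 5, 2, 2, 4, 4],
]
alpha = cronbach_alpha(example_items)
print(f"alpha = {alpha:.2f}; meets 0.70 threshold: {alpha > 0.70}")
```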
Shen, Jenny I; Lum, Erik L; Chang, Tara I
2016-09-01
Because large randomized clinical trials (RCTs) in dialysis have been relatively scarce, evidence-based dialysis care has depended heavily on the results of observational studies. However, when results from RCTs appear to contradict the findings of observational studies, nephrologists are left to wonder which type of study they should believe. In this editorial, we explore the key differences between observational studies and RCTs in the context of such seemingly conflicting studies in dialysis. Confounding is the major limitation of observational studies, whereas low statistical power and problems with external validity are more likely to limit the findings of RCTs. Differences in the specification of the population, exposure, and outcomes can also contribute to different results among RCTs and observational studies. Rigorous methods are required regardless of what type of study is conducted, and readers should not automatically assume that one type of study design is superior to the other. Ultimately, dialysis care requires both well-designed, well-conducted observational studies and RCTs to move the field forward. © 2016 Wiley Periodicals, Inc.
Shen, Jenny I.; Lum, Erik L.; Chang, Tara I.
2016-01-01
Because large randomized clinical trials (RCTs) in dialysis have been relatively scarce, evidence-based dialysis care has depended heavily on the results of observational studies. However, when results from RCTs appear to contradict the findings of observational studies, nephrologists are left to wonder which type of study they should believe. In this editorial we explore the key differences between observational studies and RCTs in the context of such seemingly conflicting studies in dialysis. Confounding is the major limitation of observational studies, while low statistical power and problems with external validity are more likely to limit the findings of RCTs. Differences in the specification of the population, exposure, and outcomes can also contribute to different results among RCTs and observational studies. Rigorous methods are required regardless of what type of study is conducted, and readers should not automatically assume that one type of study design is superior to the other. Ultimately, dialysis care requires both well-designed, well-conducted observational studies and RCTs to move the field forward. PMID:27207819
Methodology, Methods, and Metrics for Testing and Evaluating Augmented Cognition Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greitzer, Frank L.
The augmented cognition research community seeks cognitive neuroscience-based solutions to improve warfighter performance by applying and managing mitigation strategies to reduce workload and improve the throughput and quality of decisions. The focus of augmented cognition mitigation research is to define, demonstrate, and exploit neuroscience and behavioral measures that support inferences about the warfighter’s cognitive state that prescribe the nature and timing of mitigation. A research challenge is to develop valid evaluation methodologies, metrics and measures to assess the impact of augmented cognition mitigations. Two considerations are external validity, which is the extent to which the results apply to operational contexts; and internal validity, which reflects the reliability of performance measures and the conclusions based on analysis of results. The scientific rigor of the research methodology employed in conducting empirical investigations largely affects the validity of the findings. External validity requirements also compel us to demonstrate operational significance of mitigations. Thus it is important to demonstrate effectiveness of mitigations under specific conditions. This chapter reviews some cognitive science and methodological considerations in designing augmented cognition research studies and associated human performance metrics and analysis methods to assess the impact of augmented cognition mitigations.
Validation of multisource electronic health record data: an application to blood transfusion data.
Hoeven, Loan R van; Bruijne, Martine C de; Kemper, Peter F; Koopman, Maria M W; Rondeel, Jan M M; Leyte, Anja; Koffijberg, Hendrik; Janssen, Mart P; Roes, Kit C B
2017-07-14
Although data from electronic health records (EHR) are often used for research purposes, systematic validation of these data prior to their use is not standard practice. Existing validation frameworks discuss validity concepts without translating these into practical implementation steps or addressing the potential influence of linking multiple sources. Therefore, we developed a practical approach for validating routinely collected data from multiple sources and applied it to a blood transfusion data warehouse to evaluate its usability in practice. The approach consists of identifying existing validation frameworks for EHR data or linked data, selecting validity concepts from these frameworks and establishing quantifiable validity outcomes for each concept. The approach distinguishes external validation concepts (e.g. concordance with external reports, previous literature and expert feedback) and internal consistency concepts which use expected associations within the dataset itself (e.g. completeness, uniformity and plausibility). In an example case, the selected concepts were applied to a transfusion dataset and specified in more detail. Application of the approach to a transfusion dataset resulted in a structured overview of data validity aspects. This allowed improvement of these aspects through further processing of the data and in some cases adjustment of the data extraction. For example, the proportion of transfused products that could not be linked to the corresponding issued products initially was 2.2% but could be improved by adjusting data extraction criteria to 0.17%. This stepwise approach for validating linked multisource data provides a basis for evaluating data quality and enhancing interpretation. When the process of data validation is adopted more broadly, this contributes to increased transparency and greater reliability of research based on routinely collected electronic health records.
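The internal consistency concepts named above (completeness, plausibility, and the share of transfused products that cannot be linked to issued products) lend themselves to simple quantifiable checks. The sketch below is one possible way to express them; the record layout, field names and thresholds are hypothetical and are not taken from the study's data warehouse.

```python
def completeness(records, field):
    """Fraction of records with a non-missing value for `field`."""
    if not records:
        raise ValueError("no records supplied")
    present = sum(1 for r in records if r.get(field) not in (None, ""))
    return present / len(records)


def unlinked_proportion(transfused_ids, issued_ids):
    """Fraction of transfused product IDs with no matching issued product ID."""
    transfused = list(transfused_ids)
    if not transfused:
        raise ValueError("no transfused products supplied")
    issued = set(issued_ids)
    return sum(1 for t in transfused if t not in issued) / len(transfused)


def implausible(records, field, low, high):
    """Records whose numeric `field` is present but lies outside [low, high]."""
    return [r for r in records
            if r.get(field) is not None and not (low <= r[field] <= high)]


# Hypothetical example records (field names are illustrative assumptions).
records = [
    {"product_id": "P1", "hb_mmol_l": 5.1},
    {"product_id": "P2", "hb_mmol_l": None},
    {"product_id": "P3", "hb_mmol_l": 42.0},  # implausibly high value
    {"product_id": "P4", "hb_mmol_l": 6.3},
]
issued = ["P1", "P2", "P4"]

print("completeness:", completeness(records, "hb_mmol_l"))
print("unlinked:", unlinked_proportion((r["product_id"] for r in records), issued))
print("implausible:", implausible(records, "hb_mmol_l", 1.0, 15.0))
```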
Tisch, Anita
2015-03-01
In the examination of older employees' employability, one can distinguish between internal and external employability. Internal employability can be measured by individual employment stability, and external employability occurs when employees replace one employment relationship with another. Most studies focus on the personal skills and characteristics that are necessary to maintain employability. However, external factors also contribute to individual employability. Therefore, this study examines which organisational attributes of firms contribute to older employees' employability in Germany. Taking firm and individual characteristics into account, the results of discrete-time survival models show that in specific organisational structures, older employees have higher internal employability. Accordingly, older employees are more likely to maintain employment in the service sector and in recruiting organisations facing (skilled) labour shortages. However, the results also indicate that financially investing organisations facilitate early labour market exits. With regard to older employees' external employability, the results show only little evidence indicating an association between organisational attributes of firms and the likelihood of job change.
DeCruze, B; Guthrie, D
1999-01-01
Poor prognosis (poorly differentiated and/or deep myometrial invasion) Stage I endometrial cancer can have a relapse rate as high as 50%. Traditionally, most clinical oncologists treat these patients with external beam radiotherapy after surgery but there is no evidence to show that this improves survival. The retrospective study looks at the results of not giving external beam radiotherapy in 25 consecutive patients and compares the results with a group of 13 consecutive patients who did have such treatment. The two groups were comparable with regard to age, degree of differentiation and degree of invasion. Survival was comparable in the two groups. There is no evidence of any obvious decrease in survival from withholding external beam radiotherapy, but this was not a prospective randomized controlled trial. This study illustrates that it is essential that the Medical Research Council ASTEC trial should be supported because this will determine the true place of external beam radiotherapy in such patients.
External fixators in the treatment of midshaft clavicle non-unions: a systematic review.
Barlow, Tim; Upadhyay, Piyush; Barlow, David
2014-02-01
Non- or mal-union of the clavicle is reported to occur in up to 15% of conservatively treated fractures. The purpose of this systematic review is to examine the evidence for the use of external fixation in the treatment of clavicular non-union. We performed a search of MEDLINE and Embase, including all papers using external fixators for the treatment of clavicular non-union. Four papers satisfied our eligibility criteria: three case series and one case-control study. Level of evidence and quality assessment scoring were performed using published methods. Due to the heterogeneity of the study populations and interventions, no attempt at meta-analysis was made. External fixation in hypertrophic non-union of the clavicle, but not atrophic non-union, appears to be a reasonable treatment option. A pragmatic, multicentre, randomised controlled trial comparing external fixation and open reduction internal fixation in the treatment of hypertrophic non-union of the clavicle would be valuable.
Guidance for updating clinical practice guidelines: a systematic review of methodological handbooks.
Vernooij, Robin W M; Sanabria, Andrea Juliana; Solà, Ivan; Alonso-Coello, Pablo; Martínez García, Laura
2014-01-02
Updating clinical practice guidelines (CPGs) is a crucial process for maintaining the validity of recommendations. Methodological handbooks should provide guidance on both developing and updating CPGs. However, little is known about the updating guidance provided by these handbooks. We conducted a systematic review to identify and describe the updating guidance provided by CPG methodological handbooks and included handbooks that provide updating guidance for CPGs. We searched in the Guidelines International Network library, US National Guidelines Clearinghouse and MEDLINE (PubMed) from 1966 to September 2013. Two authors independently selected the handbooks and extracted the data. We used descriptive statistics to analyze the extracted data and conducted a narrative synthesis. We included 35 handbooks. Most handbooks (97.1%) focus mainly on developing CPGs, including variable degrees of information about updating. Guidance on identifying new evidence and the methodology of assessing the need for an update is described in 11 (31.4%) and eight handbooks (22.8%), respectively. The period of time between two updates is described in 25 handbooks (71.4%), two to three years being the most frequent (40.0%). The majority of handbooks do not provide guidance for the literature search, evidence selection, assessment, synthesis, and external review of the updating process. Guidance for updating CPGs is poorly described in methodological handbooks. This guidance should be more rigorous and explicit. This could lead to a more optimal updating process and, ultimately, to valid, trustworthy guidelines.
Rolston, John D; Han, Seunggu J; Chang, Edward F
2017-03-01
The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) provides a rich database of North American surgical procedures and their complications. Yet no external source has validated the accuracy of the information within this database. Using records from the 2006 to 2013 NSQIP database, we used two methods to identify errors: (1) mismatches between the Current Procedural Terminology (CPT) code that was used to identify the surgical procedure, and the International Classification of Diseases (ICD-9) post-operative diagnosis: i.e., a diagnosis that is incompatible with a certain procedure. (2) Primary anesthetic and CPT code mismatching: i.e., anesthesia not indicated for a particular procedure. Analyzing data for movement disorders, epilepsy, and tumor resection, we found evidence of CPT code and postoperative diagnosis mismatches in 0.4-100% of cases, depending on the CPT code examined. When analyzing anesthetic data from brain tumor, epilepsy, trauma, and spine surgery, we found evidence of miscoded anesthesia in 0.1-0.8% of cases. National databases like NSQIP are an important tool for quality improvement. Yet all databases are subject to errors, and measures of internal consistency show that errors affect up to 100% of case records for certain procedures in NSQIP. Steps should be taken to improve data collection on the frontend of NSQIP, and also to ensure that future studies with NSQIP take steps to exclude erroneous cases from analysis. Copyright © 2016 Elsevier Ltd. All rights reserved.
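The first error-detection method described above, flagging cases whose post-operative diagnosis is incompatible with the recorded procedure code, amounts to a lookup against a table of compatible pairs. The sketch below shows that idea only; the codes "PROC_A", "DX_1" and so on are placeholders, not real CPT or ICD-9 codes, and the compatibility table is a hypothetical stand-in for whatever clinical mapping the authors used.

```python
def mismatch_rates(cases, compatible):
    """Per procedure code, the fraction of cases with an incompatible diagnosis.

    `cases` is an iterable of (procedure_code, diagnosis_code) pairs.
    `compatible` maps each procedure code to the set of acceptable diagnoses.
    """
    totals, mismatched = {}, {}
    for proc, dx in cases:
        totals[proc] = totals.get(proc, 0) + 1
        if dx not in compatible.get(proc, set()):
            mismatched[proc] = mismatched.get(proc, 0) + 1
    return {proc: mismatched.get(proc, 0) / totals[proc] for proc in totals}


# Placeholder codes for illustration only.
compatible = {"PROC_A": {"DX_1", "DX_2"}, "PROC_B": {"DX_3"}}
cases = [
    ("PROC_A", "DX_1"),
    ("PROC_A", "DX_3"),  # diagnosis not compatible with PROC_A
    ("PROC_A", "DX_2"),
    ("PROC_A", "DX_2"),
    ("PROC_B", "DX_1"),  # diagnosis not compatible with PROC_B
]

for proc, rate in sorted(mismatch_rates(cases, compatible).items()):
    print(f"{proc}: {rate:.0%} of cases mismatched")
```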
Teixidó, Mercè; Pallejà, Tomàs; Font, Davinia; Tresanchez, Marcel; Moreno, Javier; Palacín, Jordi
2012-11-28
This paper presents the use of an external fixed two-dimensional laser scanner to detect cylindrical targets attached to moving devices, such as a mobile robot. This proposal is based on the detection of circular markers in the raw data provided by the laser scanner by applying an algorithm for outlier avoidance and a least-squares circular fitting. Some experiments have been developed to empirically validate the proposal with different cylindrical targets in order to estimate the location and tracking errors achieved, which are generally less than 20 mm in the area covered by the laser sensor. As a result of the validation experiments, several error maps have been obtained in order to give an estimate of the uncertainty of any location computed. This proposal has been validated with a medium-sized mobile robot with an attached cylindrical target (diameter 200 mm). The trajectory of the mobile robot was estimated with an average location error of less than 15 mm, and the real location error in each individual circular fitting was similar to the error estimated with the obtained error maps. The radial area covered in this validation experiment was up to 10 m, a value that depends on the radius of the cylindrical target and the radial density of the distance range points provided by the laser scanner, but this area can be increased by combining the information of additional external laser scanners.
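The abstract does not say which least-squares circle fit the authors used. As one common option, the sketch below implements an algebraic (Kåsa-style) fit, which solves a small linear system for the circle x² + y² + Dx + Ey + F = 0; it is a swapped-in illustration, not the paper's algorithm, and it omits the outlier-avoidance step. The test arc mimics the partial view a scanner has of a 200 mm diameter cylinder, with invented coordinates.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set (collinear or too few points)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_circle(points):
    """Algebraic least-squares circle fit; returns (center_x, center_y, radius)."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    # Centre the data first to keep the normal equations well conditioned.
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    u = [p[0] - mx for p in points]
    v = [p[1] - my for p in points]
    z = [ui * ui + vi * vi for ui, vi in zip(u, v)]
    a = [
        [sum(ui * ui for ui in u), sum(ui * vi for ui, vi in zip(u, v)), sum(u)],
        [sum(ui * vi for ui, vi in zip(u, v)), sum(vi * vi for vi in v), sum(v)],
        [sum(u), sum(v), float(n)],
    ]
    rhs = [
        -sum(ui * zi for ui, zi in zip(u, z)),
        -sum(vi * zi for vi, zi in zip(v, z)),
        -sum(z),
    ]
    d, e, f = solve_linear(a, rhs)
    cu, cv = -d / 2.0, -e / 2.0
    radius = math.sqrt(cu * cu + cv * cv - f)
    return mx + cu, my + cv, radius


# Noise-free half arc of a 0.1 m radius target centred at (2.0, -1.0) metres.
arc = [(2.0 + 0.1 * math.cos(t), -1.0 + 0.1 * math.sin(t))
       for t in (math.pi * (0.5 + k / 29.0) for k in range(30))]
cx, cy, r = fit_circle(arc)
print(f"center = ({cx:.4f}, {cy:.4f}) m, radius = {r:.4f} m")
```

The algebraic fit is fast and needs no initial guess, which suits per-scan tracking, but it is known to underestimate the radius on short, noisy arcs; a geometric refinement would be a natural next step.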
Development and External Validation of a Prognostic Nomogram for Metastatic Uveal Melanoma
Valpione, Sara; Moser, Justin C.; Parrozzani, Raffaele; Bazzi, Marco; Mansfield, Aaron S.; Mocellin, Simone; Pigozzo, Jacopo; Midena, Edoardo; Markovic, Svetomir N.; Aliberti, Camillo; Campana, Luca G.; Chiarion-Sileni, Vanna
2015-01-01
Background Approximately 50% of patients with uveal melanoma (UM) will develop metastatic disease, usually involving the liver. The outcome of metastatic UM (mUM) is generally poor and no standard therapy has been established. Additionally, clinicians lack a validated prognostic tool to evaluate these patients. The aim of this work was to develop a reliable prognostic nomogram for clinicians. Patients and Methods Two cohorts of mUM patients, from Veneto Oncology Institute (IOV) (N=152) and Mayo Clinic (MC) (N=102), were analyzed to develop and externally validate a prognostic nomogram. Results The median survival of mUM was 17.2 months in the IOV cohort and 19.7 in the MC cohort. Percentage of liver involvement (HR 1.6), elevated levels of serum LDH (HR 1.6), and a WHO performance status=1 (HR 1.5) or 2–3 (HR 4.6) were associated with worse prognosis. Longer disease-free interval from diagnosis of UM to that of mUM conferred a survival advantage (HR 0.9). The nomogram had a concordance probability of 0.75 (SE .006) in the development dataset (IOV), and 0.80 (SE .009) in the external validation (MC). Nomogram predictions were well calibrated. Conclusions The nomogram, which includes percentage of liver involvement, LDH levels, WHO performance status and disease-free interval, accurately predicts the prognosis of mUM and could be useful for decision-making and risk stratification for clinical trials. PMID:25780931
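The concordance values reported above (0.75 and 0.80) measure how often a model ranks patients' risks in the same order as their observed survival. The abstract does not state which estimator was used, so the sketch below shows Harrell's C-index as a closely related, widely used statistic rather than the authors' exact calculation. The survival times, event flags and risk scores are invented.

```python
def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event and did so
    before subject j's observed time. The pair is concordant when the
    earlier-failing subject has the higher predicted risk; tied risks
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if i != j and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up in months, event indicator (1 = died), and risk score.
times = [5, 10, 15, 20, 25]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.5, 0.3, 0.1]
print(f"C-index = {harrell_c(times, events, risks):.2f}")
```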
[Study on Accurately Controlling Discharge Energy Method Used in External Defibrillator].
Song, Biao; Wang, Jianfei; Jin, Lian; Wu, Xiaomei
2016-01-01
This paper introduces a new method which controls discharge energy accurately. It is achieved by calculating target voltage based on transthoracic impedance and accurately controlling charging voltage and discharge pulse width. A new defibrillator is designed and programmed using this method. The test results show that this method is valid and applicable to all kinds of external defibrillators.
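The abstract gives no formulas, so the following is only a sketch of how a target charging voltage could be derived from transthoracic impedance and pulse width under a simplified model: a capacitor C discharging through a purely resistive load R for a truncated pulse of width T delivers E = ½·C·V₀²·(1 − e^(−2T/(RC))). The monophasic truncated-exponential waveform, the neglect of internal resistance, and all numeric values are assumptions for illustration, not the authors' design.

```python
import math


def delivered_energy(v0, capacitance_f, impedance_ohm, pulse_width_s):
    """Energy (J) delivered to a resistive load by a truncated RC discharge."""
    tau = impedance_ohm * capacitance_f
    return 0.5 * capacitance_f * v0 ** 2 * (1.0 - math.exp(-2.0 * pulse_width_s / tau))


def target_voltage(energy_j, capacitance_f, impedance_ohm, pulse_width_s):
    """Initial capacitor voltage (V) needed to deliver `energy_j` in the pulse."""
    if min(energy_j, capacitance_f, impedance_ohm, pulse_width_s) <= 0:
        raise ValueError("all parameters must be positive")
    tau = impedance_ohm * capacitance_f
    fraction = 1.0 - math.exp(-2.0 * pulse_width_s / tau)
    return math.sqrt(2.0 * energy_j / (capacitance_f * fraction))


# Assumed example values: 100 uF capacitor, 10 ms pulse, 200 J target energy.
CAPACITANCE_F = 100e-6
PULSE_WIDTH_S = 10e-3
for impedance in (25.0, 50.0, 100.0, 150.0):
    v0 = target_voltage(200.0, CAPACITANCE_F, impedance, PULSE_WIDTH_S)
    print(f"{impedance:5.0f} ohm -> charge to about {v0:.0f} V")
```

Under this model a higher impedance lengthens the time constant, so less of the stored energy is delivered within a fixed pulse width and the required charging voltage rises; adjusting the pulse width is the other lever the abstract mentions.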
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Swogger, Emily D.; Schatschneider, Christopher; Menzies, Holly Mariah; Sanchez, Jeremy
2015-01-01
We report findings of a convergent validity study examining the internalizing subscale (SRSS-I5) of the newly adapted Student Risk Screening Scale for Internalizing and Externalizing (SRSS-IE12) with the internalizing subscale of the Teacher Report Form (TRF; Achenbach, 1991) conducted in 13 schools across three states with 195 kindergarten…
ERIC Educational Resources Information Center
De Meyer, Jotie; Soenens, Bart; Aelterman, Nathalie; De Bourdeaudhuij, Ilse; Haerens, Leen
2016-01-01
Background: In Self-Determination Theory (SDT), a well-validated macro-theory on human motivation, a distinction is made between internally controlling teaching practices (e.g. guilt-induction and shaming) and externally controlling practices (e.g. threats and punishments, commands). While both practices are said to undermine students' motivation,…
Theory of Self- vs. Externally-Regulated Learning™: Fundamentals, Evidence, and Applicability
de la Fuente-Arias, Jesús
2017-01-01
The Theory of Self- vs. Externally-Regulated Learning™ has integrated the variables of SRL theory, the DEDEPRO model, and the 3P model. This new Theory has proposed: (a) in general, the importance of the cyclical model of individual self-regulation (SR) and of external regulation stemming from the context (ER), as two different and complementary variables, both in combination and in interaction; (b) specifically, in the teaching-learning context, the relevance of different types of combinations between levels of self-regulation (SR) and of external regulation (ER) in the prediction of self-regulated learning (SRL), and of cognitive-emotional achievement. This review analyzes the assumptions, conceptual elements, empirical evidence, benefits and limitations of SRL vs. ERL Theory. Finally, professional fields of application and future lines of research are suggested. PMID:29033872
NASA Astrophysics Data System (ADS)
Seregni, M.; Cerveri, P.; Riboldi, M.; Pella, A.; Baroni, G.
2012-11-01
In radiotherapy, organ motion mitigation by means of dynamic tumor tracking requires continuous information about the internal tumor position, which can be estimated relying on external/internal correlation models as a function of external surface surrogates. In this work, we propose a validation of a time-independent artificial neural networks-based tumor tracking method in the presence of changes in the breathing pattern, evaluating the performance on two datasets. First, simulated breathing motion traces were specifically generated to include gradually increasing respiratory irregularities. Then, seven publicly available human liver motion traces were analyzed for the assessment of tracking accuracy, whose sensitivity with respect to the structural parameters of the model was also investigated. Results on simulated data showed that the proposed method was not affected by hysteretic target trajectories and it was able to cope with different respiratory irregularities, such as baseline drift and internal/external phase shift. The analysis of the liver motion traces reported an average RMS error equal to 1.10 mm, with five out of seven cases below 1 mm. In conclusion, this validation study proved that the proposed method is able to deal with respiratory irregularities both in controlled and real conditions.
External gear pumps operating with non-Newtonian fluids: Modelling and experimental validation
NASA Astrophysics Data System (ADS)
Rituraj, Fnu; Vacca, Andrea
2018-06-01
External Gear Pumps are used in various industries to pump non-Newtonian viscoelastic fluids like plastics, paints, inks, etc. For both design and analysis purposes, it is often of interest to understand the features of the displacing action realized by the meshing of the gears and to describe the behavior of the leakages for this kind of pump. However, very little work can be found in the literature on methodologies suitable for modelling such phenomena. This article describes a technique for modelling external gear pumps that operate with non-Newtonian fluids. In particular, it explains how the displacing action of the unit can be modelled using a lumped parameter approach, which involves dividing the fluid domain into several control volumes and internal flow connections. This work is built upon the HYGESim simulation tool, conceived by the authors' research team in the last decade, which is for the first time extended to the simulation of non-Newtonian fluids. The article also describes several comparisons between simulation results and experimental data obtained from numerous experiments performed to validate the presented methodology. Finally, operation of an external gear pump with fluids having different viscosity characteristics is discussed.
External Validity of Contingent Valuation: Comparing Hypothetical and Actual Payments.
Ryan, Mandy; Mentzakis, Emmanouil; Jareinpituk, Suthi; Cairns, John
2017-11-01
Whilst contingent valuation is increasingly used in economics to value benefits, questions remain concerning its external validity: that is, do hypothetical responses match actual responses? We present results from the first within sample field test. Whilst Hypothetical No is always an Actual No, Hypothetical Yes responses exceed Actual Yes responses. A constant rate of response reversals across bids/prices could suggest theoretically consistent option value responses. Certainty calibrations (verbal and numerical response scales) minimise hypothetical-actual discrepancies, offering a useful solution. Helping respondents resolve uncertainty may reduce the discrepancy between hypothetical and actual payments and thus lead to more accurate policy recommendations. Copyright © 2016 John Wiley & Sons, Ltd.
External validation of scoring instruments for evaluating pediatric resuscitation.
Levy, Arielle; Donoghue, Aaron; Bailey, Benoit; Thompson, Nathan; Jamoulle, Olivier; Gagnon, Robert; Gravel, Jocelyn
2014-12-01
Although many methods have been proposed to assess clinical performance during resuscitation, robust and generalizable metrics are still lacking. Further research is necessary to develop validated clinical performance assessment tools and show an improvement in outcomes after training. We aimed to establish evidence for validity of a previously published scoring instrument--the Clinical Performance Tool (CPT)--designed to evaluate clinical performance during simulated pediatric resuscitations. This was a prospective experimental trial performed in the simulation laboratory of a pediatric tertiary care facility, with a pretest/posttest design that assessed residents before and after pediatric advanced life support (PALS) certification. Thirteen postgraduate year 1 (PGY1) and 11 PGY3 pediatric residents completed 5 simulated pediatric resuscitation scenarios each during 2 consecutive sessions; between the 2 sessions, they completed a full PALS certification course. All sessions were video recorded. Sessions were scored by raters using the CPT; total scores were expressed as a percentage of maximum points possible for each scenario. Validity evidence was established and interpreted according to Messick's framework. Evidence regarding relations to other variables was assessed by calculating differences in scores between pre-PALS and post-PALS certification and PGY1 and PGY3 using a repeated-measures analysis of variance test. Internal structure evidence was established by assessing interrater reliability using intraclass correlation coefficients (ICCs) for each scenario, a G-study, and a variance component analysis of individual measurement facets (scenarios, raters, and occasions) and associated interactions. Overall scores for the entire study cohort improved by 10% after PALS training. 
Scores improved by 9.9% (95% confidence interval [CI], 4.5-15.4) for the pulseless nonshockable arrest (ICC, 0.85; 95% CI, 0.74-0.92), 14.6% (95% CI, 6.7-22.4) for the pulseless shockable arrest (ICC, 0.98; 95% CI, 0.96-0.99), 4.1% (95% CI, -4.5 to 12.8) for the dysrhythmias (ICC, 0.92; 95% CI, 0.87-0.96), 18.4% (95% CI, 9.7-27.1) for the respiratory scenario (ICC, 0.97; 95% CI, 0.95-0.98), and 5.3% (95% CI, -1.4 to 2.0) for the shock scenarios (ICC, 0.94; 95% CI, 0.90-0.97). There were no differences between PGY1 and PGY3 scores before or after the PALS course. Reliability of the instrument was acceptable as demonstrated by a mean ICC of 0.95 (95% CI, 0.94-0.96). The G-study coefficient was 0.94. Most variance could be attributed to the subject (57%). Interactions between subject and scenario and subject and occasion were 9.9% and 1.4%, respectively, and variance attributable to rater was minimal (0%). Pediatric residents improved scores on CPT after completion of a PALS course. Clinical Performance Tool scores are sensitive to the increase in skills and knowledge resulting from such a course but not to learners' levels. Validity evidence from scores for the CPT confirms implementation in new contexts and partially supports internal structure. More evidence is required to further support internal structure and especially to support relations with other variables and consequence evidence. Additional modifications should be made to the CPT before considering its use for high-stakes certification such as PALS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, X; Wang, J; Hu, W
Purpose: The Varian RapidPlan™ is a commercial knowledge-based optimization process which uses a set of clinically used treatment plans to train a model that can predict individualized dose-volume objectives. The purpose of this study is to evaluate the performance of RapidPlan to generate intensity modulated radiation therapy (IMRT) plans for cervical cancer. Methods: A total of 70 IMRT plans for cervical cancer with varying clinical and physiological indications were enrolled in this study. These patients were all previously treated in our institution. Two prescription levels were usually used in our institution: 45Gy/25 fractions and 50.4Gy/28 fractions. 50 of these plans were selected to train the RapidPlan model for predicting dose-volume constraints. After model training, the model was validated with 10 plans from the training pool (internal validation) and 20 additional new plans (external validation). All plans used for the validation were re-optimized with the original beam configuration, and the priorities generated by RapidPlan were manually adjusted to ensure that the re-optimized DVH lay within the range of the model prediction. DVH quantitative analysis was performed to compare the RapidPlan-generated and the original manually optimized plans. Results: For all the validation cases, RapidPlan-based plans showed similar or superior results compared to the manually optimized ones. RapidPlan improved D98% and homogeneity in both validations. For organs at risk, RapidPlan decreased the mean dose to the bladder by 1.25Gy/1.13Gy (internal/external validation) on average, with p=0.12/p<0.01. The mean doses to the rectum and bowel were also decreased by an average of 2.64Gy/0.83Gy and 0.66Gy/1.05Gy, with p<0.01/p<0.01 and p=0.04/p<0.01 for the internal/external validation, respectively. Conclusion: The RapidPlan model-based cervical cancer plans show the ability to systematically improve IMRT plan quality.
This suggests that RapidPlan has great potential to make the treatment planning process more efficient.
Psallidas, Ioannis; Kanellakis, Nikolaos I; Gerry, Stephen; Thézénas, Marie Laëtitia; Charles, Philip D; Samsonova, Anastasia; Schiller, Herbert B; Fischer, Roman; Asciak, Rachelle; Hallifax, Robert J; Mercer, Rachel; Dobson, Melissa; Dong, Tao; Pavord, Ian D; Collins, Gary S; Kessler, Benedikt M; Pass, Harvey I; Maskell, Nick; Stathopoulos, Georgios T; Rahman, Najib M
2018-06-13
The prevalence of malignant pleural effusion is increasing worldwide, but prognostic biomarkers to plan treatment and to understand the underlying mechanisms of disease progression remain unidentified. The PROMISE study was designed with the objectives to discover, validate, and prospectively assess biomarkers of survival and pleurodesis response in malignant pleural effusion and build a score that predicts survival. In this multicohort study, we used five separate and independent datasets from randomised controlled trials to investigate potential biomarkers of survival and pleurodesis. Mass spectrometry-based discovery was used to investigate pleural fluid samples for differential protein expression in patients from the discovery group with different survival and pleurodesis outcomes. Clinical, radiological, and biological variables were entered into least absolute shrinkage and selection operator regression to build a model that predicts 3-month mortality. We evaluated the model using internal and external validation. 17 biomarker candidates of survival and seven of pleurodesis were identified in the discovery dataset. Three independent datasets (n=502) were used for biomarker validation. All pleurodesis biomarkers failed, and gelsolin, macrophage migration inhibitory factor, versican, and tissue inhibitor of metalloproteinases 1 (TIMP1) emerged as accurate predictors of survival. Eight variables (haemoglobin, C-reactive protein, white blood cell count, Eastern Cooperative Oncology Group performance status, cancer type, pleural fluid TIMP1 concentrations, and previous chemotherapy or radiotherapy) were validated and used to develop a survival score. Internal validation with bootstrap resampling and external validation with 162 patients from two independent datasets showed good discrimination (C statistic values of 0·78 [95% CI 0·72-0·83] for internal validation and 0·89 [0·84-0·93] for external validation of the clinical PROMISE score). 
To our knowledge, the PROMISE score is the first prospectively validated prognostic model for malignant pleural effusion that combines biological and clinical parameters to accurately estimate 3-month mortality. It is a robust, clinically relevant prognostic score that can be applied immediately, provide important information on patient prognosis, and guide the selection of appropriate management strategies. European Respiratory Society, Medical Research Funding-University of Oxford, Slater & Gordon Research Fund, and Oxfordshire Health Services Research Committee Research Grants. Copyright © 2018 Elsevier Ltd. All rights reserved.
Akiyoshi, Takashi; Maeda, Hiromichi; Kashiwabara, Kosuke; Kanda, Mitsuro; Mayanagi, Shuhei; Aoyama, Toru; Hamada, Chikuma; Sadahiro, Sotaro; Fukunaga, Yosuke; Ueno, Masashi; Sakamoto, Junichi; Saji, Shigetoyo; Yoshikawa, Takaki
2017-01-01
Background: Few prediction models have so far been developed and assessed for the prognosis of patients who undergo curative resection for colorectal cancer (CRC). Materials and Methods: We prepared a clinical dataset including 5,530 patients who participated in three major randomized controlled trials as a training dataset and 2,263 consecutive patients who were treated at a cancer-specialized hospital as a validation dataset. All subjects underwent radical resection for CRC that was histologically diagnosed as adenocarcinoma. The main predicted outcomes were overall survival (OS) and disease-free survival (DFS). The variables in this nomogram were identified by Cox regression analysis, and the model performance was evaluated by Harrell's c-index. The calibration plot and its slope were also studied. For the external validation assessment, risk group stratification was employed. Results: The multivariate Cox model identified the following variables: sex, age, pathological T and N factor, tumor location, size, lymph node dissection, postoperative complications and adjuvant chemotherapy. The c-index was 0.72 (95% confidence interval [CI] 0.66-0.77) for the OS and 0.74 (95% CI 0.69-0.78) for the DFS. The proposed stratification into risk groups demonstrated a significant distinction between the Kaplan–Meier curves for OS and DFS in the external validation dataset. Conclusions: We established a clinically reliable nomogram to predict the OS and DFS in patients with CRC using large-scale and reliable independent patient data from phase III randomized controlled trials. The external validity was also confirmed on the practical dataset. PMID:29228760
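The model above is evaluated with Harrell's c-index. The following is a minimal sketch of that statistic for right-censored survival data, assuming the simplest pairwise definition (a pair is usable when the subject with the shorter time had an observed event; tied times are ignored). The function name and the sample values are hypothetical and are not taken from the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data.

    times:       follow-up time per subject
    events:      1 if the event was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk

    Returns the fraction of usable pairs in which the subject with the
    shorter observed survival also has the higher risk score; tied risk
    scores count as half-concordant.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical cohort (illustrative values only).
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]
c = harrell_c_index(times, events, risk_scores)
print(c)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. Production analyses would normally rely on an established survival-analysis package rather than this quadratic-time loop.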
Ali, Syed F; Hubert, Gordian J; Switzer, Jeffrey A; Majersik, Jennifer J; Backhaus, Roland; Shepard, L Wylie; Vedala, Kishore; Schwamm, Lee H
2018-03-01
Up to 30% of acute stroke evaluations are deemed stroke mimics, and these are common in telestroke as well. We recently published a risk prediction score for use during telestroke encounters to differentiate stroke mimics from ischemic cerebrovascular disease derived and validated in the Partners TeleStroke Network. Using data from 3 distinct US and European telestroke networks, we sought to externally validate the TeleStroke Mimic (TM) score in a broader population. We evaluated the TM score in 1930 telestroke consults from the University of Utah, Georgia Regents University, and the German TeleMedical Project for Integrative Stroke Care Network. We report the area under the curve in receiver-operating characteristic curve analysis with 95% confidence interval for our previously derived TM score in which lower TM scores correspond with a higher likelihood of being a stroke mimic. Based on final diagnosis at the end of the telestroke consultation, there were 630 of 1930 (32.6%) stroke mimics in the external validation cohort. All 6 variables included in the score were significantly different between patients with ischemic cerebrovascular disease versus stroke mimics. The TM score performed well (area under curve, 0.72; 95% confidence interval, 0.70-0.73; P <0.001), similar to our prior external validation in the Partners National Telestroke Network. The TM score's ability to predict the presence of a stroke mimic during telestroke consultation in these diverse cohorts was similar to its performance in our original cohort. Predictive decision-support tools like the TM score may help highlight key clinical differences between mimics and patients with stroke during complex, time-critical telestroke evaluations. © 2018 American Heart Association, Inc.
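The discrimination figure reported above is an area under the ROC curve. As a minimal sketch, the AUC can be computed as the probability that a randomly chosen case from one group scores higher than a randomly chosen case from the other. Since lower TM scores indicate a higher likelihood of a mimic, the example treats ischemic cerebrovascular disease as the higher-scoring group. The function name and all score values are hypothetical, not data from the study.

```python
def auc(higher_group, lower_group):
    """Area under the ROC curve via pairwise comparison.

    Returns the probability that a randomly chosen score from
    higher_group exceeds a randomly chosen score from lower_group,
    counting ties as one half.
    """
    if not higher_group or not lower_group:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for h in higher_group:
        for l in lower_group:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(higher_group) * len(lower_group))


# Hypothetical scores (illustrative values only).
stroke_scores = [11, 20, 25, 30]
mimic_scores = [5, 10, 12]
print(auc(stroke_scores, mimic_scores))
```

This pairwise form is equivalent to the Mann-Whitney U statistic divided by the number of pairs, so it gives the same value as integrating the empirical ROC curve.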
Sachser, Cedric; Berliner, Lucy; Holt, Tonje; Jensen, Tine K; Jungbluth, Nathaniel; Risch, Elizabeth; Rosner, Rita; Goldbeck, Lutz
2017-03-01
Systematic screening is a powerful means by which children and adolescents with posttraumatic stress symptoms (PTSS) can be detected. Reliable and valid measures based on current diagnostic criteria are needed. This study investigated the internal consistency and construct validity of the Child and Adolescent Trauma Screen (CATS) in three samples of trauma-exposed children in the US (self-reports: n=249; caregiver reports: n=267; pre-school n=190), in Germany (self-reports: n=117; caregiver reports: n=95) and in Norway (self-reports: n=109; caregiver reports: n=62). Internal consistency was calculated using Cronbach's α. Convergent-discriminant validity was investigated using bivariate correlation coefficients with measures of depression, anxiety and externalizing symptoms. CFA was used to investigate the DSM-5 factor structure. In all three language samples, the 20-item symptom score of the self-report and the caregiver report showed good to excellent reliability, with α ranging between .88 and .94. The convergent-discriminant validity pattern showed medium to strong correlations with measures of depression (r = .62 to .82) and anxiety (r = .40 to .77) and low to medium correlations with externalizing symptoms (r = -.15 to .43) within informants in all language versions. Using CFA, the underlying DSM-5 factor structure with four symptom clusters (re-experiencing, avoidance, negative alterations in mood and cognitions, hyperarousal) was supported (n = 475 for self-reports; n = 424 for caregiver reports). The external validation of the CATS with a DSM-5 based semi-structured clinical interview and corresponding determination of cut-points is pending. The CATS has satisfactory psychometric properties. Clinicians may consider the CATS as a screening tool and for symptom monitoring. Copyright © 2016 Elsevier B.V. All rights reserved.
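Internal consistency above is reported as Cronbach's α. A minimal sketch of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores), is given below using sample variances. The function name and the response matrix are hypothetical and unrelated to the CATS data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so each tuple holds all respondents' scores on one item.
    items = list(zip(*responses))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical responses: 4 respondents x 3 items (illustrative values only).
responses = [
    [1, 2, 2],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(alpha)
```

Values closer to 1 indicate that the items vary together, which is what the reported range of .88 to .94 reflects for the 20-item symptom score.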
Brown, Jeremiah R; MacKenzie, Todd A; Maddox, Thomas M; Fly, James; Tsai, Thomas T; Plomondon, Mary E; Nielson, Christopher D; Siew, Edward D; Resnic, Frederic S; Baker, Clifton R; Rumsfeld, John S; Matheny, Michael E
2015-12-11
Acute kidney injury (AKI) occurs frequently after cardiac catheterization and percutaneous coronary intervention. Although a clinical risk model exists for percutaneous coronary intervention, no models exist for both procedures, nor do existing models account for risk factors prior to the index admission. We aimed to develop such a model for use in prospective automated surveillance programs in the Veterans Health Administration. We collected data on all patients undergoing cardiac catheterization or percutaneous coronary intervention in the Veterans Health Administration from January 01, 2009 to September 30, 2013, excluding patients with chronic dialysis, end-stage renal disease, renal transplant, and missing pre- and postprocedural creatinine measurement. We used 4 AKI definitions in model development and included risk factors from up to 1 year prior to the procedure and at presentation. We developed our prediction models for postprocedural AKI using the least absolute shrinkage and selection operator (LASSO) and internally validated using bootstrapping. We developed models using 115 633 angiogram procedures and externally validated using 27 905 procedures from a New England cohort. Models had cross-validated C-statistics of 0.74 (95% CI: 0.74-0.75) for AKI, 0.83 (95% CI: 0.82-0.84) for AKIN2, 0.74 (95% CI: 0.74-0.75) for contrast-induced nephropathy, and 0.89 (95% CI: 0.87-0.90) for dialysis. We developed a robust, externally validated clinical prediction model for AKI following cardiac catheterization or percutaneous coronary intervention to automatically identify high-risk patients before and immediately after a procedure in the Veterans Health Administration. Work is ongoing to incorporate these models into routine clinical practice. © 2015 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley Blackwell.
Epidemiology of bruxism in adults: a systematic review of the literature.
Manfredini, Daniele; Winocur, Ephraim; Guarda-Nardini, Luca; Paesani, Daniel; Lobbezoo, Frank
2013-01-01
To perform a systematic review of the literature dealing with the prevalence of bruxism in adult populations. A systematic search of the medical literature was performed to identify all peer-reviewed English-language papers dealing with the prevalence assessment of either awake or sleep bruxism at the general population level by the adoption of questionnaires, clinical assessments, and polysomnographic (PSG) or electromyographic (EMG) recordings. Quality assessment of the reviewed papers was performed according to the Methodological evaluation of Observational REsearch (MORE) checklist, which enables the identification of flaws in the external and internal validity. Cut-off criteria for an acceptable external validity were established to select studies for the discussion of prevalence data. For each included study, the sample features, diagnostic strategy, and prevalence of bruxism in relation to age, sex, and circadian rhythm, if available, were recorded. Thirty-five publications were included in the review. Several methodological problems limited the external validity of findings in most studies, and prevalence data extraction was performed only on seven papers. Of those, only one paper had flawless external validity, whilst internal validity was low in all the selected papers because bruxism was diagnosed by self-report alone, mainly based on only one or two questionnaire items. No epidemiologic data were available from studies adopting other diagnostic strategies (eg, PSG, EMG). Generically identified "bruxism" was assessed in two studies reporting an 8% to 31.4% prevalence, awake bruxism was investigated in two studies describing a 22.1% to 31% prevalence, and prevalence of sleep bruxism was found to be more consistent across the three studies investigating the report of "frequent" bruxism (12.8% ± 3.1%). Bruxism activities were found to be unrelated to sex, and a decrease with age was described in elderly people. 
The present systematic review described variable prevalence data for bruxism activities. Findings must be interpreted with caution due to the poor methodological quality of the reviewed literature and to potential diagnostic bias related to having to rely on an individual's self-report of bruxism.
Jornet, Núria; Carrasco, Pablo; Beltrán, Mercè; Calvo, Juan Francisco; Escudé, Lluís; Hernández, Victor; Quera, Jaume; Sáez, Jordi
2014-09-01
We performed a multicentre intercomparison of IMRT optimisation and dose planning and IMRT pre-treatment verification methods and results. The aims were to check consistency between dose plans and to validate whether in-house pre-treatment verification results agreed with those of an external audit. Participating centres used two mock cases (prostate and head and neck) for the intercomparison and audit. Compliance with dosimetric goals and the total number of MU per plan were collected. A simple quality index to compare the different plans was proposed. We compared gamma index pass rates obtained using each centre's equipment and methodology to those of an external audit. While for the prostate case all centres fulfilled the dosimetric goals and plan quality was homogeneous, that was not so for the head and neck case. The number of MU did not correlate with the plan quality index. Pre-treatment verification results of the external audit did not agree with those of the in-house measurements for two centres: results were within tolerance for in-house measurements and unacceptable for the audit, or the other way round. Although all plans fulfilled dosimetric constraints, plan quality is highly dependent on the planner's expertise. External audits are an excellent tool to detect errors in IMRT implementation and cannot be replaced by intercomparison using results obtained by centres. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
[Again on language of biology].
Morchio, di Renzo
2004-01-01
Some time ago I proposed in an Editorial in this journal some considerations on the language of biology. I concluded that, to realize an autonomy of such a language (and therefore of biology), we have to develop a valid language for biology. In such a context, it seemed to me that the term "metaphors", applied to the concepts concerning the information carried by the genetic code, was a reasonable one. However, Barbieri's article in this issue of Rivista di Biologia / Biology Forum calls for a reply. Of course, we do not know very much in this field, even if we have some evidence that a sequence of bases on a DNA is not determined only by chance. In any case we can exclude that nature on this occasion has "invented" a code. Nature doesn't "invent" anything: it only follows its rules, which we name "laws of nature". Barbieri quotes the Morse code, but forgets to say that such a code is "conventional" in the sense that it is valid only because it is the result of an "agreement" between Morse and the users of that code. There is nothing more unnatural than a "code": with whom should nature actually have to "reach an agreement"? As a matter of fact, we interpret as "information" what happens by law of nature. Barbieri's thesis that genes and proteins are molecular artifacts, assembled by external agents, whereas molecules are generally determined by their bonds, i.e. by internal factors, is also a disputable one. It is examined how much an external structure plays a role in ordinary chemical reactions. The "information" of physics is not a semantic information. For such information we can refer to history of literature, telegraphic offices, genetics or biochemistry.
Craike, M; Hill, B; Gaskin, C J; Skouteris, H
2017-03-01
Physical activity (PA) during pregnancy has significant health benefits for the mother and her child; however, many women reduce their activity levels during pregnancy and most are not sufficiently active. Given the important health benefits of PA during pregnancy, evidence that supports research translation is vital. To determine the extent to which physical activity interventions for pregnant women report on internal and external validity factors using the RE-AIM framework (reach, efficacy/effectiveness, adoption, implementation, and maintenance). Ten databases were searched up to 1 June 2015. Eligible published papers and unpublished/grey literature were identified using relevant search terms. Studies had to report on physical activity interventions during pregnancy, including measures of physical activity during pregnancy at baseline and at least one point post-intervention. Randomised controlled trials and quasi-experimental studies that had a comparator group were included. Reporting of RE-AIM dimensions was summarised and synthesised across studies. The reach (72.1%) and efficacy/effectiveness (71.8%) dimensions were commonly reported; however, the implementation (28.9%) and adoption (23.2%) dimensions were less commonly reported and no studies reported on maintenance. This review highlights the under-reporting of contextual factors in studies of physical activity during pregnancy. The translation of physical activity interventions during pregnancy could be improved through reporting of representativeness of participants, clearer reporting of outcomes, more detail on the setting and staff who deliver interventions, costing of interventions and the inclusion of process evaluations and qualitative data. The systematic review highlights the under-reporting of contextual factors in studies of physical activity during pregnancy. © 2016 Royal College of Obstetricians and Gynaecologists.
McDermott, A; Visentin, G; De Marchi, M; Berry, D P; Fenelon, M A; O'Connor, P M; Kenny, O A; McParland, S
2016-04-01
The aim of this study was to evaluate the effectiveness of mid-infrared spectroscopy in predicting milk protein and free amino acid (FAA) composition in bovine milk. Milk samples were collected from 7 Irish research herds and represented cows from a range of breeds, parities, and stages of lactation. Mid-infrared spectral data in the range of 900 to 5,000 cm⁻¹ were available for 730 milk samples; gold standard methods were used to quantify individual protein fractions and FAA of these samples with a view to predicting these gold standard protein fractions and FAA levels with available mid-infrared spectroscopy data. Separate prediction equations were developed for each trait using partial least squares regression; accuracy of prediction was assessed using both cross validation on a calibration data set (n=400 to 591 samples) and external validation on an independent data set (n=143 to 294 samples). The accuracy of prediction in external validation was the same irrespective of whether undertaken on the entire external validation data set or just within the Holstein-Friesian breed. The strongest coefficients of correlation obtained for protein fractions in external validation were 0.74, 0.69, and 0.67 for total casein, total β-lactoglobulin, and β-casein, respectively. Total proteins (i.e., total casein, total whey, and total lactoglobulin) were predicted with greater accuracy than their respective component traits; prediction accuracy using the infrared spectrum was superior to prediction using just milk protein concentration. Weak to moderate prediction accuracies were observed for FAA. The greatest coefficient of correlation in both cross validation and external validation was for Gly (0.75), indicating a moderate accuracy of prediction. Overall, the FAA prediction models overpredicted the gold standard values. 
Near-unity correlations existed between total casein and β-casein irrespective of whether the traits were based on the gold standard (0.92) or mid-infrared spectroscopy predictions (0.95). Weaker correlations among FAA were observed than the correlations among the protein fractions. Pearson correlations between gold standard protein fractions and the milk processing characteristics of rennet coagulation time, curd firming time, curd firmness, heat coagulating time, pH, and casein micelle size were weak to moderate and ranged from -0.48 (protein and pH) to 0.50 (total casein and a30). Pearson correlations between gold standard FAA and these milk processing characteristics were also weak to moderate and ranged from -0.60 (Val and pH) to 0.49 (Val and K20). Results from this study indicate that mid-infrared spectroscopy has the potential to predict protein fractions and some FAA in milk at a population level. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
25 Years of Dysphagia Rehabilitation: What Have We Done, What are We Doing, and Where are We Going?
Easterling, Caryn
2017-02-01
As deglutologists, we strive to use the best evidence available in the treatment of swallowing disorders. Evidence-based medicine is a bottom-up approach that thoughtfully combines the best external evidence with individual clinical expertise and the patients' choice reflective of their clinical state and preferences for their specific care plan. Evidence-based medicine is not restricted to randomized clinical trials and meta-analyses; rather, evidence-based medicine includes our ability to discriminate the best external evidence with which to answer clinical questions and then skillfully and appropriately being able to apply this evidence in the care and treatment of our patients (Sackett et al. in BMJ 312:71-72, 1996). Translation of efficient and effective dysphagia rehabilitative clinical practice implies the need to use treatment that has proven therapeutic value, yields measurable physiologic results and most importantly allows appreciable qualitative outcomes for the patient.
Numerical Validation of the N3S-NATUR Code for Supersonic Nozzles and Afterbody Flows
NASA Astrophysics Data System (ADS)
Perrot, Y.; Hadjadj, A.
2005-02-01
A numerical investigation was conducted to assess the ability of the three-dimensional Navier-Stokes solver, N3S-Natur [1], using the k-ω SST turbulence model, to compute nozzle-afterbody flows with propulsive jets. Three nozzle configurations were selected as test cases for the computational method: the first is the ONERA TIC nozzle, the second is an axisymmetric boat-tailed afterbody configuration and the third is a fully 3D transonic nozzle. In most situations, internal and external flow-field regions are modeled. The obtained results are carefully analyzed and compared to the experimental data. A three-dimensional computation was performed to reveal 3D phenomena, which are not negligible. Particular attention was paid to the appearance of a recirculation zone on the afterbody.
Boada-Grau, Joan; Sánchez-García, José-Carlos; Prizmic-Kuzmica, Aldo-Javier; Vigil-Colet, Andreu
2014-01-01
This study follows the theoretical framework put forward by Hinton on creative potential and practised creativity. The objective was to adapt the 17-item Creative Potential and Practised Creativity scale into Spanish and examine its psychometric properties. The study sample was made up of 975 Spanish employees (48.5% men and 51.5% women). After performing a confirmatory factor analysis, the findings revealed a three-factor structure: Creative potential, Practised creativity and Perception of organizational support. Furthermore, appropriate reliability was found for all three factors as well as initial evidence of construct validity in relation to certain external correlates and a series of scales measuring workaholism, irritation, burnout and personality. The present scale may prove ideal for adequately identifying Creative potential, Practised creativity and Perceived organizational support.
A typology of pain coping strategies in pediatric patients with chronic abdominal pain.
Walker, Lynn S; Baber, Kari Freeman; Garber, Judy; Smith, Craig A
2008-07-15
This study aimed to identify clinically meaningful profiles of pain coping strategies used by youth with chronic abdominal pain (CAP). Participants (n=699) were pediatric patients (ages 8-18 years) and their parents. Patients completed the Pain Response Inventory (PRI) and measures of somatic and depressive symptoms, disability, pain severity and pain efficacy, and perceived competence. Parents rated their children's pain severity and coping efficacy. Hierarchical cluster analysis based on the 13 PRI subscales identified pain coping profiles in Sample 1 (n=311) that replicated in Sample 2 (n=388). Evidence was found of external validity and distinctiveness of the profiles. The findings support a typology of pain coping that reflects the quality of patients' pain mastery efforts and interpersonal relationships associated with pain coping. Results are discussed in relation to developmental processes, attachment styles, and treatment implications.
The Student Risk Screening Scale for Early Childhood: An Initial Validation Study
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Menzies, Holly Mariah; Major, Rebecca; Allegra, Laurie; Powers, Lisa; Schatschneider, Chris
2015-01-01
We report findings of two exploratory validation studies of a revised instrument: the "Student Risk Screening Scale for Early Childhood" version (SRSS-EC). The SRSS-EC was modified to reflect characteristics of externalizing and internalizing behaviors manifested by preschool-age children. In Study 1, we explored the reliability of…
ERIC Educational Resources Information Center
Leong, Frederick T. L.; Austin, James T.; Sekaran, Uma; Komarraju, Meera
1998-01-01
Natives of India (n=172) completed Holland's Vocational Preference Inventory and job satisfaction measures. The inventory did not exhibit high external validity with this population. Congruence, consistency, and differentiation did not predict job or occupational satisfaction, suggesting cross-cultural limits on Holland's theory. (SK)
A Comparison between SRSS-IE and SSiS-PSG Scores: Examining Convergent Validity
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Common, Eric Alan; Zorigian, Kris; Brunsting, Nelson C.; Schatschneider, Christopher
2015-01-01
We report findings of a validation study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE, an adapted version of the Student Risk Screening Scale) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG). Participants included 458 kindergarten through fifth-grade…
Multidimensional Motivation and Engagement for Writing: Construct Validation with a Sample of Boys
ERIC Educational Resources Information Center
Collie, Rebecca J.; Martin, Andrew J.; Curwood, Jen Scott
2016-01-01
Given recent concerns around boys' literacy, this study examined multidimensional writing motivation and engagement among boys. We explored internal and external validity of 11 adaptive (e.g. self-efficacy for writing) and maladaptive (e.g. disengagement from writing) factors of writing motivation and engagement. The sample comprised 781 male…
42 CFR 438.358 - Activities related to external quality review.
Code of Federal Regulations, 2010 CFR
2010-10-01
...) Mandatory activities. For each MCO and PIHP, the EQR must use information from the following activities: (1) Validation of performance improvement projects required by the State to comply with requirements set forth in § 438.240(b)(1) and that were underway during the preceding 12 months. (2) Validation of MCO or PIHP...
ERIC Educational Resources Information Center
Matson, Johnny L.; Malone, Carrie J.
2006-01-01
Currently there are no available sleep disorder measures for individuals with severe and profound intellectual disability. We, therefore, attempted to establish the external validity of the "Diagnostic Assessment for the Severely Handicapped-II" (DASH-II) sleep subscale by comparing daily observational sleep data with the responses of…
Validation of Geriatric Depression Scale--5 Scores among Sedentary Older Adults
ERIC Educational Resources Information Center
Marquez, David X.; McAuley, Edward; Motl, Robert W.; Elavsky, Steriani; Konopack, James F.; Jerome, Gerald J.; Kramer, Arthur F.
2006-01-01
This study examined the validity of Geriatric Depression Scale--5 (GDS-5) scores among older sedentary adults based on its structural properties and relationship with external criteria. Participants from two samples (Ns = 185 and 93; M ages = 66 and 67 years) completed baseline assessments as part of randomized controlled exercise trials.…
Reliability and Validity of the Yale Global Tic Severity Scale
ERIC Educational Resources Information Center
Storch, Eric A.; Murphy, Tanya K.; Geffken, Gary R.; Sajid, Muhammad; Allen, Pam; Roberti, Jonathan W.; Goodman, Wayne K.
2005-01-01
To investigate the reliability and validity of the Yale Global Tic Severity Scale (YGTSS), 28 youth aged 6 to 17 years with Tourette's syndrome (TS) participated in the study. Data included clinician reports of tics and obsessive-compulsive disorder (OCD) severity, parent reports of tics, internalizing and externalizing problems, and child reports…
QSPR for predicting chloroform formation in drinking water disinfection.
Luilo, G B; Cabaniss, S E
2011-01-01
Chlorination is the most widely used technique for water disinfection, but may lead to the formation of chloroform (trichloromethane; TCM) and other by-products. This article reports the first quantitative structure-property relationship (QSPR) for predicting the formation of TCM in chlorinated drinking water. Model compounds (n = 117) drawn from 10 literature sources were divided into training data (n = 90, analysed by five-way leave-many-out internal cross-validation) and external validation data (n = 27). QSPR internal cross-validation had Q² = 0.94 and root mean square error (RMSE) of 0.09 moles TCM per mole compound, consistent with external validation Q² of 0.94 and RMSE of 0.08 moles TCM per mole compound, and met criteria for high predictive power and robustness. In contrast, log TCM QSPR performed poorly and did not meet the criteria for predictive power. The QSPR predictions were consistent with experimental values for TCM formation from tannic acid and for model fulvic acid structures. The descriptors used are consistent with a relatively small number of important TCM precursor structures based upon 1,3-dicarbonyls or 1,3-diphenols.
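The two statistics quoted in this abstract, Q² and RMSE, can be illustrated with a minimal Python sketch. The abstract does not say which Q² formulation the authors used, so the version below (1 minus the prediction error sum of squares over the total sum of squares about a reference mean) is an assumption, as are the function names and the toy numbers.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def q_squared(observed, predicted, reference_mean):
    """Q^2 = 1 - PRESS / SS_tot.

    Assumption: SS_tot is taken about `reference_mean` (for external
    validation this is often the training-set mean). The abstract does
    not specify the formulation used in the study.
    """
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - press / ss_tot


# Hypothetical values, not data from the study.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(rmse(observed, predicted))
print(q_squared(observed, predicted, reference_mean=2.0))
```

Passing the reference mean explicitly keeps the choice between training-set and validation-set mean visible to the caller, since that choice changes the reported Q².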
Gaspardo, B; Del Zotto, S; Torelli, E; Cividino, S R; Firrao, G; Della Riccia, G; Stefanon, B
2012-12-01
Fourier transform near infrared (FT-NIR) spectroscopy is an analytical procedure generally used to detect organic compounds in food. In this work the ability to predict fumonisin B(1)+B(2) contents in corn meal using an FT-NIR spectrophotometer, equipped with an integration sphere, was assessed. A total of 143 corn meal samples were collected in Friuli Venezia Giulia Region (Italy) and used to define a 15 principal components regression model, applying the partial least squares regression algorithm with full cross validation as internal validation. External validation was performed on 25 unknown samples. Coefficients of correlation, root mean square error and standard error of calibration were 0.964, 0.630 and 0.632, respectively, and the external validation confirmed a fair potential of the model in predicting FB(1)+FB(2) concentration. Results suggest that FT-NIR analysis is a suitable method to detect FB(1)+FB(2) in corn meal and to discriminate safe meals from those contaminated. Copyright © 2012 Elsevier Ltd. All rights reserved.
External validation of risk prediction models for incident colorectal cancer using UK Biobank
Usher-Smith, J A; Harshfield, A; Saunders, C L; Sharp, S J; Emery, J; Walter, F M; Muir, K; Griffin, S J
2018-01-01
Background: This study aimed to compare and externally validate risk scores developed to predict incident colorectal cancer (CRC) that include variables routinely available or easily obtainable via self-completed questionnaire. Methods: External validation of fourteen risk models from a previous systematic review in 373 112 men and women within the UK Biobank cohort with 5-year follow-up, no prior history of CRC and data for incidence of CRC through linkage to national cancer registries. Results: There were 1719 (0.46%) cases of incident CRC. The performance of the risk models varied substantially. In men, the QCancer10 model and models by Tao, Driver and Ma all had an area under the receiver operating characteristic curve (AUC) between 0.67 and 0.70. Discrimination was lower in women: the QCancer10, Wells, Tao, Guesmi and Ma models were the best performing with AUCs between 0.63 and 0.66. Assessment of calibration was possible for six models in men and women. All would require country-specific recalibration if estimates of absolute risks were to be given to individuals. Conclusions: Several risk models based on easily obtainable data have relatively good discrimination in a UK population. Modelling studies are now required to estimate the potential health benefits and cost-effectiveness of implementing stratified risk-based CRC screening. PMID:29381683
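Discrimination in this abstract is reported as the area under the receiver operating characteristic curve (AUC). A minimal sketch of that quantity follows, using its pairwise interpretation: the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half. The function name and the toy scores are hypothetical, and this is not the authors' code.

```python
def auc(scores, labels):
    """AUC via pairwise comparison of cases (label 1) and non-cases (label 0)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(cases) * len(non_cases))


# Hypothetical risk scores, not UK Biobank data.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))

# The incidence quoted in the abstract can be checked directly.
print(round(100 * 1719 / 373112, 2))
```

The double loop is quadratic in cohort size, so for a cohort as large as the one described a rank-based implementation would be the practical choice; the pairwise form is shown only because it makes the definition explicit.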
Kong, Anthony Pak-Hin
2011-02-01
The first aim of this study was to further establish the external validity of the main concept (MC) analysis by examining its relationship with the Cantonese Linguistic Communication Measure (CLCM; Kong, 2006; Kong & Law, 2004), an established quantitative system for narrative production, and the Cantonese version of the Western Aphasia Battery (CAB; Yiu, 1992). The second aim was to evaluate how well the MC analysis reflects the stability of discourse production among chronic Cantonese speakers with aphasia. Sixteen participants with aphasia were evaluated on the MC analysis, CAB, and CLCM in the summer of 2008 and were subsequently reassessed in the summer of 2009. They encompassed a range of aphasia severity (with an Aphasia Quotient ranging between 30.2/100 and 94.8/100 at the time of the first evaluation). Significant associations were found between the MC measures and the corresponding CLCM indices and CAB performance scores that were relevant to the presence, accuracy, and completeness of content in oral narratives. Moreover, the MC analysis was found to yield comparable scores for chronic speakers on 2 occasions 1 year apart. The present study has further established the external validity of MC analysis in Cantonese. Future investigations involving more speakers with aphasia will allow adequate description of its psychometric properties.
Hathi, Payal; Haque, Sabrina; Pant, Lovey; Coffey, Diane; Spears, Dean
2017-02-01
A long literature in demography has debated the importance of place for health, especially children's health. In this study, we assess whether the importance of dense settlement for infant mortality and child height is moderated by exposure to local sanitation behavior. Is open defecation (i.e., without a toilet or latrine) worse for infant mortality and child height where population density is greater? Is poor sanitation an important mechanism by which population density influences child health outcomes? We present two complementary analyses using newly assembled data sets, which represent two points in a trade-off between external and internal validity. First, we concentrate on external validity by studying infant mortality and child height in a large, international child-level data set of 172 Demographic and Health Surveys, matched to census population density data for 1,800 subnational regions. Second, we concentrate on internal validity by studying child height in Bangladeshi districts, using a new data set constructed with GIS techniques that allows us to control for fixed effects at a high level of geographic resolution. We find a statistically robust and quantitatively comparable interaction between sanitation and population density with both approaches: open defecation externalities are more important for child health outcomes where people live more closely together.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ivanov, N. V.; Kakurin, A. M.
2014-10-15
Simulation of the magnetic island evolution under Resonant Magnetic Perturbation (RMP) in rotating T-10 tokamak plasma is presented with the intent of experimentally validating the TEAR code. In the T-10 experiment chosen for simulation, the RMP consists of a stationary error field, a magnetic field of the eddy current in the resistive vacuum vessel and a magnetic field of the externally applied controlled halo current in the plasma scrape-off layer (SOL). The halo-current loop consists of a rail limiter, plasma SOL, vacuum vessel, and external part of the circuit. Effects of plasma resistivity, viscosity, and RMP are taken into account in the TEAR code based on the two-fluid MHD approximation. Radial distribution of the magnetic flux perturbation is calculated with account of the externally applied RMP. A good agreement is obtained between the simulation results and experimental data for the cases of preprogrammed and feedback-controlled halo current in the plasma SOL.
Hofmeester, Ilse; Kollen, Boudewijn J; Steffens, Martijn G; Bosch, J L H Ruud; Drake, Marcus J; Weiss, Jeffrey P; Blanker, Marco H
2015-04-01
To systematically review and evaluate the impact of the International Continence Society (ICS)-2002 report on standardisation of terminology in nocturia, on publications reporting on nocturia and nocturnal polyuria (NP). In 2002, the ICS defined NP as a Nocturnal Polyuria Index (nocturnal urine volume/total 24-h urine volume) of >0.2-0.33, depending on age. In April 2013 the PubMed and Embase databases were searched for studies (in English, German, French or Dutch) based on original data and adult participants, investigating the relationship between nocturia and NP. A methodological quality assessment was performed, including scores on external validity, internal validity and informativeness. Quality scores of items were compared between studies published before and after the ICS-2002 report. The search yielded 78 publications based on 66 studies. Quality scores of studies were generally high for internal validity (median 5, interquartile range [IQR] 4-6) but low for external validity. After publication of the ICS-2002 report, external validity showed a significant change from 1 (IQR 1-2) to 2 (IQR 1-2.5; P = 0.019). Nocturia remained undefined in 12 studies. In all, 19 different definitions were used for NP, most often being the ICS (or similar) definition: this covered 52% (n = 11) of studies before and 66% (n = 27) after the ICS-2002 report. Clear definitions of both nocturia and NP were identified in 67% and 76% before, and in 88% and 88% of the studies after the ICS-2002 report, respectively. The ICS-2002 report on standardisation of terminology in nocturia appears to have had a beneficial impact on reporting definitions of nocturia and NP, enabling better interpretation of results and comparisons between research projects. Because the external validity of most of the 66 studies is considered a problem, the results of these studies may not be validly extrapolated to other populations. The ICS definition of NP is used most often. 
However, its discriminative value seems limited due to the estimated difference of 0.6 nocturnal voids between individuals with and without NP. Refinement of current definitions based on robust research is required. Based on pathophysiological reasoning, we argue that it may be more appropriate to define NP based on nocturnal urine production or nocturnal voided volumes, rather than on a diurnal urine production pattern. © 2014 The Authors. BJU International © 2014 BJU International.
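The Nocturnal Polyuria Index described in this abstract is a simple ratio: nocturnal urine volume divided by total 24-h urine volume. The sketch below computes it and compares it with a cut-off. The abstract states only that the cut-off lies between 0.2 and 0.33 depending on age and does not give the age bands, so the threshold is left as a caller-supplied parameter; the function names and volumes are hypothetical.

```python
def nocturnal_polyuria_index(nocturnal_volume_ml, total_24h_volume_ml):
    """NPi = nocturnal urine volume / total 24-h urine volume."""
    if total_24h_volume_ml <= 0:
        raise ValueError("total 24-h volume must be positive")
    return nocturnal_volume_ml / total_24h_volume_ml


def exceeds_np_threshold(nocturnal_volume_ml, total_24h_volume_ml, threshold):
    """True if the index is above the chosen cut-off.

    Assumption: `threshold` is chosen by the caller within 0.2-0.33,
    because the abstract does not state which age maps to which value.
    """
    if not 0.2 <= threshold <= 0.33:
        raise ValueError("threshold outside the 0.2-0.33 range quoted in the abstract")
    return nocturnal_polyuria_index(nocturnal_volume_ml, total_24h_volume_ml) > threshold


# Hypothetical volumes in millilitres.
print(nocturnal_polyuria_index(600, 1800))
print(exceeds_np_threshold(600, 1800, threshold=0.33))
print(exceeds_np_threshold(300, 1800, threshold=0.2))
```

Because the index is a proportion of daily output, a person with low total output can exceed the cut-off with a small nocturnal volume, which is the concern behind the authors' suggestion to define NP on nocturnal urine production instead.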
Prabhu, Roshan S; Press, Robert H; Boselli, Danielle M; Miller, Katherine R; Lankford, Scott P; McCammon, Robert J; Moeller, Benjamin J; Heinzerling, John H; Fasola, Carolina E; Patel, Kirtesh R; Asher, Anthony L; Sumrall, Ashley L; Curran, Walter J; Shu, Hui-Kuo G; Burri, Stuart H
2018-03-01
Patients treated with stereotactic radiosurgery (SRS) for brain metastases (BM) are at increased risk of distant brain failure (DBF). Two nomograms have been recently published to predict individualized risk of DBF after SRS. The goal of this study was to assess the external validity of these nomograms in an independent patient cohort. The records of consecutive patients with BM treated with SRS at Levine Cancer Institute and Emory University between 2005 and 2013 were reviewed. Three validation cohorts were generated based on the specific nomogram or recursive partitioning analysis (RPA) entry criteria: Wake Forest nomogram (n = 281), Canadian nomogram (n = 282), and Canadian RPA (n = 303) validation cohorts. Freedom from DBF at 1-year in the Wake Forest study was 30% compared with 50% in the validation cohort. The validation c-index for both the 6-month and 9-month freedom from DBF Wake Forest nomograms was 0.55, indicating poor discrimination ability, and the goodness-of-fit test for both nomograms was highly significant (p < 0.001), indicating poor calibration. The 1-year actuarial DBF in the Canadian nomogram study was 43.9% compared with 50.9% in the validation cohort. The validation c-index for the Canadian 1-year DBF nomogram was 0.56, and the goodness-of-fit test was also highly significant (p < 0.001). The validation accuracy and c-index of the Canadian RPA classification was 53% and 0.61, respectively. The Wake Forest and Canadian nomograms for predicting risk of DBF after SRS were found to have limited predictive ability in an independent bi-institutional validation cohort. These results reinforce the importance of validating predictive models in independent patient cohorts.
Zastrow, Stefan; Brookman-May, Sabine; Cong, Thi Anh Phuong; Jurk, Stanislaw; von Bar, Immanuel; Novotny, Vladimir; Wirth, Manfred
2015-03-01
To predict outcome of patients with renal cell carcinoma (RCC) who undergo surgical therapy, risk models and nomograms are valuable tools. External validation on independent datasets is crucial for evaluating accuracy and generalizability of these models. The objective of the present study was to externally validate the postoperative nomogram developed by Karakiewicz et al. for prediction of cancer-specific survival. A total of 1,480 consecutive patients with a median follow-up of 82 months (IQR 46-128) were included into this analysis with 268 RCC-specific deaths. Nomogram-estimated survival probabilities were compared with survival probabilities of the actual cohort, and concordance indices were calculated. Calibration plots and decision curve analyses were used for evaluating calibration and clinical net benefit of the nomogram. Concordance between predictions of the nomogram and survival rates of the cohort was 0.911 after 12, 0.909 after 24 months and 0.896 after 60 months. Comparison of predicted probabilities and actual survival estimates with calibration plots showed an overestimation of tumor-specific survival based on nomogram predictions of high-risk patients, although calibration plots showed a reasonable calibration for probability ranges of interest. Decision curve analysis showed a positive net benefit of nomogram predictions for our patient cohort. The postoperative Karakiewicz nomogram provides a good concordance in this external cohort and is reasonably calibrated. It may overestimate tumor-specific survival in high-risk patients, which should be kept in mind when counseling patients. A positive net benefit of nomogram predictions was proven.
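The concordance indices reported in this abstract measure how well predicted risk orders patients by survival time. A minimal sketch of one common estimator, Harrell's C for right-censored data, is given below; the abstract does not state which estimator the authors used, so this choice is an assumption, and the function name and toy data are hypothetical.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i has the shorter follow-up
    time and experienced the event. The pair is concordant when patient i
    also has the higher predicted risk; tied risks count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (months), event flags and predicted risks.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
print(harrell_c(times, events, risk))
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering. Concordance says nothing about calibration, which is why the study also reports calibration plots and decision curve analysis.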