additional validity evidence: Topics by Science.gov

Sample records for additional validity evidence

Validity evidence based on test content.

PubMed

Sireci, Stephen; Faulkner-Bond, Molly

2014-01-01

Validity evidence based on test content is one of the five forms of validity evidence stipulated in the Standards for Educational and Psychological Testing developed by the American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. In this paper, we describe the logic and theory underlying such evidence and describe traditional and modern methods for gathering and analyzing content validity data. A comprehensive review of the literature and of the aforementioned Standards is presented. For educational tests and other assessments targeting knowledge and skill possessed by examinees, validity evidence based on test content is necessary for building a validity argument to support the use of a test for a particular purpose. By following the methods described in this article, practitioners have a wide arsenal of tools available for determining how well the content of an assessment is congruent with and appropriate for the specific testing purposes.
Eating Disorder Diagnostic Scale: Additional Evidence of Reliability and Validity

ERIC Educational Resources Information Center

Stice, Eric; Fisher, Melissa; Martinez, Erin

2004-01-01

The authors conducted 4 studies investigating the reliability and validity of the Eating Disorder Diagnostic Scale (EDDS; E. Stice, C. F. Telch, & S. L. Rizvi, 2000), a brief self-report measure for diagnosing anorexia nervosa, bulimia nervosa, and binge eating disorder. Study 1 found that the EDDS showed criterion validity with interview-based…
Truth and Evidence in Validity Theory

ERIC Educational Resources Information Center

Borsboom, Denny; Markus, Keith A.

2013-01-01

According to Kane (this issue), "the validity of a proposed interpretation or use depends on how well the evidence supports" the claims being made. Because truth and evidence are distinct, this means that the validity of a test score interpretation could be high even though the interpretation is false. As an illustration, we discuss the case of…
20 CFR 404.727 - Evidence of a deemed valid marriage.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Evidence of a deemed valid marriage. 404.727... DISABILITY INSURANCE (1950- ) Evidence Evidence of Age, Marriage, and Death § 404.727 Evidence of a deemed valid marriage. (a) General. A deemed valid marriage is a ceremonial marriage we consider valid even...
20 CFR 404.725 - Evidence of a valid ceremonial marriage.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Evidence of a valid ceremonial marriage. 404... DISABILITY INSURANCE (1950- ) Evidence Evidence of Age, Marriage, and Death § 404.725 Evidence of a valid ceremonial marriage. (a) General. A valid ceremonial marriage is one that follows procedures set by law in...
Evidence of Construct Validity in Published Achievement Tests.

ERIC Educational Resources Information Center

Nolet, Victor; Tindal, Gerald

Valid interpretation of test scores is the shared responsibility of the test designer and the test user. Test publishers must provide evidence of the validity of the decisions their tests are intended to support, while test users are responsible for analyzing this evidence and subsequently using the test in the manner indicated by the publisher.…
The Students' Perceptions of School Success Promoting Strategies Inventory (SPSI): development and validity evidence based studies.

PubMed

Moreira, Paulo A S; Oliveira, João Tiago; Dias, Paulo; Vaz, Filipa Machado; Torres-Oliveira, Isabel

2014-08-04

Students' perceptions of school success promotion strategies are of great importance for schools, as they indicate how students view these strategies. The objective of this study was to develop the Students' Perceptions of School Success Promoting Strategies Inventory (SPSI), which assesses both individual students' perceptions of their school success promoting strategies and dimensions of school quality, and to analyze validity evidence for it. A structure of 7 related factors was found, which showed good adjustment indices in two additional different samples, suggesting that this is a well-fitting multi-group model (p < .001). All scales presented good reliability values. Schools with good academic results registered higher values in Career development, Active learning, Proximity, Educational Technologies and Extra-curricular activities (p < .05). The SPSI proved adequate for measuring within-school (students within schools) dimensions of school success. In addition, there is preliminary evidence for its adequacy for measuring school success promotion dimensions between schools for 4 dimensions. This study supports the validity evidence for the SPSI (validity evidence based on test content, on internal structure, on relations to other variables and on consequences of testing). Future studies should test for within- and between-level variance in a larger sample of schools.
Gathering Validity Evidence for Surgical Simulation: A Systematic Review.

PubMed

Borgersen, Nanna Jo; Naur, Therese M H; Sørensen, Stine M D; Bjerrum, Flemming; Konge, Lars; Subhi, Yousif; Thomsen, Ann Sofia S

2018-06-01

To identify current trends in the use of validity frameworks in surgical simulation, to provide an overview of the evidence behind the assessment of technical skills in all surgical specialties, and to present recommendations and guidelines for future validity studies. Validity evidence for assessment tools used in the evaluation of surgical performance is of paramount importance to ensure valid and reliable assessment of skills. We systematically reviewed the literature by searching 5 databases (PubMed, EMBASE, Web of Science, PsycINFO, and the Cochrane Library) for studies published from January 1, 2008, to July 10, 2017. We included original studies evaluating simulation-based assessments of health professionals in surgical specialties and extracted data on surgical specialty, simulator modality, participant characteristics, and the validity framework used. Data were synthesized qualitatively. We identified 498 studies with a total of 18,312 participants. Publications involving validity assessments in surgical simulation more than doubled from 2008 to 2010 (∼30 studies/year) to 2014 to 2016 (∼70 to 90 studies/year). Only 6.6% of the studies used the recommended contemporary validity framework (Messick). The majority of studies used outdated frameworks such as face validity. Significant differences were identified across surgical specialties. The evaluated assessment tools were mostly inanimate or virtual reality simulation models. An increasing number of studies have gathered validity evidence for simulation-based assessments in surgical specialties, but the use of outdated frameworks remains common. To address the current practice, this paper presents guidelines on how to use the contemporary validity framework when designing validity studies.
Validation analysis of probabilistic models of dietary exposure to food additives.

PubMed

Gilsenan, M B; Thompson, R L; Lambe, J; Gibney, M J

2003-10-01

The validity of a range of simple conceptual models designed specifically for the estimation of food additive intakes using probabilistic analysis was assessed. Modelled intake estimates that fell below traditional conservative point estimates of intake and above 'true' additive intakes (calculated from a reference database at brand level) were considered to be in a valid region. Models were developed for 10 food additives by combining food intake data, the probability of an additive being present in a food group and additive concentration data. Food intake and additive concentration data were entered as raw data or as a lognormal distribution, and the probability of an additive being present was entered based on the per cent brands or the per cent eating occasions within a food group that contained an additive. Since the three model components assumed two possible modes of input, the validity of eight (2^3) model combinations was assessed. All model inputs were derived from the reference database. An iterative approach was employed in which the validity of individual model components was assessed first, followed by validation of full conceptual models. While the distribution of intake estimates from models fell below conservative intakes, which assume that the additive is present at maximum permitted levels (MPLs) in all foods in which it is permitted, intake estimates were not consistently above 'true' intakes. These analyses indicate the need for more complex models for the estimation of food additive intakes using probabilistic analysis. Such models should incorporate information on market share and/or brand loyalty.
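The design in this abstract can be illustrated with a minimal sketch (not the authors' code). It assumes three model components with two input modes each, and treats an estimate as valid when it lies strictly between the 'true' intake and the conservative point estimate. The mode labels, function names, and numbers are hypothetical, and the strict inequality is an assumption the abstract does not settle.

```python
from itertools import product

# Hypothetical labels for the two input modes of each model component.
FOOD_INTAKE_MODES = ("raw", "lognormal")
CONCENTRATION_MODES = ("raw", "lognormal")
PRESENCE_MODES = ("percent_brands", "percent_eating_occasions")


def model_combinations():
    """Enumerate every combination of input modes (2 x 2 x 2 = 8)."""
    return list(product(FOOD_INTAKE_MODES, CONCENTRATION_MODES, PRESENCE_MODES))


def in_valid_region(estimate, true_intake, conservative_estimate):
    """Valid if above the 'true' intake and below the conservative point estimate.

    Strict inequalities are an assumption made for this sketch.
    """
    return true_intake < estimate < conservative_estimate


if __name__ == "__main__":
    for combo in model_combinations():
        print(combo)
    # Illustrative numbers only; they are not taken from the study.
    print(in_valid_region(1.2, true_intake=1.0, conservative_estimate=3.0))
```

An estimate below the 'true' intake fails the check, which corresponds to the shortfall the abstract reports for some models.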
Montreal-Toulouse Language Assessment Battery: evidence of criterion validity from patients with aphasia.

PubMed

Pagliarin, Karina Carlesso; Ortiz, Karin Zazo; Barreto, Simone dos Santos; Pimenta Parente, Maria Alice de Mattos; Nespoulous, Jean-Luc; Joanette, Yves; Fonseca, Rochele Paz

2015-10-15

The Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) provides a general description of language processing and related components in adults with brain injury. The present study aimed at verifying the criterion-related validity of the Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) by assessing its ability to discriminate between individuals with unilateral brain damage with and without aphasia. The investigation was carried out in a Brazilian community-based sample of 104 adults, divided into four groups: 26 participants with left hemisphere damage (LHD) with aphasia, 25 participants with right hemisphere damage (RHD), 28 with LHD non-aphasic, and 25 healthy adults. There were significant differences between patients with aphasia and the other groups on most total and subtotal scores on MTL-BR tasks. The results showed strong criterion-related validity evidence for the MTL-BR Battery, and provided important information regarding hemispheric specialization and interhemispheric cooperation. Future research is required to search for additional evidence of sensitivity, specificity and validity of the MTL-BR in samples with different types of aphasia and degrees of language impairment. Copyright © 2015 Elsevier B.V. All rights reserved.
Literature evidence in open targets - a target validation platform.

PubMed

Kafkas, Şenay; Dunham, Ian; McEntyre, Johanna

2017-06-06

We present the Europe PMC literature component of Open Targets, a target validation platform that integrates various evidence to aid drug target identification and validation. The component identifies target-disease associations in documents from the Europe PMC literature database and ranks the documents by confidence, using rules based on expert-provided heuristic information. The confidence score of a given document represents how valuable the document is in the scope of target validation for a given target-disease association, taking into account the credibility of the association based on the properties of the text. The component has served the platform regularly with up-to-date data since December 2015. Currently, there are 1,168,365 distinct target-disease associations text-mined from >26 million PubMed abstracts and >1.2 million Open Access full-text articles. Our comparative analyses of the evidence data currently available in the platform revealed that 850,179 of these associations are identified exclusively by literature mining. This component helps the platform's users by providing the most relevant literature hits for a given target and disease. The text-mining evidence, along with the other types of evidence, can be explored visually through https://www.targetvalidation.org, and all the evidence data are available for download in JSON format from https://www.targetvalidation.org/downloads/data .
Health Sciences-Evidence Based Practice questionnaire (HS-EBP) for measuring transprofessional evidence-based practice: Creation, development and psychometric validation.

PubMed

Fernández-Domínguez, Juan Carlos; de Pedro-Gómez, Joan Ernest; Morales-Asencio, José Miguel; Bennasar-Veny, Miquel; Sastre-Fullana, Pedro; Sesé-Abad, Albert

2017-01-01

Most of the EBP measuring instruments available to date present limitations both in the operationalisation of the construct and also in the rigour of their psychometric development, as revealed in the literature review performed. The aim of this paper is to provide rigorous and adequate reliability and validity evidence of the scores of a new transdisciplinary psychometric tool, the Health Sciences Evidence-Based Practice (HS-EBP), for measuring the construct EBP in Health Sciences professionals. A pilot study and a subsequent two-stage validation test sample were conducted to progressively refine the instrument until a reduced 60-item version with a five-factor latent structure was obtained. Reliability was analysed through both Cronbach's alpha coefficient and intraclass correlations (ICC). Latent structure was contrasted using confirmatory factor analysis (CFA) following a model comparison approach. Evidence of criterion validity of the scores obtained was achieved by considering attitudinal resistance to change, burnout, and quality of professional life as criterion variables; while convergent validity was assessed using the Spanish version of the Evidence-Based Practice Questionnaire (EBPQ-19). Adequate evidence of both reliability and ICC was obtained for the five dimensions of the questionnaire. According to the CFA model comparison, the best fit corresponded to the five-factor model (RMSEA = 0.049; CI 90% RMSEA = [0.047; 0.050]; CFI = 0.99). Adequate criterion and convergent validity evidence was also provided. Finally, the HS-EBP showed the capability to find differences between EBP training levels as an important evidence of decision validity. Reliability and validity evidence obtained regarding the HS-EBP confirm the adequate operationalisation of the EBP construct as a process put into practice to respond to every clinical situation arising in the daily practice of professionals in health sciences (transprofessional). The tool could be useful for EBP individual
Macro- and Micro-Validation: Beyond the "Five Sources" Framework for Classifying Validation Evidence and Analysis

ERIC Educational Resources Information Center

Newton, Paul E.

2016-01-01

This paper argues that the dominant framework for conceptualizing validation evidence and analysis--the "five sources" framework from the 1999 "Standards"--is seriously limited. Its limitation raises a significant barrier to understanding the nature of comprehensive validation, and this presents a significant threat to…
External Standards or Standard Addition? Selecting and Validating a Method of Standardization

NASA Astrophysics Data System (ADS)

Harvey, David T.

2002-05-01

A common feature of many problem-based laboratories in analytical chemistry is a lengthy independent project involving the analysis of "real-world" samples. Students research the literature, adapting and developing a method suitable for their analyte, sample matrix, and problem scenario. Because these projects encompass the complete analytical process, students must consider issues such as obtaining a representative sample, selecting a method of analysis, developing a suitable standardization, validating results, and implementing appropriate quality assessment/quality control practices. Most textbooks and monographs suitable for an undergraduate course in analytical chemistry, however, provide only limited coverage of these important topics. The need for short laboratory experiments emphasizing important facets of method development, such as selecting a method of standardization, is evident. The experiment reported here, which is suitable for an introductory course in analytical chemistry, illustrates the importance of matrix effects when selecting a method of standardization. Students also learn how a spike recovery is used to validate an analytical method, and obtain a practical experience in the difference between performing an external standardization and a standard addition.
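The contrast between the two standardization methods, and the role of a spike recovery, can be sketched numerically. This is a minimal illustration, not the published experiment. It assumes a linear detector response, negligible dilution from the added standard, and a hypothetical sample matrix that lowers sensitivity from 2.0 to 1.5 signal units per concentration unit. All function names and numbers are made up for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def external_standard(signal, std_conc, std_signal):
    """Read a sample concentration off a calibration line built from standards."""
    slope, intercept = fit_line(std_conc, std_signal)
    return (signal - intercept) / slope


def standard_addition(added_conc, signal):
    """Analyte concentration as intercept / slope of signal vs. added concentration."""
    slope, intercept = fit_line(added_conc, signal)
    return intercept / slope


def spike_recovery(spiked_result, unspiked_result, spike_added):
    """Percentage of the added spike that the method reports back."""
    return 100.0 * (spiked_result - unspiked_result) / spike_added


# Hypothetical data: the true sample concentration is 4.0.
std_conc = [0.0, 2.0, 4.0, 6.0]
std_signal = [0.0, 4.0, 8.0, 12.0]      # standards in clean solvent (sensitivity 2.0)
added = [0.0, 2.0, 4.0, 6.0]
matrix_signal = [6.0, 9.0, 12.0, 15.0]  # sample plus additions (sensitivity 1.5)

unspiked = external_standard(matrix_signal[0], std_conc, std_signal)
spiked = external_standard(matrix_signal[1], std_conc, std_signal)
print("external standardization:", unspiked)
print("spike recovery (%):", spike_recovery(spiked, unspiked, added[1]))
print("standard addition:", standard_addition(added, matrix_signal))
```

With these made-up numbers the external calibration underestimates the analyte and the spike recovery falls well short of 100%, which is the signal that a matrix effect is present. The standard addition calibrates in the sample matrix itself and so is not biased by it.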
When Assessment Data Are Words: Validity Evidence for Qualitative Educational Assessments.

PubMed

Cook, David A; Kuper, Ayelet; Hatala, Rose; Ginsburg, Shiphra

2016-10-01

Quantitative scores fail to capture all important features of learner performance. This awareness has led to increased use of qualitative data when assessing health professionals. Yet the use of qualitative assessments is hampered by incomplete understanding of their role in forming judgments, and lack of consensus in how to appraise the rigor of judgments therein derived. The authors articulate the role of qualitative assessment as part of a comprehensive program of assessment, and translate the concept of validity to apply to judgments arising from qualitative assessments. They first identify standards for rigor in qualitative research, and then use two contemporary assessment validity frameworks to reorganize these standards for application to qualitative assessment. Standards for rigor in qualitative research include responsiveness, reflexivity, purposive sampling, thick description, triangulation, transparency, and transferability. These standards can be reframed using Messick's five sources of validity evidence (content, response process, internal structure, relationships with other variables, and consequences) and Kane's four inferences in validation (scoring, generalization, extrapolation, and implications). Evidence can be collected and evaluated for each evidence source or inference. The authors illustrate this approach using published research on learning portfolios. The authors advocate a "methods-neutral" approach to assessment, in which a clearly stated purpose determines the nature of and approach to data collection and analysis. Increased use of qualitative assessments will necessitate more rigorous judgments of the defensibility (validity) of inferences and decisions. Evidence should be strategically sought to inform a coherent validity argument.
Health Sciences-Evidence Based Practice questionnaire (HS-EBP) for measuring transprofessional evidence-based practice: Creation, development and psychometric validation

PubMed Central

Fernández-Domínguez, Juan Carlos; de Pedro-Gómez, Joan Ernest; Morales-Asencio, José Miguel; Sastre-Fullana, Pedro; Sesé-Abad, Albert

2017-01-01

Introduction: Most of the EBP measuring instruments available to date present limitations both in the operationalisation of the construct and also in the rigour of their psychometric development, as revealed in the literature review performed. The aim of this paper is to provide rigorous and adequate reliability and validity evidence of the scores of a new transdisciplinary psychometric tool, the Health Sciences Evidence-Based Practice (HS-EBP), for measuring the construct EBP in Health Sciences professionals. Methods: A pilot study and a subsequent two-stage validation test sample were conducted to progressively refine the instrument until a reduced 60-item version with a five-factor latent structure was obtained. Reliability was analysed through both Cronbach’s alpha coefficient and intraclass correlations (ICC). Latent structure was contrasted using confirmatory factor analysis (CFA) following a model comparison approach. Evidence of criterion validity of the scores obtained was achieved by considering attitudinal resistance to change, burnout, and quality of professional life as criterion variables; while convergent validity was assessed using the Spanish version of the Evidence-Based Practice Questionnaire (EBPQ-19). Results: Adequate evidence of both reliability and ICC was obtained for the five dimensions of the questionnaire. According to the CFA model comparison, the best fit corresponded to the five-factor model (RMSEA = 0.049; CI 90% RMSEA = [0.047; 0.050]; CFI = 0.99). Adequate criterion and convergent validity evidence was also provided. Finally, the HS-EBP showed the capability to find differences between EBP training levels as an important evidence of decision validity. Conclusions: Reliability and validity evidence obtained regarding the HS-EBP confirm the adequate operationalisation of the EBP construct as a process put into practice to respond to every clinical situation arising in the daily practice of professionals in health sciences (transprofessional). The
The key-features approach to assess clinical decisions: validity evidence to date.

PubMed

Bordage, G; Page, G

2018-05-17

The key-features (KFs) approach to assessment was initially proposed during the First Cambridge Conference on Medical Education in 1984 as a more efficient and effective means of assessing clinical decision-making skills. Over three decades later, we conducted a comprehensive, systematic review of the validity evidence gathered since then. The evidence was compiled according to the Standards for Educational and Psychological Testing's five sources of validity evidence, namely, Content, Response process, Internal structure, Relations to other variables, and Consequences, to which we added two other types related to Cost-feasibility and Acceptability. Of the 457 publications that referred to the KFs approach between 1984 and October 2017, 164 are cited here; the remaining 293 were either redundant or the authors simply mentioned the KFs concept in relation to their work. While one set of articles reported meeting the validity standards, another set examined KFs test development choices and score interpretation. The accumulated validity evidence for the KFs approach since its inception supports the decision-making construct measured and its use to assess clinical decision-making skills at all levels of training and practice and with various types of exam formats. Recognizing that gathering validity evidence is an ongoing process, areas with limited evidence, such as item factor analyses or consequences of testing, are identified as well as new topics needing further clarification, such as the use of the KFs approach for formative assessment and its place within a program of assessment.
Validity evidence as a key marker of quality of technical skill assessment in OTL-HNS.

PubMed

Labbé, Mathilde; Young, Meredith; Nguyen, Lily H P

2018-01-13

Quality monitoring of assessment practices should be a priority in all residency programs. Validity evidence is one of the main hallmarks of assessment quality and should be collected to support the interpretation and use of assessment data. Our objective was to identify, synthesize, and present the validity evidence reported supporting different technical skill assessment tools in otolaryngology-head and neck surgery (OTL-HNS). We performed a secondary analysis of data generated through a systematic review of all published tools for assessing technical skills in OTL-HNS (n = 16). For each tool, we coded validity evidence according to the five types of evidence described by the American Educational Research Association's interpretation of Messick's validity framework. Descriptive statistical analyses were conducted. All 16 tools included in our analysis were supported by internal structure and relationship to variables validity evidence. Eleven articles presented evidence supporting content. Response process was discussed only in one article, and no study reported on evidence exploring consequences. We present the validity evidence reported for 16 rater-based tools that could be used for work-based assessment of OTL-HNS residents in the operating room. The articles included in our review were consistently deficient in evidence for response process and consequences. Rater-based assessment tools that support high-stakes decisions that impact the learner and programs should include several sources of validity evidence. Thus, use of any assessment should be done with careful consideration of the context-specific validity evidence supporting score interpretation, and we encourage deliberate continual assessment quality-monitoring. Laryngoscope, 2018. © 2018 The American Laryngological, Rhinological and Otological Society, Inc.
20 CFR 219.31 - Evidence of a valid ceremonial marriage.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 20 Employees' Benefits 1 2010-04-01 2010-04-01 false Evidence of a valid ceremonial marriage. 219... marriage. (a) Preferred evidence. Preferred evidence of a ceremonial marriage is— (1) A copy of the public record of the marriage, certified by the custodian of the record or by a Board employee; (2) A copy of a...
The Prosocial and Antisocial Behaviour in Sport Scale: further evidence for construct validity and reliability.

PubMed

Kavussanu, Maria; Stanger, Nicholas; Boardley, Ian D

2013-01-01

The purpose of this research was to provide further evidence for the construct validity (i.e., convergent, concurrent, and discriminant validity) of the Prosocial and Antisocial Behaviour in Sport Scale (PABSS), an instrument that has four subscales measuring prosocial and antisocial behaviour toward teammates and opponents. We also investigated test-retest reliability and stability of the PABSS. We conducted three studies using athletes from a variety of team sports. In Study 1, participants (N = 129) completed the PABSS and measures of physical and verbal aggression, hostility, anger, moral identity, and empathy; a sub-sample (n = 111) also completed the PABSS one week later. In Study 2, in addition to the PABSS, participants (N = 89) completed measures of competitive aggressiveness and anger, moral attitudes, moral disengagement, goal orientation, and anxiety. In Study 3, participants (N = 307) completed the PABSS and a measure of social goals. Across the three studies, the four subscales evidenced the hypothesised relationships with a number of variables. Correlations were large between the two antisocial behaviours and small between the two prosocial behaviours. Overall, the findings supported the convergent, concurrent, and discriminant validity of the scale, provided evidence for its test-retest reliability and stability, and suggest that the instrument is a valid and reliable measure of prosocial and antisocial behaviour in sport.

Validity evidence for the measurement of the strength of motivation for medical school.

PubMed

Kusurkar, Rashmi; Croiset, Gerda; Kruitwagen, Cas; ten Cate, Olle

2011-05-01

The Strength of Motivation for Medical School (SMMS) questionnaire is designed to determine the strength of motivation of students particularly for medical study. This research was performed to establish the validity evidence for measuring strength of motivation for medical school. Internal structure and relations to other variables were used as the sources of validity evidence. The SMMS questionnaire was filled out by 1,494 medical students in different years of the medical curriculum. The validity evidence for the internal structure was analyzed by principal components analysis with promax rotation. Validity evidence for relations to other variables was tested by comparing the SMMS scores with scores on the Academic Motivation Scale (AMS) and the exhaustion scale of the Maslach Burnout Inventory-Student Survey (MBI-SS) for measuring study stress. Evidence for internal consistency was determined through Cronbach's alpha for reliability. The analysis showed that the SMMS had a 3-factor structure. The validity in relations to other variables was established, as both the subscale and full-scale scores correlated significantly and positively with the intrinsic motivation scores and with the more autonomous forms of extrinsic motivation, the correlation decreasing and finally becoming negative towards the extrinsic motivation end of the spectrum. They also had significant negative correlations with the amotivation scale of the AMS and the exhaustion scale of the MBI-SS. Cronbach's alpha for reliability of the three subscales and full SMMS scores was 0.70, 0.67, 0.55 and 0.79, respectively. The strength of motivation for medical school has a three-factor structure, and acceptable validity evidence was found in our study.
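For readers unfamiliar with the internal-consistency statistic reported in this abstract, Cronbach's alpha for k items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score). The sketch below applies that standard formula to made-up scores. It is not the authors' analysis, and the function name and data layout are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from per-item score lists.

    item_scores holds one list per item; each list contains that item's
    score for every respondent, in the same respondent order.
    """
    k = len(item_scores)
    # Total score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    # Sample variances (n - 1 denominator) for each item and for the totals.
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up responses from four respondents to a three-item scale.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
print(round(cronbach_alpha(items), 3))
```

Alpha rises as items covary more strongly relative to their individual variances, and it also tends to rise with the number of items, so a short subscale can show a lower alpha than the full scale.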
Additional Evidence of Convergent Validity between SRSS-IE and SSiS-PSG Scores

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Ennis, Robin Parks; Royer, David James

2015-01-01

We report findings of a validity study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG; Elliott & Gresham, 2007). Participants were 1,680 kindergarten through sixth-grade elementary students from three…
Measuring Practitioner Attitudes toward Evidence-Based Treatments: A Validation Study

ERIC Educational Resources Information Center

Ashcraft, Rindee G. P.; Foster, Sharon L.; Lowery, Amy E.; Henggeler, Scott W.; Chapman, Jason E.; Rowland, Melisa D.

2011-01-01

A better understanding of clinicians' attitudes toward evidence-based treatments (EBT) will presumably enhance the transfer of EBTs for substance-abusing adolescents from research to clinical application. The reliability and validity of two measures of therapist attitudes toward EBT were examined: the Evidence-Based Practice Attitude Scale…
20 CFR 416.805 - When additional evidence may be required.

Code of Federal Regulations, 2011 CFR

2011-04-01

... 20 Employees' Benefits 2 2011-04-01 2011-04-01 false When additional evidence may be required. 416.805 Section 416.805 Employees' Benefits SOCIAL SECURITY ADMINISTRATION SUPPLEMENTAL SECURITY INCOME FOR THE AGED, BLIND, AND DISABLED Determination of Age § 416.805 When additional evidence may be...
Applicability Analysis of Validation Evidence for Biomedical Computational Models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pathmanathan, Pras; Gray, Richard A.; Romero, Vicente J.

Computational modeling has the potential to revolutionize medicine the way it transformed engineering. However, despite decades of work, there has only been limited progress to successfully translate modeling research to patient care. One major difficulty which often occurs with biomedical computational models is an inability to perform validation in a setting that closely resembles how the model will be used. For example, for a biomedical model that makes in vivo clinically relevant predictions, direct validation of predictions may be impossible for ethical, technological, or financial reasons. Unavoidable limitations inherent to the validation process lead to challenges in evaluating the credibility of biomedical model predictions. Therefore, when evaluating biomedical models, it is critical to rigorously assess applicability, that is, the relevance of the computational model, and its validation evidence to the proposed context of use (COU). However, there are no well-established methods for assessing applicability. In this paper, we present a novel framework for performing applicability analysis and demonstrate its use with a medical device computational model. The framework provides a systematic, step-by-step method for breaking down the broad question of applicability into a series of focused questions, which may be addressed using supporting evidence and subject matter expertise. The framework can be used for model justification, model assessment, and validation planning. While motivated by biomedical models, it is relevant to a broad range of disciplines and underlying physics. Finally, the proposed applicability framework could help overcome some of the barriers inherent to validation of, and aid clinical implementation of, biomedical models.
Applicability Analysis of Validation Evidence for Biomedical Computational Models

DOE PAGES

Pathmanathan, Pras; Gray, Richard A.; Romero, Vicente J.; ...

2017-09-07

Computational modeling has the potential to revolutionize medicine the way it transformed engineering. However, despite decades of work, there has only been limited progress to successfully translate modeling research to patient care. One major difficulty which often occurs with biomedical computational models is an inability to perform validation in a setting that closely resembles how the model will be used. For example, for a biomedical model that makes in vivo clinically relevant predictions, direct validation of predictions may be impossible for ethical, technological, or financial reasons. Unavoidable limitations inherent to the validation process lead to challenges in evaluating the credibility of biomedical model predictions. Therefore, when evaluating biomedical models, it is critical to rigorously assess applicability, that is, the relevance of the computational model, and its validation evidence to the proposed context of use (COU). However, there are no well-established methods for assessing applicability. In this paper, we present a novel framework for performing applicability analysis and demonstrate its use with a medical device computational model. The framework provides a systematic, step-by-step method for breaking down the broad question of applicability into a series of focused questions, which may be addressed using supporting evidence and subject matter expertise. The framework can be used for model justification, model assessment, and validation planning. While motivated by biomedical models, it is relevant to a broad range of disciplines and underlying physics. Finally, the proposed applicability framework could help overcome some of the barriers inherent to validation of, and aid clinical implementation of, biomedical models.
Validation of Reverse-Engineered and Additive-Manufactured Microsurgical Instrument Prototype.

PubMed

Singh, Ramandeep; Suri, Ashish; Anand, Sneh; Baby, Britty

2016-12-01

With advancements in imaging techniques, neurosurgical procedures are becoming highly precise and minimally invasive, thus demanding development of new ergonomically aesthetic instruments. Conventionally, neurosurgical instruments are manufactured using subtractive manufacturing methods. Such a process is complex, time-consuming, and impractical for prototype development and validation of new designs. Therefore, an alternative design process has been used utilizing blue light scanning, computer-aided designing, and additive manufacturing direct metal laser sintering (DMLS) for microsurgical instrument prototype development. Deviations of the DMLS-fabricated instrument were studied by superimposing scan data of the fabricated instrument on the computer-aided design model. Content and concurrent validity of the fabricated prototypes were assessed by a group of 15 neurosurgeons who performed sciatic nerve anastomosis in small laboratory animals. Comparative scoring was obtained for the control and study instrument. A t test was applied to the individual parameters, and P values for force (P < .0001) and surface roughness (P < .01) were found to be statistically significant. These 2 parameters were further analyzed using objective measures. Results indicate that additive manufacturing by DMLS provides an effective method for prototype development. However, direct application of these additive-manufactured instruments in the operating room requires further validation. © The Author(s) 2016.
Trainees' Perceptions of Feedback: Validity Evidence for Two FEEDME (Feedback in Medical Education) Instruments.

PubMed

Bing-You, Robert; Ramesh, Saradha; Hayes, Victoria; Varaklis, Kalli; Ward, Denham; Blanco, Maria

2018-01-01

results provide preliminary validity evidence of 2 novel feedback instruments. After further validation of both FEEDME instruments, sharing the results of the FEEDME-Culture instrument with educational leaders and faculty may improve the culture of feedback on specific educational rotations and at the institutional level. The FEEDME-Provider instrument could be useful for faculty development targeting feedback skills. Additional research studies could assess whether both instruments may be used to help learners receive feedback and prompt reflective learning.
Convergent Validity Evidence regarding the Validity of the Chilean Standards-Based Teacher Evaluation System

ERIC Educational Resources Information Center

Santelices, Maria Veronica; Taut, Sandy

2011-01-01

This paper describes convergent validity evidence regarding the mandatory, standards-based Chilean national teacher evaluation system (NTES). The study examined whether NTES identifies--and thereby rewards or punishes--the "right" teachers as high- or low-performing. We collected in-depth teaching performance data on a sample of 58…
Validity evidence for the Fundamentals of Laparoscopic Surgery (FLS) program as an assessment tool: a systematic review.

PubMed

Zendejas, Benjamin; Ruparel, Raaj K; Cook, David A

2016-02-01

The Fundamentals of Laparoscopic Surgery (FLS) program uses five simulation stations (peg transfer, precision cutting, loop ligation, and suturing with extracorporeal and intracorporeal knot tying) to teach and assess laparoscopic surgery skills. We sought to summarize evidence regarding the validity of scores from the FLS assessment. We systematically searched for studies evaluating the FLS as an assessment tool (last search update February 26, 2013). We classified validity evidence using the currently standard validity framework (content, response process, internal structure, relations with other variables, and consequences). From a pool of 11,628 studies, we identified 23 studies reporting validity evidence for FLS scores. Studies involved residents (n = 19), practicing physicians (n = 17), and medical students (n = 8), in specialties of general (n = 17), gynecologic (n = 4), urologic (n = 1), and veterinary (n = 1) surgery. Evidence was most common in the form of relations with other variables (n = 22, most often expert-novice differences). Only three studies reported internal structure evidence (inter-rater or inter-station reliability), two studies reported content evidence (i.e., derivation of assessment elements), and three studies reported consequences evidence (definition of pass/fail thresholds). Evidence nearly always supported the validity of FLS total scores. However, the loop ligation task lacks discriminatory ability. Validity evidence confirms expected relations with other variables and acceptable inter-rater reliability, but other validity evidence is sparse. Given the high-stakes use of this assessment (required for board eligibility), we suggest that more validity evidence is required, especially to support its content (selection of tasks and scoring rubric) and the consequences (favorable and unfavorable impact) of assessment.
Evaluating Existing and New Validity Evidence for the Academic Motivation Scale

ERIC Educational Resources Information Center

Fairchild, Amanda J.; Horst, S. Jeanne; Finney, Sara J.; Barron, Kenneth E.

2005-01-01

The current study evaluates existing and new validity evidence for the Academic Motivation Scale (AMS; Vallerand et al., 1992). We first provide a narrative review synthesizing past research, and then conduct a validity investigation of the scores from the measure. Data analysis using a sample of 1406 American college students provided construct…
Validity Evidence for the Neuro-Endoscopic Ventriculostomy Assessment Tool (NEVAT).

PubMed

Breimer, Gerben E; Haji, Faizal A; Cinalli, Giuseppe; Hoving, Eelco W; Drake, James M

2017-02-01

Growing demand for transparent and standardized methods for evaluating surgical competence prompted the construction of the Neuro-Endoscopic Ventriculostomy Assessment Tool (NEVAT). To provide validity evidence of the NEVAT by reporting on the tool's internal structure and its relationship with surgical expertise during simulation-based training. The NEVAT was used to assess performance of trainees and faculty at an international neuroendoscopy workshop. All participants performed an endoscopic third ventriculostomy (ETV) on a synthetic simulator. Participants were simultaneously scored by 2 raters using the NEVAT procedural checklist and global rating scale (GRS). Evidence of internal structure was collected by calculating interrater reliability and internal consistency of raters' scores. Evidence of relationships with other variables was collected by comparing the ETV performance of experts, experienced trainees, and novices using Jonckheere's test (evidence of construct validity). Thirteen experts, 11 experienced trainees, and 10 novices participated. The interrater reliability by the intraclass correlation coefficient for the checklist and GRS was 0.82 and 0.94, respectively. Internal consistency (Cronbach's α) for the checklist and the GRS was 0.74 and 0.97, respectively. Median scores with interquartile range on the checklist and GRS for novices, experienced trainees, and experts were 0.69 (0.58-0.86), 0.85 (0.63-0.89), and 0.85 (0.81-0.91) and 3.1 (2.5-3.8), 3.7 (2.2-4.3) and 4.6 (4.4-4.9), respectively. Jonckheere's test showed that the median checklist and GRS score increased with performer expertise (P = .04 and .002, respectively). This study provides validity evidence for the NEVAT to support its use as a standardized method of evaluating neuroendoscopic competence during simulation-based training. Copyright © 2016 by the Congress of Neurological Surgeons
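As a rough illustration of the internal-consistency statistic reported in the record above, Cronbach's α can be computed from a participants-by-items score matrix. The sketch below is only a minimal example: the function name `cronbach_alpha` and the score matrix are made up for illustration and are not NEVAT data or the authors' analysis code.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per participant,
    one column per item). Uses sample variances throughout."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Variance of each item (column) across participants.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Made-up scores: three participants, three perfectly consistent items.
example = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
]
alpha = cronbach_alpha(example)
print(f"alpha = {alpha:.2f}")
```

With items that rise and fall together exactly, as in this toy matrix, α reaches its maximum; real rating data would typically give lower values.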
Analytic Validation of Immunohistochemical Assays: A Comparison of Laboratory Practices Before and After Introduction of an Evidence-Based Guideline.

PubMed

Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Souers, Rhona J; Fatheree, Lisa A; Volmar, Keith E; Stuart, Lauren N; Nowak, Jan A; Astles, J Rex; Nakhleh, Raouf E

2017-09-01

Context: Laboratories must demonstrate analytic validity before any test can be used clinically, but studies have shown inconsistent practices in immunohistochemical assay validation. Objective: To assess changes in immunohistochemistry analytic validation practices after publication of an evidence-based laboratory practice guideline. Design: A survey on current immunohistochemistry assay validation practices and on the awareness and adoption of a recently published guideline was sent to subscribers enrolled in one of 3 relevant College of American Pathologists proficiency testing programs and to additional nonsubscribing laboratories that perform immunohistochemical testing. The results were compared with an earlier survey of validation practices. Results: Analysis was based on responses from 1085 laboratories that perform immunohistochemical staining. Of 1057 responses, 65.4% (691) were aware of the guideline recommendations before this survey was sent and 79.9% (550 of 688) of those have already adopted some or all of the recommendations. Compared with the 2010 survey, a significant number of laboratories now have written validation procedures for both predictive and nonpredictive marker assays and specifications for the minimum numbers of cases needed for validation. There was also significant improvement in compliance with validation requirements, with 99% (100 of 102) having validated their most recently introduced predictive marker assay, compared with 74.9% (326 of 435) in 2010. The difficulty in finding validation cases for rare antigens and resource limitations were cited as the biggest challenges in implementing the guideline. Conclusions: Dissemination of the 2014 evidence-based guideline validation practices had a positive impact on laboratory performance; some or all of the recommendations have been adopted by nearly 80% of respondents.
Validation of the Evidence-Based Practice Process Assessment Scale

ERIC Educational Resources Information Center

Rubin, Allen; Parrish, Danielle E.

2011-01-01

Objective: This report describes the reliability, validity, and sensitivity of a scale that assesses practitioners' perceived familiarity with, attitudes of, and implementation of the evidence-based practice (EBP) process. Method: Social work practitioners and second-year master of social works (MSW) students (N = 511) were surveyed in four sites…
Extending Validity Evidence for Multidimensional Measures of Coaching Competency

ERIC Educational Resources Information Center

Myers, Nicholas D.; Wolfe, Edward W.; Maier, Kimberly S.; Feltz, Deborah L.; Reckase, Mark D.

2006-01-01

This study extended validity evidence for multidimensional measures of coaching competency derived from the Coaching Competency Scale (CCS; Myers, Feltz, Maier, Wolfe, & Reckase, 2006) by examining use of the original rating scale structure and testing how measures related to satisfaction with the head coach within teams and between teams.…
Evidence of Concurrent Validity of SII Scores for Asian American College Students

ERIC Educational Resources Information Center

Hansen, Jo-Ida C.; Lee, W. Vanessa

2007-01-01

The validity of scores on the Strong Interest Inventory (SII) for Asian American college students has not been thoroughly investigated. This study examined the evidence of validity of the SII Occupational Scale scores for predicting college major choices of Asian American women and men and White women and men. The sample included 186 female and…
Assessing mental health clinicians' intentions to adopt evidence-based treatments: reliability and validity testing of the evidence-based treatment intentions scale.

PubMed

Williams, Nathaniel J

2016-05-05

Intentions play a central role in numerous empirically supported theories of behavior and behavior change and have been identified as a potentially important antecedent to successful evidence-based treatment (EBT) implementation. Despite this, few measures of mental health clinicians' EBT intentions exist and available measures have not been subject to thorough psychometric evaluation or testing. This paper evaluates the psychometric properties of the evidence-based treatment intentions (EBTI) scale, a new measure of mental health clinicians' intentions to adopt EBTs. The study evaluates the reliability and validity of inferences made with the EBTI using multi-method, multi-informant criterion variables collected over 12 months from a sample of 197 mental health clinicians delivering services in 13 mental health agencies. Structural, predictive, and discriminant validity evidence is assessed. Findings support the EBTI's factor structure (χ² = 3.96, df = 5, p = .556) and internal consistency reliability (α = .80). Predictive validity evidence was provided by robust and significant associations between EBTI scores and clinicians' observer-reported attendance at a voluntary EBT workshop at a 1-month follow-up (OR = 1.92, p < .05), self-reported EBT adoption at a 12-month follow-up (R² = .17, p < .001), and self-reported use of EBTs with clients at a 12-month follow-up (R² = .25, p < .001). Discriminant validity evidence was provided by small associations with clinicians' concurrently measured psychological work climate perceptions of functionality (R² = .06, p < .05), engagement (R² = .06, p < .05), and stress (R² = .00, ns). The EBTI is a practical and theoretically grounded measure of mental health clinicians' EBT intentions. Scores on the EBTI provide a basis for valid inferences regarding mental health clinicians' intentions to adopt EBTs. Discussion focuses on research and practice applications.
Diverse convergent evidence in the genetic analysis of complex disease: coordinating omic, informatic, and experimental evidence to better identify and validate risk factors

PubMed Central

2014-01-01

In omic research, such as genome wide association studies, researchers seek to repeat their results in other datasets to reduce false positive findings and thus provide evidence for the existence of true associations. Unfortunately this standard validation approach cannot completely eliminate false positive conclusions, and it can also mask many true associations that might otherwise advance our understanding of pathology. These issues beg the question: How can we increase the amount of knowledge gained from high throughput genetic data? To address this challenge, we present an approach that complements standard statistical validation methods by drawing attention to both potential false negative and false positive conclusions, as well as providing broad information for directing future research. The Diverse Convergent Evidence approach (DiCE) we propose integrates information from multiple sources (omics, informatics, and laboratory experiments) to estimate the strength of the available corroborating evidence supporting a given association. This process is designed to yield an evidence metric that has utility when etiologic heterogeneity, variable risk factor frequencies, and a variety of observational data imperfections might lead to false conclusions. We provide proof of principle examples in which DiCE identified strong evidence for associations that have established biological importance, when standard validation methods alone did not provide support. If used as an adjunct to standard validation methods this approach can leverage multiple distinct data types to improve genetic risk factor discovery/validation, promote effective science communication, and guide future research directions. PMID:25071867
What should we mean by empirical validation in hypnotherapy: evidence-based practice in clinical hypnosis.

PubMed

Alladin, Assen; Sabatini, Linda; Amundson, Jon K

2007-04-01

This paper briefly surveys the trend of and controversy surrounding empirical validation in psychotherapy. Empirical validation of hypnotherapy has paralleled the practice of validation in psychotherapy and the professionalization of clinical psychology, in general. This evolution in determining what counts as evidence for bona fide clinical practice has gone from theory-driven clinical approaches in the 1960s and 1970s through critical attempts at categorization of empirically supported therapies in the 1990s on to the concept of evidence-based practice in 2006. Implications of this progression in professional psychology are discussed in the light of hypnosis's current quest for validation and empirical accreditation.
Validity Evidence for Games as Assessment Environments. CRESST Report 773

ERIC Educational Resources Information Center

Delacruz, Girlie C.; Chung, Gregory K. W. K.; Baker, Eva L.

2010-01-01

This study provides empirical evidence of a highly specific use of games in education--the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also…

The Online Student Connectedness Survey: Evidence of Initial Construct Validity

ERIC Educational Resources Information Center

Zimmerman, Tekeisha; Nimon, Kim

2017-01-01

The Online Student Connectedness Survey (OSCS) was introduced to the academic community in 2012 as an instrument designed to measure feelings of connectedness between students participating in online degree and certification programs. The purpose of this study was to examine data from the instrument for initial evidence of validity and reliability…
A guideline for the validation of likelihood ratio methods used for forensic evidence evaluation.

PubMed

Meuwly, Didier; Ramos, Daniel; Haraksim, Rudolf

2017-07-01

This Guideline proposes a protocol for the validation of forensic evaluation methods at the source level, using the Likelihood Ratio framework as defined within the Bayes' inference model. In the context of the inference of identity of source, the Likelihood Ratio is used to evaluate the strength of the evidence for a trace specimen, e.g. a fingermark, and a reference specimen, e.g. a fingerprint, to originate from common or different sources. Some theoretical aspects of probabilities necessary for this Guideline were discussed prior to its elaboration, which started after a workshop of forensic researchers and practitioners involved in this topic. In the workshop, the following questions were addressed: "which aspects of a forensic evaluation scenario need to be validated?", "what is the role of the LR as part of a decision process?" and "how to deal with uncertainty in the LR calculation?". The question "what to validate?" focuses on the validation methods and criteria, and "how to validate?" deals with the implementation of the validation protocol. Answers to these questions were deemed necessary to meet several objectives. First, concepts typical for validation standards [1], such as performance characteristics, performance metrics and validation criteria, will be adapted or applied by analogy to the LR framework. Second, a validation strategy will be defined. Third, validation methods will be described. Finally, a validation protocol and an example of validation report will be proposed, which can be applied to the forensic fields developing and validating LR methods for the evaluation of the strength of evidence at source level under the following propositions. Copyright © 2016. Published by Elsevier B.V.
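For readers unfamiliar with the Likelihood Ratio framework named in the record above, the basic quantity is the probability of the evidence under the same-source proposition divided by its probability under the different-source proposition; under Bayes' rule, posterior odds equal prior odds multiplied by the LR. The sketch below only illustrates that arithmetic. The function names and all probabilities are hypothetical, and it does not reproduce the Guideline's validation protocol.

```python
import math

def likelihood_ratio(p_evidence_same_source, p_evidence_diff_source):
    """LR = P(E | same source) / P(E | different sources)."""
    if p_evidence_diff_source <= 0:
        raise ValueError("denominator probability must be positive")
    return p_evidence_same_source / p_evidence_diff_source

def posterior_odds(prior_odds, lr):
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    return prior_odds * lr

# Hypothetical values chosen only to show the calculation.
lr = likelihood_ratio(0.8, 0.02)
odds = posterior_odds(1 / 100, lr)
print(f"LR = {lr:.1f}, log10(LR) = {math.log10(lr):.2f}, posterior odds = {odds:.2f}")
```

An LR above 1 supports the same-source proposition and an LR below 1 supports the different-source proposition; the log10 scale is often used so that support in either direction is symmetric around zero.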
Reliability and Validity of the Evidence-Based Practice Confidence (EPIC) Scale

ERIC Educational Resources Information Center

Salbach, Nancy M.; Jaglal, Susan B.; Williams, Jack I.

2013-01-01

Introduction: The reliability, minimal detectable change (MDC), and construct validity of the evidence-based practice confidence (EPIC) scale were evaluated among physical therapists (PTs) in clinical practice. Methods: A longitudinal mail survey was conducted. Internal consistency and test-retest reliability were estimated using Cronbach's alpha…
easyCBM® Reading Criterion Related Validity Evidence: Grades K-1. Technical Report #1309

ERIC Educational Resources Information Center

Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald

2013-01-01

In this technical report, we present the results of a study to gather criterion-related evidence for Grade K-1 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Dynamic Indicators of Basic Early Literacy…
Do short courses in evidence based medicine improve knowledge and skills? Validation of Berlin questionnaire and before and after study of courses in evidence based medicine

PubMed Central

Fritsche, L; Greenhalgh, T; Falck-Ytter, Y; Neumayer, H-H; Kunz, R

2002-01-01

Objective: To develop and validate an instrument for measuring knowledge and skills in evidence based medicine and to investigate whether short courses in evidence based medicine lead to a meaningful increase in knowledge and skills. Design: Development and validation of an assessment instrument and before and after study. Setting: Various postgraduate short courses in evidence based medicine in Germany. Participants: The instrument was validated with experts in evidence based medicine, postgraduate doctors, and medical students. The effect of courses was assessed by postgraduate doctors from medical and surgical backgrounds. Intervention: Intensive 3 day courses in evidence based medicine delivered through tutor facilitated small groups. Main outcome measure: Increase in knowledge and skills. Results: The questionnaire distinguished reliably between groups with different expertise in evidence based medicine. Experts attained a threefold higher average score than students. Postgraduates who had not attended a course performed better than students but significantly worse than experts. Knowledge and skills in evidence based medicine increased after the course by 57% (mean score before course 6.3 (SD 2.9) v 9.9 (SD 2.8), P<0.001). No difference was found among experts or students in absence of an intervention. Conclusions: The instrument reliably assessed knowledge and skills in evidence based medicine. An intensive 3 day course in evidence based medicine led to a significant increase in knowledge and skills. What is already known on this topic: Numerous observational studies have investigated the impact of teaching evidence based medicine to healthcare professionals, with conflicting results. Most of the studies were of poor methodological quality. What this study adds: An instrument assessing basic knowledge and skills required for practising evidence based medicine was developed and validated. An intensive 3 day course on evidence based medicine for doctors from various backgrounds
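The 57% increase quoted in the record above follows directly from the two reported mean scores (6.3 before the course, 9.9 after). A minimal check of that arithmetic, using a hypothetical helper name:

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (after - before) / before

# Mean scores as reported in the abstract.
increase = percent_change(6.3, 9.9)
print(f"{increase:.0f}%")
```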
Latency-Based and Psychophysiological Measures of Sexual Interest Show Convergent and Concurrent Validity.

PubMed

Ó Ciardha, Caoilte; Attard-Johnson, Janice; Bindemann, Markus

2018-04-01

Latency-based measures of sexual interest require additional evidence of validity, as do newer pupil dilation approaches. A total of 102 community men completed six latency-based measures of sexual interest. Pupillary responses were recorded during three of these tasks and in an additional task where no participant response was required. For adult stimuli, there was a high degree of intercorrelation between measures, suggesting that tasks may be measuring the same underlying construct (convergent validity). In addition to being correlated with one another, measures also predicted participants' self-reported sexual interest, demonstrating concurrent validity (i.e., the ability of a task to predict a more validated, simultaneously recorded, measure). Latency-based and pupillometric approaches also showed preliminary evidence of concurrent validity in predicting both self-reported interest in child molestation and viewing pornographic material containing children. Taken together, the study findings build on the evidence base for the validity of latency-based and pupillometric measures of sexual interest.
easyCBM® Reading Criterion Related Validity Evidence: Grades 2-5. Technical Report #1310

ERIC Educational Resources Information Center

Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald

2013-01-01

In this technical report, we present the results of a study to gather criterion-related evidence for Grade 2-5 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Gates-MacGinitie Reading Tests and the Dynamic…
Performance of a cognitive load inventory during simulated handoffs: Evidence for validity.

PubMed

Young, John Q; Boscardin, Christy K; van Dijk, Savannah M; Abdullah, Ruqayyah; Irby, David M; Sewell, Justin L; Ten Cate, Olle; O'Sullivan, Patricia S

2016-01-01

Advancing patient safety during handoffs remains a public health priority. The application of cognitive load theory offers promise, but is currently limited by the inability to measure cognitive load types. To develop and collect validity evidence for a revised self-report inventory that measures cognitive load types during a handoff. Based on prior published work, input from experts in cognitive load theory and handoffs, and a think-aloud exercise with residents, a revised Cognitive Load Inventory for Handoffs was developed. The Cognitive Load Inventory for Handoffs has items for intrinsic, extraneous, and germane load. Second- and sixth-year students recruited from a Dutch medical school participated in four simulated handoffs (two simple and two complex cases). At the end of each handoff, study participants completed the Cognitive Load Inventory for Handoffs, Paas' Cognitive Load Scale, and one global rating item for intrinsic load, extraneous load, and germane load, respectively. Factor and correlational analyses were performed to collect evidence for validity. Confirmatory factor analysis yielded a single factor that combined intrinsic and germane loads. The extraneous load items performed poorly and were removed from the model. The score from the combined intrinsic and germane load items associated, as predicted by cognitive load theory, with a commonly used measure of overall cognitive load (Pearson's r = 0.83, p < 0.001), case complexity (beta = 0.74, p < 0.001), level of experience (beta = -0.96, p < 0.001), and handoff accuracy (r = -0.34, p < 0.001). These results offer encouragement that intrinsic load during handoffs may be measured via a self-report measure. Additional work is required to develop an adequate measure of extraneous load.
Logic Brightens My Day: Evidence for Implicit Sensitivity to Logical Validity

ERIC Educational Resources Information Center

Trippas, Dries; Handley, Simon J.; Verde, Michael F.; Morsanyi, Kinga

2016-01-01

A key assumption of dual process theory is that reasoning is an explicit, effortful, deliberative process. The present study offers evidence for an implicit, possibly intuitive component of reasoning. Participants were shown sentences embedded in logically valid or invalid arguments. Participants were not asked to reason but instead rated the…
An Australian Version of the Neighborhood Environment Walkability Scale: Validity Evidence

ERIC Educational Resources Information Center

Cerin, Ester; Leslie, Eva; Owen, Neville; Bauman, Adrian

2008-01-01

This study examined validity evidence for the Australian version of the Neighborhood Environment Walkability Scale (NEWS-AU). A stratified two-stage cluster sampling design was used to recruit 2,650 adults from Adelaide (Australia). The sample was drawn from residential addresses within eight high-walkable and eight low-walkable suburbs matched…
Patient-Reported Outcome Measures for Hand and Wrist Trauma: Is There Sufficient Evidence of Reliability, Validity, and Responsiveness?

PubMed

Dacombe, Peter Jonathan; Amirfeyz, Rouin; Davis, Tim

2016-03-01

Patient-reported outcome measures (PROMs) are important tools for assessing outcomes following injuries to the hand and wrist. Many commonly used PROMs have no evidence of reliability, validity, and responsiveness in a hand and wrist trauma population. This systematic review examines the PROMs used in the assessment of hand and wrist trauma patients, and the evidence for reliability, validity, and responsiveness of each measure in this population. A systematic review of Pubmed, Medline, and CINAHL searching for randomized controlled trials of patients with traumatic injuries to the hand and wrist was carried out to identify the PROMs. For each identified PROM, evidence of reliability, validity, and responsiveness was identified using a further systematic review of the Pubmed, Medline, CINAHL, and reverse citation trail audit procedure. The PROM used most often was the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire; the Patient-Rated Wrist Evaluation (PRWE), Gartland and Werley score, Michigan Hand Outcomes score, Mayo Wrist Score, and Short Form 36 were also commonly used. Only the DASH and PRWE have evidence of reliability, validity, and responsiveness in patients with traumatic injuries to the hand and wrist; other measures either have incomplete evidence or evidence gathered in a nontraumatic population. The DASH and PRWE both have evidence of reliability, validity, and responsiveness in a hand and wrist trauma population. Other PROMs used to assess hand and wrist trauma patients do not. This should be considered when selecting a PROM for patients with traumatic hand and wrist pathology.
Validity-Supporting Evidence of the Self-Efficacy for Teaching Mathematics Instrument

ERIC Educational Resources Information Center

McGee, Jennifer R.; Wang, Chuang

2014-01-01

The purpose of this study is to provide evidence of reliability and validity of the Self-Efficacy for Teaching Mathematics Instrument (SETMI). Self-efficacy, as defined by Bandura, was the theoretical framework for the development of the instrument. The complex belief systems of mathematics teachers, as touted by Ernest, provided insights into the…
Screening tool for oropharyngeal dysphagia in stroke - Part I: evidence of validity based on the content and response processes.

PubMed

Almeida, Tatiana Magalhães de; Cola, Paula Cristina; Pernambuco, Leandro de Araújo; Magalhães, Hipólito Virgílio; Magnoni, Carlos Daniel; Silva, Roberta Gonçalves da

2017-08-17

The aim of the present study was to identify the evidence of validity based on the content and response process of the Rastreamento de Disfagia Orofaríngea no Acidente Vascular Encefálico (RADAVE; "Screening Tool for Oropharyngeal Dysphagia in Stroke"). The criteria used to elaborate the questions were based on a literature review. A group of judges consisting of 19 different health professionals evaluated the relevance and representativeness of the questions, and the results were analyzed using the Content Validity Index. To gather validity evidence based on the response processes, 23 health professionals administered the screening tool and analyzed the questions using a structured scale and cognitive interview. The RADAVE was structured to be applied in two stages. The first version consisted of 18 questions in stage I and 11 questions in stage II. Eight questions in stage I and four in stage II did not reach the minimum Content Validity Index, requiring reformulation by the authors. The cognitive interview demonstrated some misconceptions. New adjustments were made and the final version was produced with 12 questions in stage I and six questions in stage II. It was possible to develop a screening tool for dysphagia in stroke with adequate evidence of validity based on content and response processes. Both sources of validity evidence obtained so far allowed the screening tool to be adjusted in relation to its construct. Future studies will analyze the other sources of validity evidence and the measures of accuracy.
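As a rough illustration of how an item-level Content Validity Index can be computed from judges' ratings, the sketch below takes the proportion of judges who rate a question as relevant. The abstract does not report the rating scale or the minimum index used, so the 4-point scale, the rule that ratings of 3 or 4 count as relevant, the 0.78 cutoff, and the function names are all assumptions made for this example; the ratings are invented and are not data from the study.

```python
def item_cvi(ratings, relevant_min=3):
    """Proportion of judges whose rating is at or above relevant_min.

    Assumes a 4-point relevance scale where 3 and 4 mean "relevant";
    the abstract does not state the scale actually used.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(1 for r in ratings if r >= relevant_min) / len(ratings)


def flag_items(ratings_by_item, cutoff=0.78, relevant_min=3):
    """Return the items whose item-level CVI falls below the cutoff.

    The 0.78 cutoff is an assumed value, not the study's threshold.
    """
    return [
        item
        for item, ratings in ratings_by_item.items()
        if item_cvi(ratings, relevant_min) < cutoff
    ]


# Invented ratings from five hypothetical judges.
example = {
    "Q1": [4, 4, 3, 4, 3],  # all five judges rate the question relevant
    "Q2": [4, 2, 3, 1, 2],  # only two of five judges rate it relevant
}
flagged = flag_items(example)
```

Questions returned by `flag_items` would be the candidates for reformulation, in the way the abstract describes for the questions that did not reach the minimum index.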
[Job stress and quality of life of primary care health-workers: evidence of validity of the PECVEC questionnaire].

PubMed

Fernández-López, Juan Antonio; Fernández-Fidalgo, María; Martín-Payo, Rubén; Rödel, Andreas

2007-08-01

To evaluate the relationship between Health-Related Quality of Life (HRQL) and stress at work among Primary Care workers, as evidence of the construct validity of the Spanish version (PECVEC) of the profile of quality of life in the chronically ill (PLC) questionnaire. In addition, to check its other psychometric properties. Cross-sectional study. Eighteen primary care centres in Health Area IV, Asturias (Oviedo), Spain, sharing similar socio-demographic conditions. Two hundred and thirty-three primary care nurses and physicians. HRQL was evaluated by the 6 general dimensions of the Spanish version of the PLC. Stress at work was evaluated by the three scales of the Effort-Reward Imbalance (ERI) questionnaire. The construct validity of the PECVEC was assessed by testing the inverse associations between the QoL dimensions and the job stress scales, with the most important confounding variables controlled for. The non-response rate was low (<3%), and no floor effects and only small ceiling effects were observed. Internal consistency analysis and exploratory and confirmatory factor analysis demonstrated high reliability, factorial validity and convergent/divergent validity of the PECVEC. The PECVEC demonstrates adequate psychometric properties for evaluating HRQL in healthy subjects.
Evidence flow graph methods for validation and verification of expert systems

NASA Technical Reports Server (NTRS)

Becker, Lee A.; Green, Peter G.; Bhatnagar, Jayant

1989-01-01

The results of an investigation into the use of evidence flow graph techniques for performing validation and verification of expert systems are given. A translator to convert horn-clause rule bases into evidence flow graphs, a simulation program, and methods of analysis were developed. These tools were then applied to a simple rule base which contained errors. It was found that the method was capable of identifying a variety of problems, for example that the order of presentation of input data or small changes in critical parameters could affect the output from a set of rules.
Evidence flow graph methods for validation and verification of expert systems

NASA Technical Reports Server (NTRS)

Becker, Lee A.; Green, Peter G.; Bhatnagar, Jayant

1988-01-01

This final report describes the results of an investigation into the use of evidence flow graph techniques for performing validation and verification of expert systems. This was approached by developing a translator to convert horn-clause rule bases into evidence flow graphs, a simulation program, and methods of analysis. These tools were then applied to a simple rule base which contained errors. It was found that the method was capable of identifying a variety of problems, for example that the order of presentation of input data or small changes in critical parameters could affect the output from a set of rules.
Reliability and validity evidence of the Assessment of Language Use in Social Contexts for Adults (ALUSCA).

PubMed

Valente, Ana Rita S; Hall, Andreia; Alvelos, Helena; Leahy, Margaret; Jesus, Luis M T

2018-04-12

The appropriate use of language in context depends on the speaker's pragmatic language competencies. A coding system was used to develop a specific, adult-focused, self-administered questionnaire for adults who stutter and adults who do not stutter, the Assessment of Language Use in Social Contexts for Adults, with three categories: precursors, basic exchanges, and extended literal/non-literal discourse. This paper presents the content validity, item analysis, reliability coefficients and evidence of construct validity of the instrument. Content validity analysis was based on a two-stage process: first, 11 pragmatic questionnaires were assessed to identify items that probe each pragmatic competency and to create the first version of the instrument; second, items were assessed qualitatively by an expert panel composed of adults who stutter and controls, and quantitatively and qualitatively by an expert panel composed of clinicians. A pilot study was conducted with five adults who stutter and five controls to analyse items and calculate reliability. Construct validity evidence was obtained using the hypothesized relationships method and factor analysis with 28 adults who stutter and 28 controls. Concerning content validity, the questionnaires assessed up to 13 pragmatic competencies. Qualitative and quantitative analysis revealed ambiguities in item construction. Disagreement between experts was resolved through item modification. The pilot study showed that the instrument presented internal consistency and temporal stability. Significant differences between adults who stutter and controls and different response profiles revealed the instrument's underlying construct. The instrument is reliable and presented evidence of construct validity.
Adaptation and validation of the Evidence-Based Practice Belief and Implementation scales for French-speaking Swiss nurses and allied healthcare providers.

PubMed

Verloo, Henk; Desmedt, Mario; Morin, Diane

2017-09-01

To evaluate two psychometric properties of the French versions of the Evidence-Based Practice Beliefs and Evidence-Based Practice Implementation scales, namely their internal consistency and construct validity. The Evidence-Based Practice Beliefs and Evidence-Based Practice Implementation scales developed by Melnyk et al. are recognised as valid, reliable instruments in English. However, no psychometric validation for their French versions existed. Secondary analysis of a cross sectional survey. Source data came from a cross-sectional descriptive study sample of 382 nurses and other allied healthcare providers. Cronbach's alpha was used to evaluate internal consistency, and principal axis factor analysis and varimax rotation were computed to determine construct validity. The French Evidence-Based Practice Beliefs and Evidence-Based Practice Implementation scales showed excellent reliability, with Cronbach's alphas close to the scores established by Melnyk et al.'s original versions. Principal axis factor analysis showed medium-to-high factor loading scores without obtaining collinearity. Principal axis factor analysis with varimax rotation of the 16-item Evidence-Based Practice Beliefs scale resulted in a four-factor loading structure. Principal axis factor analysis with varimax rotation of the 17-item Evidence-Based Practice Implementation scale revealed a two-factor loading structure. Further research should attempt to understand why the French Evidence-Based Practice Implementation scale showed a two-factor loading structure but Melnyk et al.'s original has only one. The French versions of the Evidence-Based Practice Beliefs and Evidence-Based Practice Implementation scales can both be considered valid and reliable instruments for measuring Evidence-Based Practice beliefs and implementation. The results suggest that the French Evidence-Based Practice Beliefs and Evidence-Based Practice Implementation scales are valid and reliable and can therefore be used to
Validating the Watson Glaser Critical Thinking Appraisal

ERIC Educational Resources Information Center

Hassan, Karma El; Madhum, Ghida

2007-01-01

This study validated the Watson Glaser Critical Thinking Appraisal (WGCTA) on a sample of 273 private university students in Lebanon. For that purpose, evidence for construct validation was investigated through identifying the test's factor structure and subscale total correlations, in addition to differences in scores by gender, different levels,…
Definitions and validation criteria for biomarkers and surrogate endpoints: development and testing of a quantitative hierarchical levels of evidence schema.

PubMed

Lassere, Marissa N; Johnson, Kent R; Boers, Maarten; Tugwell, Peter; Brooks, Peter; Simon, Lee; Strand, Vibeke; Conaghan, Philip G; Ostergaard, Mikkel; Maksymowych, Walter P; Landewe, Robert; Bresnihan, Barry; Tak, Paul-Peter; Wakefield, Richard; Mease, Philip; Bingham, Clifton O; Hughes, Michael; Altman, Doug; Buyse, Marc; Galbraith, Sally; Wells, George

2007-03-01

There are clear advantages to using biomarkers and surrogate endpoints, but concerns about clinical and statistical validity and systematic methods to evaluate these aspects hinder their efficient application. Our objective was to review the literature on biomarkers and surrogates to develop a hierarchical schema that systematically evaluates and ranks the surrogacy status of biomarkers and surrogates; and to obtain feedback from stakeholders. After a systematic search of Medline and Embase on biomarkers, surrogate (outcomes, endpoints, markers, indicators), intermediate endpoints, and leading indicators, a quantitative surrogate validation schema was developed and subsequently evaluated at a stakeholder workshop. The search identified several classification schemas and definitions. Components of these were incorporated into a new quantitative surrogate validation level of evidence schema that evaluates biomarkers along 4 domains: Target, Study Design, Statistical Strength, and Penalties. Scores derived from 3 domains (the Target that the marker is being substituted for, the Design of the (best) evidence, and the Statistical Strength) are additive. Penalties are then applied if there is serious counterevidence. A total score (0 to 15) determines the level of evidence, with Level 1 the strongest and Level 5 the weakest. It was proposed that the term "surrogate" be restricted to markers attaining Levels 1 or 2 only. Most stakeholders agreed that this operationalization of the National Institutes of Health definitions of biomarker, surrogate endpoint, and clinical endpoint was useful. Further development and application of this schema provides incentives and guidance for effective biomarker and surrogate endpoint research, and more efficient drug discovery, development, and approval.

Criminal profiling as expert witness evidence: The implications of the profiler validity research.

PubMed

Kocsis, Richard N; Palermo, George B

The use and development of the investigative tool colloquially known as criminal profiling has steadily increased over the past five decades throughout the world. Coupled with this growth has been a diversification in the suggested range of applications for this technique. Possibly the most notable of these has been the attempted transition of the technique from a tool intended to assist police investigations into a form of expert witness evidence admissible in legal proceedings. Whilst case law in various jurisdictions has considered with mutual disinclination the evidentiary admissibility of criminal profiling, a disjunction has evolved between these judicial examinations and the scientifically vetted research testing the accuracy (i.e., validity) of the technique. This article offers an analysis of the research directly testing the validity of the criminal profiling technique and the extant legal principles considering its evidentiary admissibility. This analysis reveals that research findings concerning the validity of criminal profiling are surprisingly compatible with the extant legal principles. The overall conclusion is that a discrete form of crime behavioural analysis is supported by the profiler validity research and could be regarded as potentially admissible expert witness evidence. Finally, a number of theoretical connections are also identified concerning the skills and qualifications of individuals who may feasibly provide such expert testimony. Copyright © 2016 Elsevier Ltd. All rights reserved.
What Counts as Validity Evidence? Examples and Prevalence in a Systematic Review of Simulation-Based Assessment

ERIC Educational Resources Information Center

Cook, David A.; Zendejas, Benjamin; Hamstra, Stanley J.; Hatala, Rose; Brydges, Ryan

2014-01-01

Ongoing transformations in health professions education underscore the need for valid and reliable assessment. The current standard for assessment validation requires evidence from five sources: content, response process, internal structure, relations with other variables, and consequences. However, researchers remain uncertain regarding the types…
75 FR 25763 - Addition to the List of Validated End-Users: Advanced Micro Devices China, Inc.

Federal Register 2010, 2011, 2012, 2013, 2014

2010-05-10

.... Additional Validated End-User in the PRC and Its Respective ``Eligible Items (By ECCN)'' and ``Eligible... to the ``development'' of products under ECCN 4A003). This authorization was made based on an... Country Validated end-user Eligible items (by ECCN) Eligible destination China (People's Republic of...
Preliminary evidence for validity of the Bahasa Indonesian version of Study Process Questionnaire.

PubMed

Liem, Arief Darmanegara; Prasetya, Paulus Hidajat

2007-02-01

This study provides preliminary evidence for the validity of the Bahasa Indonesian version of the Study Process Questionnaire (BI-SPQ) from a sample of 147 psychology students (22 men and 125 women; M age = 21.8 yr., SD = 1.3). The internal consistency alphas of the BI-SPQ subscales were found to range from .46 (Surface Strategy) to .77 (Deep Strategy), with a median of .67. Principal component analysis indicated a two-factor solution, where the Deep and Achieving subscales loaded onto Factor 1 and the Surface subscales loaded on Factor 2. Students' GPAs were associated negatively with Surface Motive (r = -.24) and were associated positively with Deep and Achieving Motives (rs = .20). Further studies with larger samples involving students majoring in other disciplines are needed to provide further evidence of the validity of the BI-SPQ.
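For readers unfamiliar with the internal consistency coefficient reported here, the following is a minimal sketch of Cronbach's alpha, computed as k/(k-1) × (1 − sum of item variances / variance of total scores) for a subscale of k items. The function name and the toy responses are illustrative assumptions; nothing below reproduces the study's data or its analysis software.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for one subscale.

    scores: list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each respondent needs the same k >= 2 item scores")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented two-item responses from three hypothetical respondents.
perfectly_consistent = [[1, 1], [2, 2], [3, 3]]
partly_consistent = [[1, 2], [2, 1], [3, 3]]
```

With these toy inputs, `cronbach_alpha(perfectly_consistent)` evaluates to 1.0 and `cronbach_alpha(partly_consistent)` to about 0.67, which shows how disagreement between items lowers the coefficient.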
Voices from Test-Takers: Further Evidence for Language Assessment Validation and Use

ERIC Educational Resources Information Center

Cheng, Liying; DeLuca, Christopher

2011-01-01

Test-takers' interpretations of validity as related to test constructs and test use have been widely debated in large-scale language assessment. This study contributes further evidence to this debate by examining 59 test-takers' perspectives in writing large-scale English language tests. Participants wrote about their test-taking experiences in…
A Brazilian Portuguese Survey of School Climate: Evidence of Validity and Reliability

ERIC Educational Resources Information Center

Bear, George G.; Holst, Bruna; Lisboa, Carolina; Chen, Dandan; Yang, Chunyan; Chen, Fang Fang

2016-01-01

This study presents evidence of the validity and reliability of scores for the newly developed Brazilian Portuguese version of the Delaware School Climate Survey-Student (Brazilian DSCS-S). The sample consisted of 378 students, grades 5 through 9, attending four private and three public schools in southern Brazil. Confirmatory factor analyses…
Evidence for the construct validity of self-motivation as a correlate of exercise adherence in French older adults.

PubMed

André, Nathalie; Dishman, Rod K

2012-04-01

Exercise adherence involves a number of sociocognitive factors that influence the adoption and maintenance of regular physical activity. Among trait-like factors, self-motivation is believed to be a unique predictor of persistence during behavior change. The aim of this study was to validate the factor structure of a French version of the Self-Motivation Inventory (SMI) and to provide initial convergent and discriminant evidence for its construct validity as a correlate of exercise adherence. Four hundred seventy-one elderly were recruited and administered the SMI-10. Structural equation modeling tested the relation of SMI-10 scores with exercise adherence in a correlated network that included decisional balance and perceived quality of life. Acceptable evidence was found to support the factor validity and measurement equivalence of the French version of the SMI-10. Moreover, self-motivation was related to exercise adherence independently of decisional balance and perceived quality of life, providing initial evidence for construct validity.
"Compacted" procedures for adults' simple addition: A review and critique of the evidence.

PubMed

Chen, Yalin; Campbell, Jamie I D

2018-04-01

We review recent empirical findings and arguments proffered as evidence that educated adults solve elementary addition problems (3 + 2, 4 + 1) using so-called compacted procedures (e.g., unconscious, automatic counting), a conclusion that could have significant pedagogical implications. We begin with the large-sample experiment reported by Uittenhove, Thevenot and Barrouillet (2016, Cognition, 146, 289-303), which tested 90 adults on the 81 single-digit addition problems from 1 + 1 to 9 + 9. They identified the 12 very-small addition problems with different operands both ≤ 4 (e.g., 4 + 3) as a distinct subgroup of problems solved by unconscious, automatic counting: These items yielded a near-perfectly linear increase in answer response time (RT) yoked to the sum of the operands. Using the data reported in the article, however, we show that there are clear violations of the sum-counting model's predictions among the very-small addition problems, and that there is no real RT boundary associated with addends ≤ 4. Furthermore, we show that a well-known associative retrieval model of addition facts, the network interference theory (Campbell, 1995), predicts the results observed for these problems with high precision. We also review the other types of evidence adduced for the compacted procedure theory of simple addition and conclude that these findings are unconvincing in their own right and only distantly consistent with automatic counting. We conclude that the cumulative evidence for fast compacted procedures for adults' simple addition does not justify revision of the long-standing assumption that direct memory retrieval is ultimately the most efficient process of simple addition for nonzero problems, let alone sufficient to recommend significant changes to basic addition pedagogy.
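The problem set described in this abstract can be enumerated directly. The sketch below assumes that "different operands both ≤ 4" means non-tie problems in which operand order matters (so 3 + 4 and 4 + 3 count separately), an interpretation that yields the 12 problems the abstract mentions. It then groups them by sum, the quantity to which the sum-counting model ties response time. The variable names are illustrative, and no response-time values are included because the abstract reports none.

```python
from collections import defaultdict

# All ordered operand pairs from 1 + 1 to 9 + 9.
all_problems = [(a, b) for a in range(1, 10) for b in range(1, 10)]

# Assumed reading of "very-small" problems: both operands <= 4 and not a tie.
very_small = [(a, b) for a, b in all_problems if a <= 4 and b <= 4 and a != b]

# Group the very-small problems by the sum of their operands.
by_sum = defaultdict(list)
for a, b in very_small:
    by_sum[a + b].append((a, b))

for total in sorted(by_sum):
    print(total, by_sum[total])
```

Under a strict sum-counting account, problems listed under the same sum would be expected to share similar response times, which is the kind of prediction the reviewers report testing against the published data.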
Validity Evidence for the Test of Silent Reading Efficiency and Comprehension (TOSREC)

ERIC Educational Resources Information Center

Johnson, Evelyn S.; Pool, Juli L.; Carter, Deborah R.

2011-01-01

An essential component of a response to intervention (RTI) framework is a screening process that is both accurate and efficient. The purpose of this study was to analyze the validity evidence for the "Test of Silent Reading Efficiency and Comprehension" (TOSREC) to determine its potential for use within a screening process. Participants included…
Factor analysis methods and validity evidence: A systematic review of instrument development across the continuum of medical education

NASA Astrophysics Data System (ADS)

Wetzel, Angela Payne

Previous systematic reviews indicate a lack of reporting of reliability and validity evidence in subsets of the medical education literature. Psychology and general education reviews of factor analysis also indicate gaps between current and best practices; yet, a comprehensive review of exploratory factor analysis in instrument development across the continuum of medical education had not been previously identified. Therefore, the purpose of this study was a critical review of instrument development articles employing exploratory factor or principal component analysis published in medical education (2006-2010) to describe and assess the reporting of methods and validity evidence based on the Standards for Educational and Psychological Testing and factor analysis best practices. Data extraction of 64 articles measuring a variety of constructs that have been published throughout the peer-reviewed medical education literature indicates significant errors in the translation of exploratory factor analysis best practices to current practice. Further, techniques for establishing validity evidence tend to derive from a limited scope of methods including reliability statistics to support internal structure and support for test content. Instruments reviewed for this study lacked supporting evidence based on relationships with other variables and response process, and evidence based on consequences of testing was not evident. Findings suggest a need for further professional development within the medical education researcher community related to (1) appropriate factor analysis methodology and reporting and (2) the importance of pursuing multiple sources of reliability and validity evidence to construct a well-supported argument for the inferences made from the instrument. Medical education researchers and educators should be cautious in adopting instruments from the literature and carefully review available evidence. Finally, editors and reviewers are encouraged to recognize
Screening for depressive symptoms in adolescents at school: New validity evidences on the short form of the Reynolds Depression Scale.

PubMed

Ortuño-Sierra, Javier; Aritio-Solana, Rebeca; Inchausti, Félix; Chocarro de Luis, Edurne; Lucas Molina, Beatriz; Pérez de Albéniz, Alicia; Fonseca-Pedrero, Eduardo

2017-01-01

The main purpose of the present study was to assess depressive symptomatology and to gather new validity evidence for the Reynolds Depression Scale-Short form (RADS-SF) in a representative sample of youths. The sample consisted of 2914 adolescents with a mean age of 15.85 years (SD = 1.68). We calculated the descriptive statistics and internal consistency of the RADS-SF scores. Also, confirmatory factor analyses (CFAs) at the item level and successive multigroup CFAs to test measurement invariance were conducted. Latent mean differences across gender and educational level groups were estimated, and finally, we studied the sources of validity evidence with other external variables. The level of internal consistency of the RADS-SF Total score by means of Ordinal alpha was .89. Results from CFAs showed that the one-dimensional model displayed appropriate goodness-of-fit indices, with a CFI value over .95 and an RMSEA value under .08. In addition, the results support the strong measurement invariance of the RADS-SF scores across gender and age. When latent means were compared, statistically significant differences were found by gender and age. Females scored 0.347 higher than males on the Depression latent variable, whereas older adolescents scored 0.111 higher than the younger group. In addition, the RADS-SF score was associated with the RADS scores. The results suggest that the RADS-SF could be used as an efficient screening test to assess self-reported depressive symptoms in adolescents from the general population.
Evaluating Evidence Regarding Relationships with Criteria

ERIC Educational Resources Information Center

Balkin, Richard S.

2017-01-01

An overview of standards related to demonstrating evidence regarding relationships with criteria as it pertains to instrument development was presented, along with heuristic examples. Additional measures and a comprehensive design are necessary to establish evidence related to the use and interpretation of test scores for the validation of a…
The behavioral regulation in sport questionnaire (BRSQ): instrument development and initial validity evidence.

PubMed

Lonsdale, Chris; Hodge, Ken; Rose, Elaine A

2008-06-01

The purpose of the four studies described in this article was to develop and test a new measure of competitive sport participants' intrinsic motivation, extrinsic motivation, and amotivation (self-determination theory; Deci & Ryan, 1985). The items for the new measure, named the Behavioral Regulation in Sport Questionnaire (BRSQ), were constructed using interviews, expert review, and pilot testing. Analyses supported the internal consistency, test-retest reliability, and factorial validity of the BRSQ scores. Nomological validity evidence was also supportive, as BRSQ subscale scores were correlated in the expected pattern with scores derived from measures of motivational consequences. When directly compared with scores derived from the Sport Motivation Scale (SMS; Pelletier, Fortier, Vallerand, Tuson, & Blais, 1995) and a revised version of that questionnaire (SMS-6; Mallett, Kawabata, Newcombe, Otero-Forero, & Jackson, 2007), BRSQ scores demonstrated equal or superior reliability and factorial validity as well as better nomological validity.
Investigating Attitudes toward Physical Education: Validation across Two Instruments

ERIC Educational Resources Information Center

Donovan, Corinne Baron; Mercier, Kevin; Phillips, Sharon R.

2015-01-01

The Centers for Disease Control have suggested that physical education plays a role in promoting healthy lifestyles. Prior research suggests a link between attitudes toward physical education and physical activity outside school. The current study provides additional evidence of construct validity through a validation across two instruments…
Project on the Good Physician: Further Evidence for the Validity of a Moral Intuitionist Model of Virtuous Caring.

PubMed

Leffel, G Michael; Oakes Mueller, Ross A; Ham, Sandra A; Karches, Kyle E; Curlin, Farr A; Yoon, John D

2018-01-19

In the Project on the Good Physician, the authors propose a moral intuitionist model of virtuous caring that places the virtues of Mindfulness, Empathic Compassion, and Generosity at the heart of medical character education. Hypothesis 1a: The virtues of Mindfulness, Empathic Compassion, and Generosity will be positively associated with one another (convergent validity). Hypothesis 1b: The virtues of Mindfulness and Empathic Compassion will explain variance in the action-related virtue of Generosity beyond that predicted by Big Five personality traits alone (discriminant validity). Hypothesis 1c: Virtuous students will experience greater well-being ("flourishing"), as measured by four indices of well-being: life meaning, life satisfaction, vocational identity, and vocational calling (predictive validity). Hypothesis 1d: Students who self-report higher levels of the virtues will be nominated by their peers for the Gold Humanism Award (predictive validity). Hypothesis 2a-2c: Neuroticism and Burnout will be positively associated with each other and inversely associated with measures of virtue and well-being. The authors used data from a 2011 nationally representative sample of U.S. medical students (n = 499) in which medical virtues (Mindfulness, Empathic Compassion, and Generosity) were measured using scales adapted from existing instruments with validity evidence. Supporting the predictive validity of the model, virtuous students were recognized by their peers to be exemplary doctors, and they were more likely to have higher ratings on measures of student well-being. Supporting the discriminant validity of the model, virtues predicted prosocial behavior (Generosity) more than personality traits alone, and students higher in the virtue of Mindfulness were less likely to be high in Neuroticism and Burnout. Data from this descriptive-correlational study offered additional support for the validity of the moral intuitionist model of virtuous caring. Applied to medical
Validity Evidence for a Serious Game to Assess Performance on Critical Pediatric Emergency Medicine Scenarios.

PubMed

Gerard, James M; Scalzo, Anthony J; Borgman, Matthew A; Watson, Christopher M; Byrnes, Chelsie E; Chang, Todd P; Auerbach, Marc; Kessler, David O; Feldman, Brian L; Payne, Brian S; Nibras, Sohail; Chokshi, Riti K; Lopreiato, Joseph O

2018-06-01

We developed a first-person serious game, PediatricSim, to teach and assess performances on seven critical pediatric scenarios (anaphylaxis, bronchiolitis, diabetic ketoacidosis, respiratory failure, seizure, septic shock, and supraventricular tachycardia). In the game, players are placed in the role of a code leader and direct patient management by selecting from various assessment and treatment options. The objective of this study was to obtain supportive validity evidence for the PediatricSim game scores. Game content was developed by 11 subject matter experts and followed the American Heart Association's 2011 Pediatric Advanced Life Support Provider Manual and other authoritative references. Sixty subjects with three different levels of experience were enrolled to play the game. Before game play, subjects completed a 40-item written pretest of knowledge. Game scores were compared between subject groups using scoring rubrics developed for the scenarios. Validity evidence was established and interpreted according to Messick's framework. Content validity was supported by a game development process that involved expert experience, focused literature review, and pilot testing. Subjects rated the game favorably for engagement, realism, and educational value. Interrater agreement on game scoring was excellent (intraclass correlation coefficient = 0.91, 95% confidence interval = 0.89-0.9). Game scores were higher for attendings followed by residents then medical students (Pc < 0.01) with large effect sizes (1.6-4.4) for each comparison. There was a very strong, positive correlation between game and written test scores (r = 0.84, P < 0.01). These findings contribute validity evidence for PediatricSim game scores to assess knowledge of pediatric emergency medicine resuscitation.
Development and Validation of the Evidence-Based Practice Process Assessment Scale: Preliminary Findings

ERIC Educational Resources Information Center

Rubin, Allen; Parrish, Danielle E.

2010-01-01

Objective: This report describes the development and preliminary findings regarding the reliability, validity, and sensitivity of a scale that has been developed to assess practitioners' perceived familiarity with, attitudes about, and implementation of the phases of the evidence-based practice (EBP) process. Method: After a panel of national…
Internal Factor Structure and Convergent Validity Evidence: The Self-Report Version of Self-Regulation Strategy Inventory

ERIC Educational Resources Information Center

Cleary, Timothy J.; Dembitzer, Leah; Kettler, Ryan J.

2015-01-01

Using a sample of 348 middle school students, we gathered evidence regarding the internal consistency of scores, as well as the internal factor structure and convergent validity evidence for inferences from a self-report questionnaire called the Self-Regulation Strategy Inventory-Self Report. Confirmatory factor analysis revealed that the fit…
Construct Validity Evidence for Single-Response Items to Estimate Physical Activity Levels in Large Sample Studies

ERIC Educational Resources Information Center

Jackson, Allen W.; Morrow, James R., Jr.; Bowles, Heather R.; FitzGerald, Shannon J.; Blair, Steven N.

2007-01-01

Valid measurement of physical activity is important for studying the risks for morbidity and mortality. The purpose of this study was to examine evidence of construct validity of two similar single-response items assessing physical activity via self-report. Both items are based on the stages of change model. The sample was 687 participants (men =…
The validation of forensic DNA extraction systems to utilize soil contaminated biological evidence.

PubMed

Kasu, Mohaimin; Shires, Karen

2015-07-01

The production of full DNA profiles from biological evidence found in soil has a high failure rate due largely to the inhibitory substance humic acid (HA). Abundant in various natural soils, HA co-extracts with DNA during extraction and inhibits DNA profiling by binding to the molecular components of the genotyping assay. To successfully utilize traces of soil-contaminated evidence, such as that found at many murder and rape crime scenes in South Africa, a reliable HA removal extraction system would often be selected based on previous validation studies. However, for many standard forensic DNA extraction systems, peer-reviewed publications detailing their efficacy on soil evidence are either lacking or incomplete. Consequently, these sample types are often not collected or fail to yield suitable DNA material due to the use of unsuitable methodology. The aim of this study was to validate the common forensic DNA collection and extraction systems used in South Africa, namely DNA IQ, FTA elute and Nucleosave, for processing blood and saliva contaminated with HA. A forensically appropriate volume of biological evidence was spiked with HA (0, 0.5, 1.5 and 2.5 mg/ml) and processed through each extraction protocol for the evaluation of HA removal using QPCR and STR-genotyping. The DNA IQ magnetic bead system effectively removed HA from highly contaminated blood and saliva, and generated consistently acceptable STR profiles from both artificially spiked samples and crude soil samples. This system is highly recommended for use on soil-contaminated evidence over the cellulose card-based systems currently being preferentially used for DNA sample collection. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

Validity Evidence for the Measurement of the Strength of Motivation for Medical School

ERIC Educational Resources Information Center

Kusurkar, Rashmi; Croiset, Gerda; Kruitwagen, Cas; ten Cate, Olle

2011-01-01

The Strength of Motivation for Medical School (SMMS) questionnaire is designed to determine the strength of motivation of students particularly for medical study. This research was performed to establish the validity evidence for measuring strength of motivation for medical school. Internal structure and relations to other variables were used as…
Measuring Students' Motivation: Validity Evidence for the MUSIC Model of Academic Motivation Inventory

ERIC Educational Resources Information Center

Jones, Brett D.; Skaggs, Gary

2016-01-01

This study provides validity evidence for the MUSIC Model of Academic Motivation Inventory (MUSIC Inventory; Jones, 2012), which measures college students' beliefs related to the five components of the MUSIC Model of Motivation (MUSIC model; Jones, 2009). The MUSIC model is a conceptual framework for five categories of teaching strategies (i.e.,…
Evaluating Evidence for Conceptually Related Constructs Using Bivariate Correlations

ERIC Educational Resources Information Center

Swank, Jacqueline M.; Mullen, Patrick R.

2017-01-01

The article serves as a guide for researchers in developing evidence of validity using bivariate correlations, specifically construct validity. The authors outline the steps for calculating and interpreting bivariate correlations. Additionally, they provide an illustrative example and discuss the implications.
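To illustrate the kind of calculation such a guide walks through, the following is a minimal sketch of Pearson's bivariate correlation in plain Python. The function name `pearson_r` and the two score lists are hypothetical and are not taken from the article.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each observation from its own mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical scores on two conceptually related scales (illustrative only).
scale_a = [1, 2, 3, 4, 5]
scale_b = [2, 4, 6, 8, 10]
r = pearson_r(scale_a, scale_b)
```

A coefficient near +1 or -1 indicates a strong linear relationship between the two sets of scores; how large a value counts as adequate convergent evidence depends on the constructs being compared.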
Validation of the Portuguese version of the Evidence-Based Practice Questionnaire

PubMed Central

Pereira, Rui Pedro Gomes; Guerra, Ana Cristina Pinheiro; Cardoso, Maria José da Silva Peixoto de Oliveira; dos Santos, Alzira Teresa Vieira Martins Ferreira; de Figueiredo, Maria do Céu Aguiar Barbieri; Carneiro, António Cândido Vaz

2015-01-01

OBJECTIVES: to describe the process of translation and linguistic and cultural validation of the Evidence Based Practice Questionnaire for the Portuguese context: Questionário de Eficácia Clínica e Prática Baseada em Evidências (QECPBE). METHOD: a methodological and cross-sectional study was developed. The translation and back translation were performed according to traditional standards. Principal Components Analysis with orthogonal rotation according to the Varimax method was used to verify the QECPBE's psychometric characteristics, followed by confirmatory factor analysis. Internal consistency was determined by Cronbach's alpha. Data were collected between December 2013 and February 2014. RESULTS: 358 nurses delivering care in a hospital facility in the North of Portugal participated in the study. QECPBE contains 20 items and three subscales: Practice (α=0.74); Attitudes (α=0.75); Knowledge/Skills and Competencies (α=0.95), presenting an overall internal consistency of α=0.74. The tested model explained 55.86% of the variance and presented good fit: χ2(167)=520.009; p = 0.0001; χ2/df=3.114; CFI=0.908; GFI=0.865; PCFI=0.798; PGFI=0.678; RMSEA=0.077 (CI90%=0.07-0.08). CONCLUSION: confirmatory factor analysis revealed the questionnaire is valid and appropriate to be used in the studied context. PMID:26039307
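The ratio of chi-square to degrees of freedom reported above can be checked by simple arithmetic. The sketch below uses only the two values stated in the abstract (520.009 and 167); the function name `normed_chi_square` is a hypothetical helper introduced for illustration.

```python
def normed_chi_square(chi_square, df):
    """Return the chi-square statistic divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return chi_square / df

# Values as reported in the abstract: chi-square(167) = 520.009.
ratio = normed_chi_square(520.009, 167)
print(f"{ratio:.3f}")
```

Rounded to three decimals, the result agrees with the reported value of 3.114.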
Examining validity evidence for a simulation-based assessment tool for basic robotic surgical skills.

PubMed

Havemann, Maria Cecilie; Dalsgaard, Torur; Sørensen, Jette Led; Røssaak, Kristin; Brisling, Steffen; Mosgaard, Berit Jul; Høgdall, Claus; Bjerrum, Flemming

2018-05-14

Increasing focus on patient safety makes it important to ensure surgical competency among surgeons before operating on patients. The objective was to gather validity evidence for a virtual-reality simulator test for robotic surgical skills and evaluate its potential as a training tool. Surgeons with varying experience in robotic surgery were recruited: novices (zero procedures), intermediates (1-50), experienced (> 50). Five experienced surgeons rated five exercises on the da Vinci Skills Simulator. Participants were tested using the five exercises. Participants were invited back 3 times and completed a total of 10 attempts per exercise. The outcome was the average simulator performance score for the 5 exercises. 32 participants from 5 surgical specialties were included. 38 participants completed all 4 sessions. A moderate correlation between the average total score and robotic experience was identified for the first attempt (Spearman r = 0.58; p = 0.0004). A difference in average total score was observed between novices and intermediates [median score 61% (IQR 52-66) vs. 83% (IQR 75-91), adjusted p < 0.0001], as well as novices and experienced [median score 61% (IQR 52-66) vs. 80% (IQR 69-85), adjusted p = 0.002]. All three groups improved their performance between the 1st and 10th attempts (p < 0.00). This study describes validity evidence for a virtual-reality simulator for basic robotic surgical skills, which can be used for assessment of basic competency and as a training tool. However, more validity evidence is needed before it can be used for certification or high-stakes assessment.
Evidence that Additions of Grignard Reagents to Aliphatic Aldehydes Do Not Involve Single-Electron-Transfer Processes.

PubMed

Otte, Douglas A L; Woerpel, K A

2015-08-07

Addition of allylmagnesium reagents to an aliphatic aldehyde bearing a radical clock gave only addition products and no evidence of ring-opened products that would suggest single-electron-transfer reactions. The analogous Barbier reaction also did not provide evidence for a single-electron-transfer mechanism in the addition step. Other Grignard reagents (methyl-, vinyl-, t-Bu-, and triphenylmethylmagnesium halides) also do not appear to add to an alkyl aldehyde by a single-electron-transfer mechanism.
Construct Validity Evidence for Bracken School Readiness Assessment, Third Edition, Spanish Form Scores

ERIC Educational Resources Information Center

Ortiz, Arlene; Clinton, Amanda; Schaefer, Barbara A.

2015-01-01

Convergent and discriminant validity evidence was examined for scores on the Spanish Record Form of the Bracken School Readiness Assessment, Third Edition (BSRA-3). Participants included a sample of 68 Hispanic, Spanish-speaking children ages 4 to 5 years enrolled in preschool programs in Puerto Rico. Scores obtained from the BSRA-3 Spanish Record…
Validity evidence of non-technical skills assessment instruments in simulated anaesthesia crisis management.

PubMed

Jirativanont, T; Raksamani, K; Aroonpruksakul, N; Apidechakul, P; Suraseranivongse, S

2017-07-01

We sought to evaluate the validity of two non-technical skills evaluation instruments, the Anaesthetists' Non-Technical Skills (ANTS) behavioural marker system and the Ottawa Global Rating Scale (GRS), to apply them to anaesthesia training. The content validity, response process, internal structure, relations with other variables and consequences were described for validity evidence. Simulated crisis management sessions were initiated during which two trained raters evaluated the performance of postgraduate first-, second- and third-year (PGY-1, PGY-2 and PGY-3) anaesthesia residents. The study included 70 participants, composed of 24 PGY-1, 24 PGY-2 and 22 PGY-3 residents. Both instruments differentiated the non-technical skills of PGY-1 from PGY-3 residents (P < 0.05). Inter-rater agreement was measured using the intraclass correlation coefficient (ICC). For the ANTS instrument, the intraclass correlation coefficients for task management, team-working, situation awareness and decision-making were 0.79, 0.34, 0.81 and 0.70, respectively. For the Ottawa GRS, the intraclass correlation coefficients for overall performance, leadership, problem-solving, situation awareness, resource utilisation and communication skills were 0.86, 0.83, 0.84, 0.87, 0.80 and 0.86, respectively. The Cronbach's alpha for internal consistency of the ANTS instrument was 0.93, and was 0.96 for the Ottawa GRS. There was a high correlation between the ANTS and Ottawa GRS. The raters reported the ease of use of the Ottawa GRS compared to the ANTS. We found sufficient evidence of validity in the ANTS instrument and the Ottawa GRS for the evaluation of non-technical skills in a simulated anaesthesia setting, but the Ottawa GRS was more practical and had higher reliability.
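For readers unfamiliar with the internal-consistency statistic cited above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the small data sets are hypothetical; they are not the study's data or code.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of k lists, one per item, each holding the scores of the
    same n respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(scores) != n for scores in items):
        raise ValueError("every item needs the same number (>= 2) of scores")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical ratings: three items that rank four respondents identically.
consistent = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha = cronbach_alpha(consistent)
```

Items that order respondents identically give an alpha of 1, while items that share no covariance give an alpha of 0.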
Evaluating an Instrument to Measure Mental Load and Mental Effort Considering Different Sources of Validity Evidence

ERIC Educational Resources Information Center

Krell, Moritz

2017-01-01

This study evaluates a 12-item instrument for subjective measurement of mental load (ML) and mental effort (ME) by analysing different sources of validity evidence. The findings of an expert judgement (N = 8) provide "evidence based on test content" that the formulation of the items corresponds to the meaning of ML and ME. An empirical…
Additional Interventions to Enhance the Effectiveness of Individual Placement and Support: A Rapid Evidence Assessment

PubMed Central

Boycott, Naomi; Schneider, Justine; McMurran, Mary

2012-01-01

Topic. Additional interventions used to enhance the effectiveness of individual placement and support (IPS). Aim. To establish whether additional interventions improve the vocational outcomes of IPS alone for people with severe mental illness. Method. A rapid evidence assessment of the literature was conducted for studies where behavioural or psychological interventions have been used to supplement standard IPS. Published and unpublished empirical studies of IPS with additional interventions were considered for inclusion. Conclusions. Six published studies were found which compared IPS alone to IPS plus a supplementary intervention. Of these, three used skills training and three used cognitive remediation. The contribution of each discrete intervention is difficult to establish. Some evidence suggests that work-related social skills and cognitive training are effective adjuncts, but this is an area where large RCTs are required to yield conclusive evidence. PMID:22685665
Applying hospital evidence to paramedicine: issues of indirectness, validity and knowledge translation.

PubMed

Bigham, Blair; Welsford, Michelle

2015-05-01

The practice of emergency medicine (EM) has been intertwined with emergency medical services (EMS) for more than 40 years. In this commentary, we explore the practice of translating hospital based evidence into the prehospital setting. We will challenge both EMS and EM dogma-bringing hospital care to patients in the field is not always better. In providing examples of therapies championed in hospitals that have failed to translate into the field, we will discuss the unique prehospital environment, and why evidence from the hospital setting cannot necessarily be translated to the prehospital field. Paramedicine is maturing so that the capability now exists to conduct practice-specific research that can inform best practices. Before translation from the hospital environment is implemented, evidence must be evaluated by people with expertise in three domains: critical appraisal, EM, and EMS. Scientific evidence should be assessed for: quality and bias; directness, generalizability, and validity to the EMS population; effect size and anticipated benefit from prehospital application; feasibility (including economic evaluation, human resource availability in the mobile environment); and patient and provider safety.
A motor speech assessment for children with severe speech disorders: reliability and validity evidence.

PubMed

Strand, Edythe A; McCauley, Rebecca J; Weigand, Stephen D; Stoeckel, Ruth E; Baas, Becky S

2013-04-01

In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Participants were 81 children between 36 and 79 months of age who were referred to the Mayo Clinic for diagnosis of speech sound disorders. Children were given the DEMSS and a standard speech and language test battery as part of routine evaluations. Subsequently, intrajudge, interjudge, and test-retest reliability were evaluated for a subset of participants. Construct validity was explored for all 81 participants through the use of agglomerative cluster analysis, sensitivity measures, and likelihood ratios. The mean percentage of agreement for 171 judgments was 89% for test-retest reliability, 89% for intrajudge reliability, and 91% for interjudge reliability. Agglomerative hierarchical cluster analysis showed that total DEMSS scores largely differentiated clusters of children with CAS vs. mild CAS vs. other speech disorders. Positive and negative likelihood ratios and measures of sensitivity and specificity suggested that the DEMSS does not overdiagnose CAS but sometimes fails to identify children with CAS. The value of the DEMSS in differential diagnosis of severe speech impairments was supported on the basis of evidence of reliability and validity.
The Importance of Multi-Group Validity Evidence in Gifted and Talented Identification and Research

ERIC Educational Resources Information Center

Peters, Scott J.

2011-01-01

Practitioners and researchers often review the validity evidence of an instrument before using it for student assessment or in the practice of diagnosing or identifying children with exceptionalities. However, few test manuals present data on instrument measurement equivalence/ invariance or differential item functioning. This information is…
I Spy with My Little Eye: Jurors' Detection of Internal Validity Threats in Expert Evidence

PubMed Central

McAuliff, Bradley D.; Duckworth, Tejah D.

2010-01-01

This experiment examined whether jury-eligible community members (N = 223) were able to detect internally invalid psychological science presented at trial. Participants read a simulated child sexual abuse case in which the defense expert described a study he had conducted on witness memory and suggestibility. We varied the study's internal validity (valid, missing control group, confound, and experimenter bias) and publication status (published, unpublished). Expert evidence quality ratings were higher for the valid versus missing control group version only. Publication increased ratings of defendant guilt when the study was missing a control group. Variations in internal validity did not influence perceptions of child victim credibility or police interview quality. Participants' limited detection of internal validity threats underscores the need to examine the effectiveness of traditional legal safeguards against junk science in court and improve the scientific reasoning ability of lay people and legal professionals. PMID:20162342
Convergent validity evidence for the Pain and Discomfort Scale (PADS) for pain assessment among adults with intellectual disability

PubMed Central

Shinde, Satomi K.; Danov, Stacy; Chen, Chin-Chih; Clary, Jamie; Harper, Vicki; Bodfish, James W.; Symons, Frank J.

2014-01-01

Objectives: The main aim of the study was to generate initial convergent validity evidence for the Pain and Discomfort Scale (PADS) for use with non-verbal adults with intellectual disabilities (ID). Methods: Forty-four adults with intellectual disability (mean age = 46, 52% male) were evaluated using a standardized sham-controlled and blinded sensory testing protocol, from which FACS and PADS scores were tested for (1) sensitivity to an array of calibrated sensory stimuli, (2) specificity (active vs. sham trials), and (3) concordance. Results: The primary findings were that participants were reliably coded using both FACS and PADS approaches as being reactive to the sensory stimuli (FACS: F[2, 86] = 4.71, P < .05, PADS: F[2, 86] = 21.49, P < .05) (sensitivity evidence), not reactive during the sham stimulus trials (FACS: F[1, 43] = 3.77, p = .06, PADS: F[1, 43] = 5.87, p = .02) (specificity evidence), and there were significant (r = .41 – .51, p < .01) correlations between PADS and FACS (convergent validity evidence). Discussion: FACS is an objective coding platform for facial expression. It requires intensive training and resources for scoring. As such it may be limited for clinical application. PADS was designed for clinical application. PADS scores were comparable to FACS scores under controlled evaluation conditions providing partial convergent validity evidence for its use. PMID:24135902
Validity of CBCL-derived PTSD and dissociation scales: further evidence in a sample of neglected children and adolescents.

PubMed

Milot, Tristan; Plamondon, André; Ethier, Louise S; Lemelin, Jean-Pascal; St-Laurent, Diane; Rousseau, Michel

2013-05-01

There is growing evidence that child neglect is an important risk factor for posttraumatic stress disorder (PTSD) and dissociation. Considering that the Child Behavior Checklist (CBCL) is a widely used measure, the possibility of using validated CBCL-derived trauma symptoms scales could be particularly useful to better understand how trauma symptoms develop among neglected children and adolescents. This study examined the factor structure of three CBCL-derived measures of PTSD and dissociation (namely, PTSD scale, Dissociation scale, and PTSD/Dissociation scale) in a sample of 239 neglected children and adolescents aged 6 to 18 years using the latest version of CBCL (CBCL 6-18). Evidence of convergent validity of these scales was also examined for participants aged 12 and under using two well-validated measures of PTSD and Dissociation: the Trauma Symptoms Checklist for Young Children and the Child Dissociation Checklist. Findings suggest that CBCL-derived measures of trauma symptoms, especially PTSD and Dissociations scales, may be of heuristic value in the study of trauma symptomatology in neglected samples. Factor structure and evidence of convergent validity were supported for these two scales. Results also provide further support to the well-established assumption that PTSD and dissociation are two related but different constructs.
Accumulation of Content Validation Evidence for the Critical Thinking Self-Assessment Scale.

PubMed

Nair, Girija Gopinathan; Hellsten, Laurie-Ann M; Stamler, Lynnette Leeseberg

2017-04-01

Critical thinking skills (CTS) are essential for nurses; assessing students' acquisition of these skills is a mandate of nursing curricula. This study aimed to develop a self-assessment instrument of critical thinking skills (Critical Thinking Self-Assessment Scale [CTSAS]) for students' self-monitoring. An initial pool of 196 items across 6 core cognitive skills and 16 subskills were generated using the American Philosophical Association definition of CTS. Experts' content review of the items and their ratings provided evidence of content relevance using the item-level content validity index (I-CVI) and Aiken's content validity coefficient (VIk). 115 items were retained (range of I-CVI values = .70 to .94 and range of VIk values = .69-.95; significant at p< .05). The CTSAS is the first CTS instrument designed specifically for self-assessment purposes.
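The two content-validity indices named above follow simple formulas. The sketch below assumes a 4-point relevance scale on which ratings of 3 or 4 count as "relevant"; the abstract does not state the scale, so that choice, the function names, and the example ratings are all hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level content validity index: share of experts rating the item relevant."""
    if not ratings:
        raise ValueError("need at least one expert rating")
    return sum(1 for r in ratings if r >= relevant_min) / len(ratings)

def aiken_v(ratings, low=1, high=4):
    """Aiken's V: summed distance above the lowest category, scaled to 0..1."""
    if not ratings:
        raise ValueError("need at least one expert rating")
    if any(r < low or r > high for r in ratings):
        raise ValueError("rating outside the assumed scale")
    return sum(r - low for r in ratings) / (len(ratings) * (high - low))

# Hypothetical ratings of one item by five experts on a 1-4 relevance scale.
expert_ratings = [4, 4, 3, 2, 4]
cvi = item_cvi(expert_ratings)   # 4 of 5 experts rated the item 3 or 4
v = aiken_v(expert_ratings)      # (3 + 3 + 2 + 1 + 3) / (5 * 3)
```

Both indices range from 0 to 1, with higher values indicating stronger expert agreement that the item is relevant.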
Evidence of validity of the Stress-Producing Life Events (SPLE) instrument.

PubMed

Rizzini, Marta; Santos, Alcione Miranda Dos; Silva, Antônio Augusto Moura da

2018-01-01

OBJECTIVE: To evaluate the construct validity of a list of eight Stressful Life Events in pregnant women. METHODS: A cross-sectional study was conducted with 1,446 pregnant women in São Luís, MA, and 1,364 pregnant women in Ribeirão Preto, SP (BRISA cohort), from February 2010 to June 2011. In the exploratory factor analysis, the promax oblique rotation was used, and internal consistency was calculated using composite reliability. The construct validity was determined by means of confirmatory factor analysis with the weighted least squares estimation method adjusted by the mean and variance. RESULTS: The model with the best fit in the exploratory analysis was the one that retained three factors with a cumulative variance of 61.1%. The one-factor model did not obtain a good fit in either sample in the confirmatory analysis. The three-factor model called Stress-Producing Life Events presented a good fit (RMSEA < 0.05; CFI/TLI > 0.90) for both samples. CONCLUSIONS: The Stress-Producing Life Events constitute a second order construct with three dimensions related to health, personal and financial aspects and violence. This study found evidence that confirms the construct validity of a list of stressor events, entitled Stress-Producing Life Events Inventory.
The predictive validity of quality of evidence grades for the stability of effect estimates was low: a meta-epidemiological study.

PubMed

Gartlehner, Gerald; Dobrescu, Andreea; Evans, Tammeka Swinson; Bann, Carla; Robinson, Karen A; Reston, James; Thaler, Kylie; Skelly, Andrea; Glechner, Anna; Peterson, Kimberly; Kien, Christina; Lohr, Kathleen N

2016-02-01

To determine the predictive validity of the U.S. Evidence-based Practice Center (EPC) approach to GRADE (Grading of Recommendations Assessment, Development and Evaluation). Based on Cochrane reports with outcomes graded as high quality of evidence (QOE), we prepared 160 documents which represented different levels of QOE. Professional systematic reviewers dually graded the QOE. For each document, we determined whether estimates were concordant with high QOE estimates of the Cochrane reports. We compared the observed proportion of concordant estimates with the expected proportion from an international survey. To determine the predictive validity, we used the Hosmer-Lemeshow test to assess calibration and the C (concordance) index to assess discrimination. The predictive validity of the EPC approach to GRADE was limited. Estimates graded as high QOE were less likely, estimates graded as low or insufficient QOE more likely to remain stable than expected. The EPC approach to GRADE could not reliably predict the likelihood that individual bodies of evidence remain stable as new evidence becomes available. C-indices ranged between 0.56 (95% CI, 0.47 to 0.66) and 0.58 (95% CI, 0.50 to 0.67) indicating a low discriminatory ability. The limited predictive validity of the EPC approach to GRADE seems to reflect a mismatch between expected and observed changes in treatment effects as bodies of evidence advance from insufficient to high QOE. Copyright © 2016 Elsevier Inc. All rights reserved.
Validity Evidence in Scale Development: The Application of Cross Validation and Classification-Sequencing Validation

ERIC Educational Resources Information Center

Acar, Tülin

2014-01-01

In literature, it has been observed that many enhanced criteria are limited by factor analysis techniques. Besides examinations of statistical structure and/or psychological structure, such validity studies as cross validation and classification-sequencing studies should be performed frequently. The purpose of this study is to examine cross…

A Motor Speech Assessment for Children with Severe Speech Disorders: Reliability and Validity Evidence

ERIC Educational Resources Information Center

Strand, Edythe A.; McCauley, Rebecca J.; Weigand, Stephen D.; Stoeckel, Ruth E.; Baas, Becky S.

2013-01-01

Purpose: In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Method: Participants were 81 children between 36 and 79 months of age who were referred to the…
Evidence of Convergent and Discriminant Validity of Child, Teacher, and Peer Reports of Teacher-Student Support

PubMed Central

Li, Yan; Hughes, Jan N.; Kwok, Oi-man; Hsu, Hsien-Yuan

2012-01-01

This study investigated the construct validity of measures of teacher-student support in a sample of 709 ethnically diverse second and third grade academically at-risk students. Confirmatory factor analysis investigated the convergent and discriminant validities of teacher, child, and peer reports of teacher-student support and child conduct problems. Results supported the convergent and discriminant validity of scores on the measures. Peer reports accounted for the largest proportion of trait variance and non-significant method variance. Child reports accounted for the smallest proportion of trait variance and the largest method variance. A model with two latent factors provided a better fit to the data than a model with one factor, providing further evidence of the discriminant validity of measures of teacher-student support. Implications for research, policy, and practice are discussed. PMID:21767024
Appearance motives to tan and not tan: evidence for validity and reliability of a new scale.

PubMed

Cafri, Guy; Thompson, J Kevin; Roehrig, Megan; Rojas, Ariz; Sperry, Steffanie; Jacobsen, Paul B; Hillhouse, Joel

2008-04-01

Risk for skin cancer is increased by UV exposure and decreased by sun protection. Appearance reasons to tan and not tan have consistently been shown to be related to intentions and behaviors to UV exposure and protection. This study was designed to determine the factor structure of appearance motives to tan and not tan, evaluate the extent to which this factor structure is gender invariant, test for mean differences in the identified factors, and evaluate internal consistency, temporal stability, and criterion-related validity. Five hundred eighty-nine female and 335 male college students were used to test confirmatory factor analysis models within and across gender groups, estimate latent mean differences, and use the correlation coefficient and Cronbach's alpha to further evaluate the reliability and validity of the identified factors. A measurement invariant (i.e., factor-loading invariant) model was identified with three higher-order factors: sociocultural influences to tan (lower order factors: media, friends, family, significant others), appearance reasons to tan (general, acne, body shape), and appearance reasons not to tan (skin aging, immediate skin damage). Females had significantly higher means than males on all higher-order factors. All subscales had evidence of internal consistency, temporal stability, and criterion-related validity. This study offers a framework and measurement instrument that has evidence of validity and reliability for evaluating appearance-based motives to tan and not tan.
Additional Evidence for the Reliability and Validity of the Student Risk Screening Scale at the High School Level: A Replication and Extension

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy P.; Ennis, Robin Parks; Cox, Meredith Lucille; Schatschneider, Christopher; Lambert, Warren

2013-01-01

This study reports findings from a validation study of the Student Risk Screening Scale for use with 9th- through 12th-grade students (N = 1854) attending a rural fringe school. Results indicated high internal consistency, test-retest stability, and inter-rater reliability. Predictive validity was established across two academic years, with Spring…
Examining Evidence for the Validity of PISA Learning Strategy Scales Based on Student Response Processes

ERIC Educational Resources Information Center

Hopfenbeck, Therese N.; Maul, Andrew

2011-01-01

The aim of this study was to investigate response-process based evidence for the validity of the Programme for International Student Assessment's (PISA) self-report questionnaire scales as measures of specific psychological constructs, with a focus on scales meant to measure inclination toward specific learning strategies. Cognitive interviews (N…
Validity of premature ejaculation diagnostic tool and its association with International Index of Erectile Function-15 in Chinese men with evidence-based-defined premature ejaculation.

PubMed

Tang, Dong-Dong; Li, Chao; Peng, Dang-Wei; Zhang, Xian-Sheng

2018-01-01

The premature ejaculation diagnostic tool (PEDT) is a brief diagnostic measure to assess premature ejaculation (PE). However, there is insufficient evidence regarding its validity in the new evidence-based-defined PE. This study was performed to evaluate the validity of PEDT and its association with IIEF-15 in different types of evidence-based-defined PE. From June 2015 to January 2016, a total of 260 men complaining of PE and defined as lifelong PE (LPE)/acquired PE (APE) according to the evidence-based definition from Andrology Clinic of the First Affiliated Hospital of Anhui Medical University, along with 104 male healthy controls without PE from a medical examination center, were enrolled in this study. All individuals completed questionnaires including demographics, medical and sexual history, as well as PEDT and IIEF-15. After statistical analysis, it was found that men with PE reported higher PEDT scores (14.28 ± 3.05) and lower IIEF-15 scores (41.26 ± 8.20) than men without PE (PEDT: 5.32 ± 3.42, IIEF-15: 52.66 ± 6.86, P < 0.001 for both). Sensitivity and specificity analyses suggested that a score of ≥9 indicated PE in both LPE and APE (sensitivity: 0.875, 0.913; specificity: 0.865, 0.865, respectively). In addition, IIEF-15 scores were higher in men with LPE (42.64 ± 8.11) than APE (39.43 ± 7.84, P < 0.001). After adjusting for age, IIEF-15 was negatively related to PEDT in men with LPE (adjusted r = -0.225, P < 0.001) and APE (adjusted r = -0.378, P < 0.001). In this study, we concluded that PEDT was valid in the diagnosis of evidence-based-defined PE. Furthermore, IIEF-15 was negatively related to PEDT in men with different types of PE.
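As an illustration of how sensitivity and specificity follow from a score cut-off such as the ≥9 threshold reported above, here is a minimal sketch. The function name and both score lists are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Classify a score >= cutoff as positive and return (sensitivity, specificity)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    true_pos = sum(1 for s in case_scores if s >= cutoff)
    true_neg = sum(1 for s in control_scores if s < cutoff)
    return true_pos / len(case_scores), true_neg / len(control_scores)

# Hypothetical questionnaire scores (illustrative only).
cases = [14, 12, 9, 8, 16, 11, 10, 7]      # people with the condition
controls = [5, 3, 9, 6, 2, 8, 4, 10]       # people without the condition
sens, spec = sensitivity_specificity(cases, controls, cutoff=9)
```

Raising the cut-off generally trades sensitivity for specificity, which is why studies report both values at the chosen threshold.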
Preliminary Evidence for the Validity of the New Test of Everyday Reading Comprehension

ERIC Educational Resources Information Center

Wheldall, Kevin; McMurtry, Sarah

2014-01-01

The Test of Everyday Reading Comprehension (TERC) has recently been presented as an addition to the armoury of tests available for assessing the skills of low-progress readers. While comparison data for students of different ages are presented together with evidence for high test reliability, there is, as yet, no published evidence for its…
Procedure-specific assessment tool for flexible pharyngo-laryngoscopy: gathering validity evidence and setting pass-fail standards.

PubMed

Melchiors, Jacob; Petersen, K; Todsen, T; Bohr, A; Konge, Lars; von Buchwald, Christian

2018-06-01

The attainment of specific identifiable competencies is the primary measure of progress in the modern medical education system. To be feasible, the system therefore requires a method for accurately assessing competence. Evidence of validity needs to be gathered before an assessment tool can be implemented in the training and assessment of physicians. According to the contemporary theory on validity, this evidence must be gathered from specific sources in a structured and rigorous manner. The flexible pharyngo-laryngoscopy (FPL) is central to the otorhinolaryngologist. We aim to evaluate the flexible pharyngo-laryngoscopy assessment tool (FLEXPAT) created in a previous study and to establish a pass-fail level for proficiency. Eighteen physicians with different levels of experience (novices, intermediates, and experienced) were recruited to the study. Each performed an FPL on two patients. These procedures were video recorded, blinded, and assessed by two specialists. The score was expressed as the percentage of a possible max score. Cronbach's α was used to analyze internal consistency of the data, and a generalizability analysis was performed. The scores of the three different groups were explored, and a pass-fail level was determined using the contrasting groups standard-setting method. Internal consistency was strong with a Cronbach's α of 0.86. We found a generalizability coefficient of 0.72, sufficient for moderate-stakes assessment. We found a significant difference between the novice and experienced groups (p < 0.001) and a strong correlation between experience and score (Pearson's r = 0.75). The pass/fail level was established at 72% of the maximum score. Applying this pass-fail level in the test population resulted in half of the intermediate group receiving a failing score. We gathered validity evidence for the FLEXPAT according to the contemporary framework as described by Messick. Our results support a claim of validity and are
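Several records in this listing report Cronbach's α as a measure of internal consistency. A minimal sketch of the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores), is given below. The function name and the toy ratings are hypothetical and do not reproduce any study's data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table (one row per respondent)."""
    k = len(rows[0])  # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy ratings: four respondents, three items.
toy_ratings = [[3, 4, 3], [2, 2, 3], [4, 5, 5], [1, 2, 1]]
alpha = cronbach_alpha(toy_ratings)
```

The sketch assumes complete data and a total variance greater than zero; real analyses usually also handle missing responses and reverse-scored items.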
Workplace status: The development and validation of a scale.

PubMed

Djurdjevic, Emilija; Stoverink, Adam C; Klotz, Anthony C; Koopman, Joel; da Motta Veiga, Serge P; Yam, Kai Chi; Chiang, Jack Ting-Ju

2017-07-01

Research suggests that employee status, and various status proxies, relate to a number of meaningful outcomes in the workplace. The advancement of the study of status in organizational settings has, however, been stymied by the lack of a validated workplace status measure. The purpose of this manuscript, therefore, is to develop and validate a measure of workplace status based on a theoretically grounded definition of status in organizations. Subject-matter experts were used to examine the content validity of the measure. Then, 2 separate samples were employed to assess the psychometric properties (i.e., factor structure, reliability, convergent and discriminant validity) and nomological network of a 5-item, self-report Workplace Status Scale (WSS). To allow for methodological flexibility, an additional 3 samples were used to extend the WSS to coworker reports of a focal employee's status, provide additional evidence for the validity and reliability of the WSS, and to demonstrate consensus among coworker ratings. Together, these studies provide evidence of the psychometric soundness of the WSS for assessing employee status using either self-reports or other-source reports. The implications of the development of the WSS for the study of status in organizations are discussed, and suggestions for future research using the new measure are offered. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Assessing Perceived Emotional Intelligence in Adolescents: New Validity Evidence of Trait Meta-Mood Scale-24

ERIC Educational Resources Information Center

Pedrosa, Ignacio; Suárez-Álvarez, Javier; Lozano, Luis M.; Muñiz, José; García-Cueto, Eduardo

2014-01-01

Adolescence is a critical period of life during which significant psychosocial adjustment occurs and in which emotional intelligence plays an essential role. This article provides validity evidence for the Trait Meta-Mood Scale-24 (TMMS-24) scores based on an item response theory (IRT) approach. A sample of 2,693 Spanish adolescents (M = 16.52…
Comparison of consumer derived evidence with an omaha system evidence-based practice guideline for community dwelling older adults.

PubMed

Pruinelli, Lisiane; Fu, Helen; Monsen, Karen A; Westra, Bonnie L

2014-01-01

Consumer involvement in healthcare is critical to support continuity of care for consumers to manage their health while transitioning from one care setting to another. Validation of evidence-based practice (EBP) guidelines by consumers is essential to achieving consumer health goals over time in a way that is consistent with their needs and preferences. The purpose of this study was to compare an Omaha System EBP guideline for community dwelling older adults with consumer-derived evidence of their ongoing needs, resources, and strategies after home care discharge. All identified problems were relevant for all patients except for Neglect and Substance use. Ten additional problems were identified from the interviews, five of which affected at least 10% of the participants. Consumer-derived evidence both validated and expanded EBP guidelines, further emphasizing the importance of consumer involvement in the delivery of home healthcare.
Development and validation of the Evidence Based Medicine Questionnaire (EBMQ) to assess doctors' knowledge, practice and barriers regarding the implementation of evidence-based medicine in primary care.

PubMed

Hisham, Ranita; Ng, Chirk Jenn; Liew, Su May; Lai, Pauline Siew Mei; Chia, Yook Chin; Khoo, Ee Ming; Hanafi, Nik Sherina; Othman, Sajaratulnisah; Lee, Ping Yein; Abdullah, Khatijah Lim; Chinna, Karuthan

2018-06-23

Evidence-Based Medicine (EBM) integrates best available evidence from literature and patients' values, which then informs clinical decision making. However, there is a lack of validated instruments to assess the knowledge, practice and barriers of primary care physicians in the implementation of EBM. This study aimed to develop and validate an Evidence-Based Medicine Questionnaire (EBMQ) in Malaysia. The EBMQ was developed based on a qualitative study, literature review and an expert panel. Face and content validity was verified by the expert panel and piloted among 10 participants. Primary care physicians with or without EBM training who could understand English were recruited from December 2015 to January 2016. The EBMQ was administered at baseline and two weeks later. A higher score indicates better knowledge, better practice of EBM and fewer barriers towards the implementation of EBM. We hypothesized that the EBMQ would have three domains: knowledge, practice and barriers. The final version of the EBMQ consists of 80 items: 62 items were measured on a nominal scale, 22 items were measured on a 5-point Likert scale. Flesch reading ease was 61.2. A total of 343 participants were approached, of whom 320 agreed to participate (response rate = 93.2%). Factor analysis revealed that the EBMQ had eight domains after 13 items were removed: "EBM websites", "evidence-based journals", "types of studies", "terms related to EBM", "practice", "access", "patient preferences" and "support". Cronbach alpha for the overall EBMQ was 0.909, whilst the Cronbach alpha for the individual domains ranged from 0.657 to 0.940. The EBMQ was able to discriminate between doctors with and without EBM training for 24 out of 42 items. At test-retest, kappa values ranged from 0.155 to 0.620. The EBMQ was found to be a valid and reliable instrument to assess the knowledge, practice and barriers towards the implementation of EBM among primary care physicians in Malaysia.
Continuous track paths reveal additive evidence integration in multistep decision making.

PubMed

Buc Calderon, Cristian; Dewulf, Myrtille; Gevers, Wim; Verguts, Tom

2017-10-03

Multistep decision making pervades daily life, but its underlying mechanisms remain obscure. We distinguish four prominent models of multistep decision making, namely serial stage, hierarchical evidence integration, hierarchical leaky competing accumulation (HLCA), and probabilistic evidence integration (PEI). To empirically disentangle these models, we design a two-step reward-based decision paradigm and implement it in a reaching task experiment. In a first step, participants choose between two potential upcoming choices, each associated with two rewards. In a second step, participants choose between the two rewards selected in the first step. Strikingly, as predicted by the HLCA and PEI models, the first-step decision dynamics were initially biased toward the choice representing the highest sum/mean before being redirected toward the choice representing the maximal reward (i.e., initial dip). Only HLCA and PEI predicted this initial dip, suggesting that first-step decision dynamics depend on additive integration of competing second-step choices. Our data suggest that potential future outcomes are progressively unraveled during multistep decision making.
Consistency between direct and indirect trial evidence: is direct evidence always more reliable?

PubMed

Madan, Jason; Stevenson, Matt D; Cooper, Katy L; Ades, A E; Whyte, Sophie; Akehurst, Ron

2011-01-01

To present a case study involving the reduction in incidence of febrile neutropenia (FN) after chemotherapy with granulocyte colony-stimulating factors (G-CSFs), illustrating difficulties that may arise when following the common preference for direct evidence over indirect evidence. Evidence of the efficacy of treatments was identified from two previous systematic reviews. We used Bayesian evidence synthesis to estimate relative treatment effects based on direct evidence, indirect evidence, and both pooled together. We checked for inconsistency between direct and indirect evidence and explored the role of one specific trial using cross-validation. A subsequent review identified further studies not available at the time of the original analysis. We repeated the analyses on the enlarged evidence base. We found substantial inconsistency in the original evidence base. The median odds ratio of FN for primary pegfilgrastim versus no primary G-CSF was 0.06 (95% credible interval: 0.02-0.19) based on direct evidence, but 0.27 (95% credible interval: 0.13-0.53) based on indirect evidence (P value for consistency hypothesis 0.027). The additional trials were consistent with the earlier indirect, rather than the direct, evidence, and there was no inconsistency between direct and indirect estimates in the updated evidence. The earlier inconsistency was due to one trial comparing primary pegfilgrastim with no primary G-CSF. Predictive cross-validation showed that this study was inconsistent with the evidence as a whole and with other trials making this comparison. Both the Cochrane Handbook and the NICE Methods Guide express a preference for direct evidence. A more robust strategy, which is in line with the accepted principles of evidence synthesis, would be to combine all relevant and appropriate information, whether direct or indirect. Copyright © 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
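The abstract above compares direct and indirect estimates using Bayesian evidence synthesis. As a much simpler stand-in, the sketch below uses a frequentist adjusted indirect comparison (the Bucher approach) on the log odds ratio scale, followed by a z-test for inconsistency. It is not the authors' method, and all function names and inputs are hypothetical.

```python
import math
from typing import Tuple


def indirect_estimate(
    d_ab: float, se_ab: float, d_cb: float, se_cb: float
) -> Tuple[float, float]:
    """Indirect log odds ratio of A vs C through a common comparator B.

    d_ab and d_cb are log odds ratios (A vs B, C vs B); the variances add.
    """
    return d_ab - d_cb, math.sqrt(se_ab**2 + se_cb**2)


def inconsistency_test(
    d_direct: float, se_direct: float, d_indirect: float, se_indirect: float
) -> Tuple[float, float, float]:
    """Return (difference, z, two-sided p) for direct minus indirect estimates."""
    diff = d_direct - d_indirect
    se_diff = math.sqrt(se_direct**2 + se_indirect**2)
    z = diff / se_diff
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p


# Hypothetical inputs on the log odds ratio scale (not from the study).
d_ind, se_ind = indirect_estimate(d_ab=0.5, se_ab=0.1, d_cb=0.2, se_cb=0.1)
diff, z, p = inconsistency_test(0.3, 0.2, d_ind, se_ind)
```

A small p-value here would flag disagreement between the two sources of evidence, which is the kind of inconsistency the case study investigates with more elaborate models.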
Validation of an instrument to assess evidence-based practice knowledge, attitudes, access, and confidence in the dental environment.

PubMed

Hendricson, William D; Rugh, John D; Hatch, John P; Stark, Debra L; Deahl, Thomas; Wallmann, Elizabeth R

2011-02-01

This article reports the validation of an assessment instrument designed to measure the outcomes of training in evidence-based practice (EBP) in the context of dentistry. Four EBP dimensions are measured by this instrument: 1) understanding of EBP concepts, 2) attitudes about EBP, 3) evidence-accessing methods, and 4) confidence in critical appraisal. The instrument-the Knowledge, Attitudes, Access, and Confidence Evaluation (KACE)-has four scales, with a total of thirty-five items: EBP knowledge (ten items), EBP attitudes (ten), accessing evidence (nine), and confidence (six). Four elements of validity were assessed: consistency of items within the KACE scales (extent to which items within a scale measure the same dimension), discrimination (capacity to detect differences between individuals with different training or experience), responsiveness (capacity to detect the effects of education on trainees), and test-retest reliability. Internal consistency of scales was assessed by analyzing responses of second-year dental students, dental residents, and dental faculty members using Cronbach coefficient alpha, a statistical measure of reliability. Discriminative validity was assessed by comparing KACE scores for the three groups. Responsiveness was assessed by comparing pre- and post-training responses for dental students and residents. To measure test-retest reliability, the full KACE was completed twice by a class of freshman dental students seventeen days apart, and the knowledge scale was completed twice by sixteen faculty members fourteen days apart. Item-to-scale consistency ranged from 0.21 to 0.78 for knowledge, 0.57 to 0.83 for attitude, 0.70 to 0.84 for accessing evidence, and 0.87 to 0.94 for confidence. For discrimination, ANOVA and post hoc testing by the Tukey-Kramer method revealed significant score differences among students, residents, and faculty members consistent with education and experience levels. For responsiveness to training, dental students
Development and preliminary evidence for the validity of an instrument assessing implementation of human-factors principles in medication-related decision-support systems—I-MeDeSA

PubMed Central

Zachariah, Marianne; Seidling, Hanna M; Neri, Pamela M; Cresswell, Kathrin M; Duke, Jon; Bloomrosen, Meryl; Volk, Lynn A; Bates, David W

2011-01-01

Background Medication-related decision support can reduce the frequency of preventable adverse drug events. However, the design of current medication alerts often results in alert fatigue and high over-ride rates, thus reducing any potential benefits. Methods The authors previously reviewed human-factors principles for relevance to medication-related decision support alerts. In this study, instrument items were developed for assessing the appropriate implementation of these human-factors principles in drug–drug interaction (DDI) alerts. User feedback regarding nine electronic medical records was considered during the development process. Content validity, construct validity through correlation analysis, and inter-rater reliability were assessed. Results The final version of the instrument included 26 items associated with nine human-factors principles. Content validation on three systems resulted in the addition of one principle (Corrective Actions) to the instrument and the elimination of eight items. Additionally, the wording of eight items was altered. Correlation analysis suggests a direct relationship between system age and performance of DDI alerts (p=0.0016). Inter-rater reliability indicated substantial agreement between raters (κ=0.764). Conclusion The authors developed and gathered preliminary evidence for the validity of an instrument that measures the appropriate use of human-factors principles in the design and display of DDI alerts. Designers of DDI alerts may use the instrument to improve usability and increase user acceptance of medication alerts, and organizations selecting an electronic medical record may find the instrument helpful in meeting their clinicians' usability needs. PMID:21946241
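The abstract above reports inter-rater agreement as κ = 0.764 without stating which kappa statistic was used. Assuming two raters and categorical ratings, Cohen's kappa can be sketched as follows. The function name and toy ratings are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical toy ratings: 1 = principle implemented, 0 = not implemented.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

For more than two raters, or for ordinal ratings, a different statistic such as Fleiss' kappa or a weighted kappa would be the usual choice.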
Assessing the Culture of Residency Using the C - Change Resident Survey: Validity Evidence in 34 U.S. Residency Programs.

PubMed

Pololi, Linda H; Evans, Arthur T; Civian, Janet T; Shea, Sandy; Brennan, Robert T

2017-07-01

A practical instrument is needed to reliably measure the clinical learning environment and professionalism for residents. To develop and present evidence of validity of an instrument to assess the culture of residency programs and the clinical learning environment. During 2014-2015, we surveyed residents using the C - Change Resident Survey to assess residents' perceptions of the culture in their programs. Residents in all years of training in 34 programs in internal medicine, pediatrics, and general surgery in 14 geographically diverse public and private academic health systems. The C - Change Resident Survey assessed residents' perceptions of 13 dimensions of the culture: Vitality, Self-Efficacy, Institutional Support, Relationships/Inclusion, Values Alignment, Ethical/Moral Distress, Respect, Mentoring, Work-Life Integration, Gender Equity, Racial/Ethnic Minority Equity, and self-assessed Competencies. We measured the internal reliability of each of the 13 dimensions and evaluated response process, content validity, and construct-related evidence validity by assessing relationships predicted by our conceptual model and prior research. We also assessed whether the measurements were sensitive to differences in specialty and across institutions. A total of 1708 residents completed the survey [internal medicine: n = 956, pediatrics: n = 411, general surgery: n = 311 (51% women; 16% underrepresented in medicine minority)], with a response rate of 70% (range across programs, 51-87%). Internal consistency of each dimension was high (Cronbach α: 0.73-0.90). The instrument was able to detect significant differences in the learning environment across programs and sites. Evidence of validity was supported by a good response process and the demonstration of several relationships predicted by our conceptual model. The C - Change Resident Survey assesses the clinical learning environment for residents, and we encourage further study of validity in different
Trusting Teachers' Judgement: Research Evidence of the Reliability and Validity of Teachers' Assessment Used for Summative Purposes

ERIC Educational Resources Information Center

Harlen, Wynne

2005-01-01

This paper summarizes the findings of a systematic review of research on the reliability and validity of teachers' assessment used for summative purposes. In addition to the main question, the review also addressed the question "What conditions affect the reliability and validity of teachers' summative assessment?" The initial search for studies…
Validation of a short measure of effort-reward imbalance in the workplace: evidence from China.

PubMed

Li, Jian; Loerbroks, Adrian; Shang, Li; Wege, Natalia; Wahrendorf, Morten; Siegrist, Johannes

2012-01-01

Work stress is an emergent risk in occupational health in China, and its measurement is still a critical issue. The aim of this study was to examine the reliability and validity of a short version of the effort-reward imbalance (ERI) questionnaire in a sample of Chinese workers. A community-based survey was conducted in 1,916 subjects aged 30-65 years with paid employment (971 men and 945 women). Acceptable internal consistencies of the three scales, effort, reward and overcommitment, were obtained. Confirmatory factor analysis showed a good model fit of the data with the theoretical structure (goodness-of-fit index = 0.95). Evidence of criterion validity was demonstrated, as all three scales were independently associated with elevated odds ratios of both poor physical and mental health. Based on the findings of our study, this short version of the ERI questionnaire is considered to be a reliable and valid tool for measuring psychosocial work environment in Chinese working populations.
The development and validation of a meta-tool for quality appraisal of public health evidence: Meta Quality Appraisal Tool (MetaQAT).

PubMed

Rosella, L; Bowman, C; Pach, B; Morgan, S; Fitzpatrick, T; Goel, V

2016-07-01

Most quality appraisal tools were developed for clinical medicine and tend to be study-specific with a strong emphasis on risk of bias. In order to be more relevant to public health, an appropriate quality appraisal tool needs to be less reliant on the evidence hierarchy and consider practice applicability. Given the broad range of study designs used in public health, the objective of this study was to develop and validate a meta-tool that combines public health-focused principles of appraisal coupled with a set of design-specific companion tools. Several design methods were used to develop and validate the tool including literature review, synthesis, and validation with a reference standard. A search of critical appraisal tools relevant to public health was conducted; core concepts were collated. The resulting framework was piloted during three feedback sessions with public health practitioners. Following subsequent revisions, the final meta-tool, the Meta Quality Appraisal Tool (MetaQAT), was then validated through a content analysis of appraisals conducted by two groups of experienced public health researchers (MetaQAT vs generic appraisal form). The MetaQAT framework consists of four domains: relevancy, reliability, validity, and applicability. In addition, a companion tool was assembled from existing critical appraisal tools to provide study design-specific guidance on validity appraisal. Content analysis showed similar methodological and generalizability concerns were raised by both groups; however, the MetaQAT appraisers commented more extensively on applicability to public health practice. Critical appraisal tools designed for clinical medicine have limitations for use in the context of public health. The meta-tool structure of the MetaQAT allows for rigorous appraisal, while allowing users to simultaneously appraise the multitude of study designs relevant to public health research and assess non-standard domains, such as applicability. Copyright © 2015

Are validated outcome measures used in distal radial fractures truly valid?

PubMed Central

Nienhuis, R. W.; Bhandari, M.; Goslings, J. C.; Poolman, R. W.; Scholtes, V. A. B.

2016-01-01

Objectives Patient-reported outcome measures (PROMs) are often used to evaluate the outcome of treatment in patients with distal radial fractures. Which PROM to select is often based on assessment of measurement properties, such as validity and reliability. Measurement properties are assessed in clinimetric studies, and results are often reviewed without considering the methodological quality of these studies. Our aim was to systematically review the methodological quality of clinimetric studies that evaluated measurement properties of PROMs used in patients with distal radial fractures, and to make recommendations for the selection of PROMs based on the level of evidence of each individual measurement property. Methods A systematic literature search was performed in PubMed, EMbase, CINAHL and PsycINFO databases to identify relevant clinimetric studies. Two reviewers independently assessed the methodological quality of the studies on measurement properties, using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Level of evidence (strong / moderate / limited / lacking) for each measurement property per PROM was determined by combining the methodological quality and the results of the different clinimetric studies. Results In all, 19 out of 1508 identified unique studies were included, in which 12 PROMs were rated. The Patient-rated wrist evaluation (PRWE) and the Disabilities of Arm, Shoulder and Hand questionnaire (DASH) were evaluated on most measurement properties. The evidence for the PRWE is moderate that its reliability, validity (content and hypothesis testing), and responsiveness are good. The evidence is limited that its internal consistency and cross-cultural validity are good, and its measurement error is acceptable. There is no evidence for its structural and criterion validity. The evidence for the DASH is moderate that its responsiveness is good. The evidence is limited that its reliability and the
The gender identity/gender dysphoria questionnaire for adolescents and adults: further validity evidence.

PubMed

Singh, Devita; Deogracias, Joseph J; Johnson, Laurel L; Bradley, Susan J; Kibblewhite, Sarah J; Owen-Anderson, Allison; Peterson-Badali, Michele; Meyer-Bahlburg, Heino F L; Zucker, Kenneth J

2010-01-01

This study aimed to provide further validity evidence for the dimensional measurement of gender identity and gender dysphoria in both adolescents and adults. Adolescents and adults with gender identity disorder (GID) were compared to clinical control (CC) adolescents and adults on the Gender Identity/Gender Dysphoria Questionnaire for Adolescents and Adults (GIDYQ-AA), a 27-item scale originally developed by Deogracias et al. (2007). In Study 1, adolescents with GID (n = 44) were compared to CC adolescents (n = 98); and in Study 2, adults with GID (n = 41) were compared to CC adults (n = 94). In both studies, clients with GID self-reported significantly more gender dysphoria than did the CCs, with excellent sensitivity and specificity rates. In both studies, degree of self-reported gender dysphoria was significantly correlated with recall of cross-gender behavior in childhood-a test of convergent validity. The research and clinical utility of the GIDYQ-AA is discussed, including directions for further research in distinct clinical populations.
Scale of attitudes toward alcohol - Spanish version: evidences of validity and reliability

PubMed Central

Ramírez, Erika Gisseth León; de Vargas, Divane

2017-01-01

ABSTRACT Objective: validate the Scale of attitudes toward alcohol, alcoholism and individuals with alcohol use disorders in its Spanish version. Method: methodological study, involving 300 Colombian nurses. Adopting the classical theory, confirmatory factor analysis was applied without prior examination, based on the strong historical evidence of the factorial structure of the original scale to determine the construct validity of this Spanish version. To assess the reliability, Cronbach’s Alpha and McDonald’s Omega coefficients were used. Results: the confirmatory factor analysis indicated the good fit of the scale model in a four-factor distribution, with a cut-off point at 3.2, demonstrating 66.7% of sensitivity. Conclusions: the Scale of attitudes toward alcohol, alcoholism and individuals with alcohol use disorders in Spanish presented robust psychometric qualities, affirming that the instrument possesses a solid factorial structure and reliability and is capable of precisely measuring the nurses’ attitudes towards the phenomenon proposed. PMID:28793126
Validity evidence for an OSCE to assess competency in systems-based practice and practice-based learning and improvement: a preliminary investigation.

PubMed

Varkey, Prathibha; Natt, Neena; Lesnick, Timothy; Downing, Steven; Yudkowsky, Rachel

2008-08-01

To determine the psychometric properties and validity of an OSCE to assess the competencies of Practice-Based Learning and Improvement (PBLI) and Systems-Based Practice (SBP) in graduate medical education. An eight-station OSCE was piloted at the end of a three-week Quality Improvement elective for nine preventive medicine and endocrinology fellows at Mayo Clinic. The stations assessed performance in quality measurement, root cause analysis, evidence-based medicine, insurance systems, team collaboration, prescription errors, Nolan's model, and negotiation. Fellows' performance in each of the stations was assessed by three faculty experts using checklists and a five-point global competency scale. A modified Angoff procedure was used to set standards. Evidence for the OSCE's validity, feasibility, and acceptability was gathered. Evidence for content and response process validity was judged as excellent by institutional content experts. Interrater reliability of scores ranged from 0.85 to 1 for most stations. Interstation correlation coefficients ranged from -0.62 to 0.99, reflecting case specificity. Implementation cost was approximately $255 per fellow. All faculty members agreed that the OSCE was realistic and capable of providing accurate assessments. The OSCE provides an opportunity to systematically sample the different subdomains of Quality Improvement. Furthermore, the OSCE provides an opportunity for the demonstration of skills rather than the testing of knowledge alone, thus making it a potentially powerful assessment tool for SBP and PBLI. The study OSCE was well suited to assess SBP and PBLI. The evidence gathered through this study lays the foundation for future validation work.
Implementing assessments of robot-assisted technical skill in urological education: a systematic review and synthesis of the validity evidence.

PubMed

Goldenberg, Mitchell G; Lee, Jason Y; Kwong, Jethro C C; Grantcharov, Teodor P; Costello, Anthony

2018-03-31

To systematically review and synthesise the validity evidence supporting intraoperative and simulation-based assessments of technical skill in urological robot-assisted surgery (RAS), and make evidence-based recommendations for the implementation of these assessments in urological training. A literature search of the Medline, PsycINFO and Embase databases was performed. Articles using technical skill and simulation-based assessments in RAS were abstracted. Only studies involving urology trainees or faculty were included in the final analysis. Multiple tools for the assessment of technical robotic skill have been published, with mixed sources of validity evidence to support their use. These evaluations have been used in both the ex vivo and in vivo settings. Performance evaluations range from global rating scales to psychometrics, and assessments are carried out through automation, expert analysts, and crowdsourcing. There have been rapid expansions in approaches to RAS technical skills assessment, both in simulated and clinical settings. Alternative approaches to assessment in RAS, such as crowdsourcing and psychometrics, remain under investigation. Evidence to support the use of these metrics in high-stakes decisions is likely insufficient at present. © 2018 The Authors BJU International © 2018 BJU International Published by John Wiley & Sons Ltd.
Reliability and Validity Evidence of Multiple Balance Assessments in Athletes With a Concussion

PubMed Central

Murray, Nicholas; Salvatore, Anthony; Powell, Douglas; Reed-Jones, Rebecca

2014-01-01

Context: An estimated 300 000 sport-related concussion injuries occur in the United States annually. Approximately 30% of individuals with concussions experience balance disturbances. Common methods of balance assessment include the Clinical Test of Sensory Interaction and Balance (CTSIB), the Sensory Organization Test (SOT), the Balance Error Scoring System (BESS), and the Romberg test; however, the National Collegiate Athletic Association recommended the Wii Fit as an alternative measure of balance in athletes with a concussion. A central concern regarding the implementation of the Wii Fit is whether it is reliable and valid for measuring balance disturbance in athletes with concussion. Objective: To examine the reliability and validity evidence for the CTSIB, SOT, BESS, Romberg test, and Wii Fit for detecting balance disturbance in athletes with a concussion. Data Sources: Literature considered for review included publications with reliability and validity data for the assessments of balance (CTSIB, SOT, BESS, Romberg test, and Wii Fit) from PubMed, PsycINFO, and CINAHL. Data Extraction: We identified 63 relevant articles for consideration in the review. Of the 63 articles, 28 were considered appropriate for inclusion and 35 were excluded. Data Synthesis: No current reliability or validity information supports the use of the CTSIB, SOT, Romberg test, or Wii Fit for balance assessment in athletes with a concussion. The BESS demonstrated moderate to high reliability (intraclass correlation coefficient = 0.87) and low to moderate validity (sensitivity = 34%, specificity = 87%). However, the Romberg test and Wii Fit have been shown to be reliable tools in the assessment of balance in Parkinson patients. Conclusions: The BESS can evaluate balance problems after a concussion. However, it lacks the ability to detect balance problems after the third day of recovery. Further investigation is needed to establish the use of the CTSIB, SOT, Romberg test, and Wii Fit for
The forensic validity of visual analytics

NASA Astrophysics Data System (ADS)

Erbacher, Robert F.

2008-01-01

The wider use of visualization and visual analytics in wide-ranging fields has led to the need for visual analytics capabilities to be legally admissible, especially when applied to digital forensics. This brings the need to consider legal implications when performing visual analytics, an issue not traditionally examined in visualization and visual analytics techniques and research. While digital data is generally admissible under the Federal Rules of Evidence [10][21], a comprehensive validation of the digital evidence is considered prudent. A comprehensive validation requires validation of the digital data under rules for authentication, hearsay, best evidence rule, and privilege. Additional issues with digital data arise when exploring digital data related to admissibility and the validity of what information was examined, to what extent, and whether the analysis process was sufficiently covered by a search warrant. For instance, a search warrant generally covers very narrow requirements as to what law enforcement is allowed to examine and acquire during an investigation. When searching a hard drive for child pornography, how admissible is evidence of an unrelated crime, e.g., drug dealing? This is further complicated by the concept of "in plain view": what would be considered "in plain view" when analyzing a hard drive? The purpose of this paper is to discuss the issues of digital forensics and the related issues as they apply to visual analytics and identify how visual analytics techniques fit into the digital forensics analysis process, how visual analytics techniques can improve the legal admissibility of digital data, and identify what research is needed to further improve this process. The goal of this paper is to open up consideration of legal ramifications among the visualization community; the author is not a lawyer and the discussions are not meant to be inclusive of all differences in laws between states and
Development of, and initial validity evidence for, the referee self-efficacy scale: a multistudy report.

PubMed

Myers, Nicholas D; Feltz, Deborah L; Guillén, Félix; Dithurbide, Lori

2012-12-01

The purpose of this multistudy report was to develop, and then to provide initial validity evidence for measures derived from, the Referee Self-Efficacy Scale. Data were collected from referees (N = 1609) in the United States (n = 978) and Spain (n = 631). In Study 1 (n = 512), a single-group exploratory structural equation model provided evidence for four factors: game knowledge, decision making, pressure, and communication. In Study 2 (n = 1153), multiple-group confirmatory factor analytic models provided evidence for partial factorial invariance by country, level of competition, team gender, and sport refereed. In Study 3 (n = 456), potential sources of referee self-efficacy information combined to account for a moderate or large amount of variance in each dimension of referee self-efficacy with years of referee experience, highest level refereed, physical/mental preparation, and environmental comfort, each exerting at least two statistically significant direct effects.
Validation of markers with non-additive effects on milk yield and fertility in Holstein and Jersey cows.

PubMed

Aliloo, Hassan; Pryce, Jennie E; González-Recio, Oscar; Cocks, Benjamin G; Hayes, Ben J

2015-07-22

It has been suggested that traits with low heritability, such as fertility, may have proportionately more genetic variation arising from non-additive effects than traits with higher heritability, such as milk yield. Here, we performed a large genome scan with 408,255 single nucleotide polymorphism (SNP) markers to identify chromosomal regions associated with additive, dominance and epistatic (pairwise additive × additive) variability in milk yield and a measure of fertility, calving interval, using records from a population of 7,055 Holstein cows. The results were subsequently validated in an independent set of 3,795 Jerseys. We identified genomic regions with validated additive effects on milk yield on Bos taurus autosomes (BTA) 5, 14 and 20, whereas SNPs with suggestive additive effects on fertility were observed on BTA 5, 9, 11, 18, 22, 27, 29 and the X chromosome. We also confirmed genome regions with suggestive dominance effects for milk yield (BTA 2, 3, 5, 26 and 27) and for fertility (BTA 1, 2, 3, 7, 23, 25 and 28). A number of significant epistatic effects for milk yield on BTA 14 were found across breeds. However on close inspection, these were likely to be associated with the mutation in the diacylglycerol O-acyltransferase 1 (DGAT1) gene, given that the associations were no longer significant when the additive effect of the DGAT1 mutation was included in the epistatic model. In general, we observed a low statistical power (high false discovery rates and small number of significant SNPs) for non-additive genetic effects compared with additive effects for both traits which could be an artefact of higher dependence on linkage disequilibrium between markers and causative mutations or smaller size of non-additive effects relative to additive effects. The results of our study suggest that individual non-additive effects make a small contribution to the genetic variation of milk yield and fertility. Although we found no individual mutation with large dominance
Psychometric instrumentation: reliability and validity of instruments used for clinical practice, evidence-based practice projects and research studies.

PubMed

Mayo, Ann M

2015-01-01

It is important for CNSs and other APNs to consider the reliability and validity of instruments chosen for clinical practice, evidence-based practice projects, or research studies. Psychometric testing uses specific research methods to evaluate the amount of error associated with any particular instrument. Reliability estimates explain more about how well the instrument is designed, whereas validity estimates explain more about scores that are produced by the instrument. An instrument may be architecturally sound overall (reliable), but the same instrument may not be valid. For example, if a specific group does not understand certain well-constructed items, then the instrument does not produce valid scores when used with that group. Many instrument developers may conduct reliability testing only once, yet continue validity testing in different populations over many years. All CNSs should be advocating for the use of reliable instruments that produce valid results. Clinical nurse specialists may find themselves in situations where reliability and validity estimates for some instruments that are being utilized are unknown. In such cases, CNSs should engage key stakeholders to sponsor nursing researchers to pursue this most important work.
Using a matrix-analytical approach to synthesizing evidence solved incompatibility problem in the hierarchy of evidence.

PubMed

Walach, Harald; Loef, Martin

2015-11-01

The hierarchy of evidence presupposes linearity and additivity of effects, as well as commutativity of knowledge structures. It thereby implicitly assumes a classical theoretical model. This is an argumentative article that uses theoretical analysis based on pertinent literature and known facts to examine the standard view of methodology. We show that the assumptions of the hierarchical model are wrong. The knowledge structures gained by various types of studies are not sequentially indifferent, that is, do not commute. External validity and internal validity are at least partially incompatible concepts. Therefore, one needs a different theoretical structure, typical of quantum-type theories, to model this situation. The consequence of this situation is that the implicit assumptions of the hierarchical model are wrong, if generalized to the concept of evidence in total. The problem can be solved by using a matrix-analytical approach to synthesizing evidence. Here, research methods that produce different types of evidence that complement each other are synthesized to yield the full knowledge. We show by an example how this might work. We conclude that the hierarchical model should be complemented by a broader reasoning in methodology. Copyright © 2015 Elsevier Inc. All rights reserved.
Adaptation of the ESPA29 Parental Socialization Styles Scale to the Basque language: evidence of validity.

PubMed

López-Jáuregui, Alicia; Oliden, Paula Elosua

2009-11-01

The aim of this study is to adapt the ESPA29 scale of parental socialization styles in adolescence to the Basque language. The study of its psychometric properties is based on the search for evidence of internal and external validity. The former focuses on the assessment of the dimensionality of the scale by means of exploratory factor analysis. The relationship between the dimensions of parental socialization styles and gender and age guarantees the external validity of the scale. The study of the equivalence of the adapted and original versions is based on comparisons of the reliability coefficients and on factor congruence. The results allow us to conclude that the two scales are equivalent.
Validation of learning assessments: A primer.

PubMed

Peeters, Michael J; Martin, Beth A

2017-09-01

The Accreditation Council for Pharmacy Education's Standards 2016 has placed greater emphasis on validating educational assessments. In this paper, we describe validity, reliability, and validation principles, drawing attention to the conceptual change that highlights one validity with multiple evidence sources; to this end, we recommend abandoning historical (confusing) terminology associated with the term validity. Further, we describe and apply Kane's framework (scoring, generalization, extrapolation, and implications) for the process of validation, with its inferences and conclusions from varied uses of assessment instruments by different colleges and schools of pharmacy. We then offer five practical recommendations that can improve reporting of validation evidence in pharmacy education literature. We describe application of these recommendations, including examples of validation evidence in the context of pharmacy education. After reading this article, the reader should be able to understand the current concept of validation, and use a framework as they validate and communicate their own institution's learning assessments. Copyright © 2017 Elsevier Inc. All rights reserved.
Further Validation of the IDAS: Evidence of Convergent, Discriminant, Criterion, and Incremental Validity

ERIC Educational Resources Information Center

Watson, David; O'Hara, Michael W.; Chmielewski, Michael; McDade-Montez, Elizabeth A.; Koffel, Erin; Naragon, Kristin; Stuart, Scott

2008-01-01

The authors explicated the validity of the Inventory of Depression and Anxiety Symptoms (IDAS; D. Watson et al., 2007) in 2 samples (306 college students and 605 psychiatric patients). The IDAS scales showed strong convergent validity in relation to parallel interview-based scores on the Clinician Rating version of the IDAS; the mean convergent…
Self-Reported Emotion Reactivity Among Early-Adolescent Girls: Evidence for Convergent and Discriminant Validity in an Urban Community Sample.

PubMed

Evans, Spencer C; Blossom, Jennifer B; Canter, Kimberly S; Poppert-Cordts, Katrina; Kanine, Rebecca; Garcia, Andrea; Roberts, Michael C

2016-05-01

Emotion reactivity, measured via the self-report Emotion Reactivity Scale (ERS), has shown unique associations with different forms of psychopathology and suicidal thoughts and behaviors; however, this limited body of research has been conducted among adults and older adolescents of predominantly White/European ethnic backgrounds. The present study investigated the validity of ERS scores for measuring emotion reactivity among an urban community sample of middle-school-age girls. Participants (N = 93, ages 11-15, 76% African-American, 18% Latina) completed the ERS and measures of emotion coping, internalizing problems, proactive and reactive aggression, negative life events, and lifetime suicidal ideation and substance use. As hypothesized, ERS scores were significantly associated with internalizing problems, poor emotion coping, negative life events, reactive aggression, and suicidal ideation (evidence for convergent validity), but showed little to no association with proactive aggression or lifetime substance use (evidence for discriminant validity). A series of logistic regressions were conducted to further explore the associations among internalizing problems, emotion reactivity, and suicidal ideation. With depressive symptoms included in the model, emotion reactivity was no longer uniquely predictive of lifetime suicidal ideation, nor did it serve as a moderator of other associations. In conjunction with previous research, these findings offer further support for the construct validity and research utility of the ERS as a self-report measure of emotion reactivity in adolescents. Copyright © 2016. Published by Elsevier Ltd.
Evidences of Validity of a Scale for Mapping Professional as Defining Competences and Performance by Brazilian Tutors

ERIC Educational Resources Information Center

Coelho, Francisco Antonio, Jr.; Ferreira, Rodrigo Rezende; Paschoal, Tatiane; Faiad, Cristiane; Meneses, Paulo Murce

2015-01-01

The purpose of this study was twofold: to assess evidence of construct validity of the Brazilian Scale of Tutors Competences in the field of Open and Distance Learning and to examine if variables such as professional experience, perception of the student's learning performance and prior experience influence the development of technical and…
Direct Observation of Clinical Skills Feedback Scale: Development and Validity Evidence.

PubMed

Halman, Samantha; Dudek, Nancy; Wood, Timothy; Pugh, Debra; Touchie, Claire; McAleer, Sean; Humphrey-Murto, Susan

2016-01-01

Construct: This article describes the development and validity evidence behind a new rating scale to assess feedback quality in the clinical workplace. Competency-based medical education has mandated a shift to learner-centeredness, authentic observation, and frequent formative assessments with a focus on the delivery of effective feedback. Because feedback has been shown to be of variable quality and effectiveness, an assessment of feedback quality in the workplace is important to ensure we are providing trainees with optimal learning opportunities. The purposes of this project were to develop a rating scale for the quality of verbal feedback in the workplace (the Direct Observation of Clinical Skills Feedback Scale [DOCS-FBS]) and to gather validity evidence for its use. Two panels of experts (local and national) took part in a nominal group technique to identify features of high-quality feedback. Through multiple iterations and review, 9 features were developed into the DOCS-FBS. Four rater types (residents n = 21, medical students n = 8, faculty n = 12, and educators n = 12) used the DOCS-FBS to rate videotaped feedback encounters of variable quality. The psychometric properties of the scale were determined using a generalizability analysis. Participants also completed a survey to gather data on a 5-point Likert scale to inform the ease of use, clarity, knowledge acquisition, and acceptability of the scale. Mean video ratings ranged from 1.38 to 2.96 out of 3 and followed the intended pattern suggesting that the tool allowed raters to distinguish between examples of higher and lower quality feedback. There were no significant differences between rater type (range = 2.36-2.49), suggesting that all groups of raters used the tool in the same way. The generalizability coefficients for the scale ranged from 0.97 to 0.99. Item-total correlations were all above 0.80, suggesting some redundancy in items. Participants found the scale easy to use (M = 4.31/5) and clear
The Mastery Rubric for Evidence-Based Medicine: Institutional Validation via Multidimensional Scaling.

PubMed

Tractenberg, Rochelle E; Gushta, Matthew M; Weinfeld, Jeffrey M

2016-01-01

CONSTRUCT: In this study we describe a multidimensional scaling (MDS) exercise to validate the curricular elements composing a new Mastery Rubric (MR) for a curriculum in evidence-based medicine (EBM). This MR-EBM comprises 10 elements of knowledge, skills, and abilities (KSAs) representing our institutional learning goals of career-spanning engagement with EBM. An MR also includes developmental trajectories for each KSA, beginning with medical school coursework, including residency training, and outlining the qualifications of individuals to teach and mentor in EBM. The development was not part of the validation effort, as our curriculum is focused at a single stage (undergraduate medical students). An MR comprises the desired KSAs for an entire curriculum, together with descriptions of a learner's performance and/or capabilities as they develop from novice to proficiency of the curricular target(s). The MR construct is intended to support curriculum development or refinement by capturing the KSAs that support the articulation of concrete learning goals; it also promotes assessment that demonstrates development in the target KSAs and encourages reflection and self-directed learning throughout the learner's career. Two other MRs have been published, and this is the first one specific to teaching and learning in medicine; this is also the first one created specifically to evaluate an existing curriculum. To validate the dispersion of the elements of the EBM curriculum, the nine clinical instructors in the EBM two-course curriculum completed an MDS exercise, rating the similarities of the 10 curricular elements. MDS is a mathematical approach to understanding relationships among concepts/objects when these relationships are difficult to quantify. Eliciting similarity ratings biased the responses toward the null hypothesis (that the elements are not different). MDS results suggested that the MR represents 10 different, although related, facets of the construct
Oligosaccharides in infant formula: more evidence to validate the role of prebiotics.

PubMed

Vandenplas, Yvan; Zakharova, Irina; Dmitrieva, Yulia

2015-05-14

The gastrointestinal (GI) microbiota differs between breast-fed and classic infant formula-fed infants. Breast milk is rich in prebiotic oligosaccharides (OS) and may also contain some probiotics, but scientific societies do not recommend the addition of prebiotic OS or probiotics to standard infant formula. Nevertheless, many infant formula companies add one or the other or both. Different types of prebiotic OS are used in infant formula, including galacto-oligosaccharide, fructo-oligosaccharide, polydextrose and mixtures of these OS, but none adds human milk OS. There is evidence that the addition of prebiotics to infant formula brings the GI microbiota of formula-fed infants closer to that of breast-fed infants. Prebiotics change gut metabolic activity (by decreasing stool pH and increasing SCFA), have a bifidogenic effect and bring stool consistency and defecation frequency closer to those of breast-fed infants. Although there is only limited evidence that these changes in GI microbiota induce a significant clinical benefit for the immune system, interesting positive trends have been observed in some markers. Additionally, adverse effects are extremely rare. Prebiotics are added to infant formula because breast milk contains human milk OS. Because most studies suggest a trend of beneficial effects and because these ingredients are very safe, prebiotics bring infant formula one step closer to the gold standard of breast milk.
20 CFR 10.116 - What additional evidence is needed in cases based on occupational disease?

Code of Federal Regulations, 2012 CFR

2012-04-01

... based on occupational disease? 10.116 Section 10.116 Employees' Benefits OFFICE OF WORKERS' COMPENSATION... of Proof § 10.116 What additional evidence is needed in cases based on occupational disease? (a) The... particular occupational diseases. The medical report should also include the information specified on the...

20 CFR 10.116 - What additional evidence is needed in cases based on occupational disease?

Code of Federal Regulations, 2013 CFR

2013-04-01

... based on occupational disease? 10.116 Section 10.116 Employees' Benefits OFFICE OF WORKERS' COMPENSATION... of Proof § 10.116 What additional evidence is needed in cases based on occupational disease? (a) The... particular occupational diseases. The medical report should also include the information specified on the...
20 CFR 10.116 - What additional evidence is needed in cases based on occupational disease?

Code of Federal Regulations, 2014 CFR

2014-04-01

... based on occupational disease? 10.116 Section 10.116 Employees' Benefits OFFICE OF WORKERS' COMPENSATION... of Proof § 10.116 What additional evidence is needed in cases based on occupational disease? (a) The... particular occupational diseases. The medical report should also include the information specified on the...
Factor Analysis Methods and Validity Evidence: A Systematic Review of Instrument Development across the Continuum of Medical Education

ERIC Educational Resources Information Center

Wetzel, Angela Payne

2011-01-01

Previous systematic reviews indicate a lack of reporting of reliability and validity evidence in subsets of the medical education literature. Psychology and general education reviews of factor analysis also indicate gaps between current and best practices; yet, a comprehensive review of exploratory factor analysis in instrument development across…
20 CFR 10.116 - What additional evidence is needed in cases based on occupational disease?

Code of Federal Regulations, 2011 CFR

2011-04-01

... based on occupational disease? 10.116 Section 10.116 Employees' Benefits OFFICE OF WORKERS' COMPENSATION... of Proof § 10.116 What additional evidence is needed in cases based on occupational disease? (a) The... occupational diseases. The medical report should also include the information specified on the checklist for...
20 CFR 10.116 - What additional evidence is needed in cases based on occupational disease?

Code of Federal Regulations, 2010 CFR

2010-04-01

... based on occupational disease? 10.116 Section 10.116 Employees' Benefits OFFICE OF WORKERS' COMPENSATION... of Proof § 10.116 What additional evidence is needed in cases based on occupational disease? (a) The... occupational diseases. The medical report should also include the information specified on the checklist for...
Mutations in RIT1 cause Noonan syndrome - additional functional evidence and expanding the clinical phenotype.

PubMed

Koenighofer, M; Hung, C Y; McCauley, J L; Dallman, J; Back, E J; Mihalek, I; Gripp, K W; Sol-Church, K; Rusconi, P; Zhang, Z; Shi, G-X; Andres, D A; Bodamer, O A

2016-03-01

RASopathies are a clinically heterogeneous group of conditions caused by mutations in 1 of 16 proteins in the RAS-mitogen activated protein kinase (RAS-MAPK) pathway. Recently, mutations in RIT1 were identified as a novel cause for Noonan syndrome. Here we provide additional functional evidence for a causal role of RIT1 mutations and expand the associated phenotypic spectrum. We identified two de novo missense variants p.Met90Ile and p.Ala57Gly. Both variants resulted in increased MEK-ERK signaling compared to wild-type, underscoring gain-of-function as the primary functional mechanism. Introduction of p.Met90Ile and p.Ala57Gly into zebrafish embryos reproduced not only aspects of the human phenotype but also revealed abnormalities of eye development, emphasizing the importance of RIT1 for spatial and temporal organization of the growing organism. In addition, we observed severe lymphedema of the lower extremity and genitalia in one patient. We provide additional evidence for a causal relationship between pathogenic mutations in RIT1, increased RAS-MAPK/MEK-ERK signaling and the clinical phenotype. The mutant RIT1 protein may possess reduced GTPase activity or a diminished ability to interact with cellular GTPase activating proteins; however the precise mechanism remains unknown. The phenotypic spectrum is likely to expand and includes lymphedema of the lower extremities in addition to nuchal hygroma. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Continued Validation of the O-SCORE (Ottawa Surgical Competency Operating Room Evaluation): Use in the Simulated Environment.

PubMed

MacEwan, Matthew J; Dudek, Nancy L; Wood, Timothy J; Gofton, Wade T

2016-01-01

CONSTRUCT: The Ottawa Surgical Competency Operating Room Evaluation (O-SCORE) is a 9-item surgical evaluation tool designed to assess technical competence in surgical trainees using behavioral anchors. The initial development of the O-SCORE produced evidence for valid results. Further work is required to determine if the use of a single surgeon or an unblinded rater introduces bias. In addition, the relationship of the O-SCORE to other currently used technical assessment tools should be explored to provide validity evidence related to the relationship to other measures. We have designed this project to provide continued validity evidence for the O-SCORE related to these two issues. Nineteen residents and 2 staff Orthopedic Surgeons from the University of Ottawa volunteered to participate in a 2-part OSCE style station. Participants completed a written questionnaire followed by a videotaped 10-minute simulated open reduction and internal fixation of a midshaft radius fracture. Videos were rated individually by 2 blinded staff orthopedic surgeons using an Objective Structured Assessment of Technical Skills (OSATS) global rating scale, an OSATS checklist, and the O-SCORE in random order. O-SCORE results appeared sensitive to surgical training level even when raters were blinded. In addition, strong agreement between two independent observers using the O-SCORE suggests that the measure captures a performance easily recognized by surgical observers. Ratings on the O-SCORE also were strongly associated with global ratings on the currently most validated technical evaluation tool (OSATS). Collectively, these results suggest that the O-SCORE generates accurate, reproducible, and meaningful results when used in a randomized and blinded fashion, providing continued validity evidence for using this tool to evaluate surgical trainee competence. The O-SCORE was able to differentiate surgical trainee level using blinded raters providing further evidence of validity for the O
Validation of educational assessments: a primer for simulation and beyond.

PubMed

Cook, David A; Hatala, Rose

2016-01-01

Simulation plays a vital role in health professions assessment. This review provides a primer on assessment validation for educators and education researchers. We focus on simulation-based assessment of health professionals, but the principles apply broadly to other assessment approaches and topics. Validation refers to the process of collecting validity evidence to evaluate the appropriateness of the interpretations, uses, and decisions based on assessment results. Contemporary frameworks view validity as a hypothesis, and validity evidence is collected to support or refute the validity hypothesis (i.e., that the proposed interpretations and decisions are defensible). In validation, the educator or researcher defines the proposed interpretations and decisions, identifies and prioritizes the most questionable assumptions in making these interpretations and decisions (the "interpretation-use argument"), empirically tests those assumptions using existing or newly-collected evidence, and then summarizes the evidence as a coherent "validity argument." A framework proposed by Messick identifies potential evidence sources: content, response process, internal structure, relationships with other variables, and consequences. Another framework proposed by Kane identifies key inferences in generating useful interpretations: scoring, generalization, extrapolation, and implications/decision. We propose an eight-step approach to validation that applies to either framework: Define the construct and proposed interpretation, make explicit the intended decision(s), define the interpretation-use argument and prioritize needed validity evidence, identify candidate instruments and/or create/adapt a new instrument, appraise existing evidence and collect new evidence as needed, keep track of practical issues, formulate the validity argument, and make a judgment: does the evidence support the intended use? Rigorous validation first prioritizes and then empirically evaluates key
Validation of the FEA of a deep drawing process with additional force transmission

NASA Astrophysics Data System (ADS)

Behrens, B.-A.; Bouguecha, A.; Bonk, C.; Grbic, N.; Vucetic, M.

2017-10-01

To meet automotive industry requirements such as lower CO2 emissions, which translate into reduced vehicle mass in the car body, the chassis and the powertrain, continuous innovation and further development of existing production processes are required. In sheet metal forming processes, the process limits and component characteristics are defined by the process-specific loads. When the load limits are exceeded, the material fails; this failure can be avoided by additional force transmission activated in the deep drawing process before the process limit is reached. This contribution deals with experimental investigations of a forming process with additional force transmission regarding the extension of the process limits. Based on FEA, a tool system was designed and developed by IFUM. For this purpose, the steel material HCT600 is analyzed numerically. Within the experimental investigations, deep drawing processes with and without the additional force transmission are carried out, and the produced rectangular cups are compared. Subsequently, the identical deep drawing processes are investigated numerically. The values of the punch reaction force and displacement are estimated and compared with experimental results. Thus, the material model is successfully validated on the process scale. For further quantitative verification of the FEA results, the experimentally determined geometry of the rectangular cup is measured optically with the ATOS system from GOM mbH and digitally compared using the external software Geomagic® Qualify™. The goal of this paper is to verify the transferability of the FEA model for a conventional deep drawing process to a deep drawing process with additional force transmission with a counter punch.
Montreal Battery of Evaluation of Amusia: Validity evidence and norms for adolescents in Belo Horizonte, Minas Gerais, Brazil

PubMed Central

Nunes-Silva, Marília; Haase, Vitor Geraldi

2012-01-01

The Montreal Battery of Evaluation of Amusia (MBEA) is a battery of tests that assesses six music processing components: scale, contour, interval, rhythm, metric, and music memory. The present study sought to verify the psychometric characteristics of the MBEA in a sample of 150 adolescents aged 14-18 years in the city of Belo Horizonte, Minas Gerais, Brazil, and to develop specific norms for this population. We used statistical procedures that explored the dimensional structure of the MBEA and its items, evaluating their adequacy from empirical data, verifying their reliability, and providing evidence of validity. The results for the difficulty levels for each test indicated a trend toward higher scores, corroborating previous studies. From the analysis of the criterion groups, almost all of the items were considered discriminatory. The global score of the MBEA was shown to be valid and reliable (KR-20 reliability coefficient = 0.896) for assessing the musical ability of normal teenagers. Based on the analysis of the items, we proposed a short version of the MBEA. Further studies with larger samples and amusic individuals are necessary to provide evidence of the validity of the MBEA in the Brazilian milieu. The present study brings to the Brazilian context a tool for diagnosing deficits in musical skills and will serve as a basis for comparisons with single case studies and studies of populations with specific neuropsychological syndromes. PMID:29213804
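The reliability value reported in the record above is a Kuder-Richardson 20 (KR-20) coefficient. As a rough illustration only, the Python sketch below computes KR-20 for a tiny made-up matrix of right/wrong item scores. The function name `kr20`, the toy data, and the choice of population variance for the total scores are assumptions made for this example; none of it comes from the study.

```python
from statistics import pvariance


def kr20(scores):
    """KR-20 for dichotomous items.

    scores: list of rows, one per examinee, each a list of 0/1 item scores.
    Assumes at least two items and non-zero variance in the total scores.
    """
    n_people = len(scores)
    n_items = len(scores[0])
    # Proportion of examinees answering each item correctly.
    p = [sum(row[i] for row in scores) / n_people for i in range(n_items)]
    # Sum of item variances p * (1 - p).
    pq_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of the examinees' total scores (an assumed convention).
    total_var = pvariance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical data: 4 examinees, 3 items.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(toy_scores), 3))  # prints 0.75
```

Some texts use the sample variance (dividing by n - 1) for the total scores instead, which gives a slightly different value on small samples.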
Validity evidence for the Hamburg multiple mini-interview.

PubMed

Knorr, Mirjana; Schwibbe, Anja; Ehrhardt, Maren; Lackamp, Janina; Zimmermann, Stefan; Hampe, Wolfgang

2018-05-14

Multiple mini-interviews (MMI) have become increasingly popular for the selection of medical students. In this work, we examine the validity evidence for the Hamburg MMI. We conducted three follow-up studies for the 2014 cohort of applicants to medical school over the course of two years. We calculated Spearman's rank correlation (ρ) between MMI results and (1) emotional intelligence measured by the Trait Emotional Intelligence Questionnaire (TEIQue-SF) and the Situational Test of Emotion Management (STEM), (2) supervisors' and practice team members' evaluations of psychosocial competencies and suitability for the medical profession after one week of 1:1 teaching in a general practice (GP) and (3) objective structured clinical examination (OSCE) scores. There were no significant correlations between MMI results and the TEIQue-SF (ρ = .07, p > .05) or the STEM (ρ = .05, p > .05). MMI results significantly predicted GP evaluations of psychosocial competencies (ρ = .32, p < .05) and suitability for the medical profession (ρ = .42, p < .01) as well as OSCE scores (ρ = .23, p < .05). The MMI remained a significant predictor of these outcomes in a robust regression model including gender and age as control variables. Our findings suggest that MMIs can measure competencies that are relevant in a practical context. However, these competencies do not seem to be related to emotional intelligence as measured by self-report or situational judgement test.
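Spearman's rank correlation, the statistic reported throughout this record, is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below shows that computation in plain Python; the helper names and the example scores are hypothetical and do not reproduce the Hamburg data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical interview and examination scores for five applicants.
mmi_scores = [1, 2, 3, 4, 5]
osce_scores = [2, 1, 4, 3, 5]
print(spearman_rho(mmi_scores, osce_scores))
```

Because only ranks enter the calculation, the coefficient is insensitive to monotone rescaling of either score, which is one common reason for preferring it with rating data.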
Evidence-Based Toxicology.

PubMed

Hoffmann, Sebastian; Hartung, Thomas; Stephens, Martin

Evidence-based toxicology (EBT) was introduced independently by two groups in 2005, in the context of toxicological risk assessment and causation as well as based on parallels between the evaluation of test methods in toxicology and evidence-based assessment of diagnostic tests in medicine. The role model of evidence-based medicine (EBM) motivated both proposals and guided the evolution of EBT, with systematic reviews and evidence quality assessment in particular attracting considerable attention in toxicology. Regarding test assessment, in the search for solutions to various problems related to validation, such as the imperfectness of the reference standard or the challenge of comprehensively evaluating tests, the field of Diagnostic Test Assessment (DTA) was identified as a potential resource. DTA being an EBM discipline, test method assessment/validation therefore became one of the main drivers spurring the development of EBT. In the context of pathway-based toxicology, EBT approaches, given their objectivity, transparency and consistency, have been proposed for carrying out a (retrospective) mechanistic validation. In summary, implementation of more evidence-based approaches may provide the tools necessary to adapt the assessment/validation of toxicological test methods and testing strategies to face the challenges of toxicology in the twenty-first century.
Is there any evidence for the validity of diagnostic criteria used for accommodative and nonstrabismic binocular dysfunctions?

PubMed Central

Cacho-Martínez, Pilar; García-Muñoz, Ángel; Ruiz-Cantero, María Teresa

2013-01-01

Purpose: To analyze the diagnostic criteria used in the scientific literature published in the past 25 years for accommodative and nonstrabismic binocular dysfunctions and to explore if the epidemiological analysis of diagnostic validity has been used to propose which clinical criteria should be used for diagnostic purposes. Methods: We carried out a systematic review of papers on accommodative and nonstrabismic binocular disorders published from 1986 to 2012, analysing the MEDLINE, CINAHL, PsycINFO and FRANCIS databases. We admitted original articles about diagnosis of these anomalies in any population. We identified 839 articles, of which 12 studies were included. The quality of included articles was assessed using the QUADAS-2 tool. Results: The review shows a wide range of clinical signs and cut-off points between authors. Only 3 studies (regarding accommodative anomalies) assessed diagnostic accuracy of clinical signs. Their results suggest using the accommodative amplitude and monocular accommodative facility for diagnosing accommodative insufficiency and a high positive relative accommodation for accommodative excess. The remaining 9 articles did not analyze diagnostic accuracy, assessing a diagnosis with the criteria the authors considered. We also found differences between studies in the way of considering patients’ symptomatology. Three of the 12 studies analyzed validated a symptom survey used for convergence insufficiency. Conclusions: Scientific literature reveals differences between authors according to diagnostic criteria for accommodative and nonstrabismic binocular dysfunctions. Diagnostic accuracy studies show that there is only certain evidence for accommodative conditions. For binocular anomalies there is only evidence about a validated questionnaire for convergence insufficiency with no data of diagnostic accuracy. PMID:24646897
A Practical Measure of Student Motivation: Establishing Validity Evidence for the Expectancy-Value-Cost Scale in Middle School

ERIC Educational Resources Information Center

Kosovich, Jeff J.; Hulleman, Chris S.; Barron, Kenneth E.; Getty, Steve

2015-01-01

We present validity evidence for the Expectancy-Value-Cost (EVC) Scale of student motivation. Using a brief, 10-item scale, we measured middle school students' expectancy, value, and cost for their math and science classes in the Fall and Winter of the same academic year. Confirmatory factor analyses supported the three-factor structure of the EVC…
Evidence for dose-additive effects of pyrethroids on motor activity in rats.

PubMed

Wolansky, Marcelo J; Gennings, Chris; DeVito, Michael J; Crofton, Kevin M

2009-10-01

Pyrethroids are neurotoxic insecticides used in a variety of indoor and outdoor applications. Previous research characterized the acute dose-effect functions for 11 pyrethroids administered orally in corn oil (1 mL/kg) based on assessment of motor activity. We used a mixture of these 11 pyrethroids and the same testing paradigm used in single-compound assays to test the hypothesis that cumulative neurotoxic effects of pyrethroid mixtures can be predicted using the default dose-addition theory. Mixing ratios of the 11 pyrethroids in the tested mixture were based on the ED30 (effective dose that produces a 30% decrease in response) of the individual chemical (i.e., the mixture comprised equipotent amounts of each pyrethroid). The highest concentration of each individual chemical in the mixture was less than the threshold for inducing behavioral effects. Adult male rats received acute oral exposure to corn oil (control) or dilutions of the stock mixture solution. The mixture of 11 pyrethroids was administered either simultaneously (2 hr before testing) or after a sequence based on times of peak effect for the individual chemicals (4, 2, and 1 hr before testing). A threshold additivity model was fit to the single-chemical data to predict the theoretical dose-effect relationship for the mixture under the assumption of dose additivity. When subthreshold doses of individual chemicals were combined in the mixtures, we found significant dose-related decreases in motor activity. Further, we found no departure from the predicted dose-additive curve regardless of the mixture dosing protocol used. In this article we present the first in vivo evidence on pyrethroid cumulative effects supporting the default assumption of dose addition.
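The equipotent mixing rule and the dose-addition prediction described in this record can be illustrated with the textbook fixed-ratio form of dose addition, in which the mixture dose producing a given effect is the reciprocal of the sum of each component's fraction divided by its own effective dose. This is a simplified sketch and not the threshold additivity model the authors fitted; the compound names and ED30 values are hypothetical.

```python
def equipotent_fractions(ed30):
    """Mixing fractions proportional to each chemical's ED30.

    With these fractions every component contributes the same share of
    its own ED30 at any total mixture dose.
    """
    total = sum(ed30.values())
    return {name: value / total for name, value in ed30.items()}


def dose_additive_ed(ed, fractions):
    """Mixture dose predicted to give the same effect level under dose addition.

    Solves sum_i (fraction_i * D / ED_i) = 1 for the total mixture dose D.
    """
    return 1.0 / sum(fractions[name] / ed[name] for name in ed)


# Hypothetical ED30 values (mg/kg) for two illustrative compounds.
ed30 = {"compound_a": 2.0, "compound_b": 8.0}
fractions = equipotent_fractions(ed30)
mixture_ed30 = dose_additive_ed(ed30, fractions)

# At the predicted mixture ED30, each component is present at an equal
# fraction of its individual ED30 (one half each in this two-compound case).
for name in ed30:
    share = fractions[name] * mixture_ed30 / ed30[name]
    print(name, fractions[name], share)
print(mixture_ed30)
```

Under this rule each component can sit below its own effect threshold while the mixture as a whole still reaches the target effect, which is the pattern the abstract reports for subthreshold doses.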
Validation Evidence of the Motivation for Teaching Scale in Secondary Education.

PubMed

Abós, Ángel; Sevil, Javier; Martín-Albo, José; Aibar, Alberto; García-González, Luis

2018-04-10

Grounded in self-determination theory, the aim of this study was to develop a scale with adequate psychometric properties to assess motivation for teaching and to explain some outcomes of secondary education teachers at work. The sample comprised 584 secondary education teachers. Analyses supported the five-factor model (intrinsic motivation, identified regulation, introjected regulation, external regulation and amotivation) and indicated the presence of a continuum of self-determination. Evidence of reliability was provided by Cronbach's alpha, composite reliability and average variance extracted. Multigroup confirmatory factor analyses supported the partial invariance (configural and metric) of the scale in different sub-samples, in terms of gender and type of school. Concurrent validity was analyzed by structural equation modeling that explained 71% of the work dedication variance and 69% of the boredom at work variance. Work dedication was positively predicted by intrinsic motivation (β = .56, p < .001) and external regulation (β = .29, p < .001) and negatively predicted by introjected regulation (β = -.22, p < .001) and amotivation (β = -.49, p < .001). Boredom at work was negatively predicted by intrinsic motivation (β = -.28, p < .005) and positively predicted by amotivation (β = .68, p < .001). The Motivation for Teaching Scale in Secondary Education (Spanish acronym EME-ES, Escala de Motivación por la Enseñanza en Educación Secundaria) is discussed as a valid and reliable instrument. This is the first specific scale in the work context of secondary teachers that has integrated the five-factor structure together with their dedication and boredom at work.
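Composite reliability and average variance extracted, two of the reliability indices named in this record, are commonly computed from the standardized factor loadings of a confirmatory factor analysis. The sketch below uses the standard formulas under the assumption of standardized loadings with uncorrelated errors; the function names and the loadings are hypothetical, since the abstract reports no loadings.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized loadings of one factor.

    CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where each error variance is 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical standardized loadings for a three-item factor.
loadings = [0.8, 0.8, 0.8]
print(composite_reliability(loadings))
print(average_variance_extracted(loadings))
```

Widely quoted rules of thumb treat composite reliability of about .70 or higher and average variance extracted of about .50 or higher as acceptable, although such cut-offs are conventions rather than tests.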
Sensitivity to food additives, vaso-active amines and salicylates: a review of the evidence.

PubMed

Skypala, Isabel J; Williams, M; Reeves, L; Meyer, R; Venter, C

2015-01-01

Although there is considerable literature pertaining to IgE and non IgE-mediated food allergy, there is a paucity of information on non-immune mediated reactions to foods, other than metabolic disorders such as lactose intolerance. Food additives and naturally occurring 'food chemicals' have long been reported as having the potential to provoke symptoms in those who are more sensitive to their effects. Diets low in 'food chemicals' gained prominence in the 1970s and 1980s, and their popularity remains, although the evidence of their efficacy is very limited. This review focuses on the available evidence for the role and likely adverse effects of both added and natural 'food chemicals' including benzoate, sulphite, monosodium glutamate, vaso-active or biogenic amines and salicylate. Studies assessing the efficacy of the restriction of these substances in the diet have mainly been undertaken in adults, but the paper will also touch on the use of such diets in children. The difficulty of reviewing the available evidence is that few of the studies have been controlled and, for many, considerable time has elapsed since their publication. Meanwhile dietary patterns and habits have changed hugely in the interim, so the conclusions may not be relevant for our current dietary norms. The conclusion of the review is that there may be some benefit in the removal of an additive or a group of foods high in natural food chemicals from the diet for a limited period for certain individuals, providing the diagnostic pathway is followed and the foods are reintroduced back into the diet to assess for the efficacy of removal. However diets involving the removal of multiple additives and food chemicals have the very great potential to lead to nutritional deficiency especially in the paediatric population. Any dietary intervention, whether for the purposes of diagnosis or management of food allergy or food intolerance, should be adapted to the individual's dietary habits and a suitably
Evidence for Dose-Additive Effects of Pyrethroids on Motor Activity in Rats

PubMed Central

Wolansky, Marcelo J.; Gennings, Chris; DeVito, Michael J.; Crofton, Kevin M.

2009-01-01

Background: Pyrethroids are neurotoxic insecticides used in a variety of indoor and outdoor applications. Previous research characterized the acute dose–effect functions for 11 pyrethroids administered orally in corn oil (1 mL/kg) based on assessment of motor activity. Objectives: We used a mixture of these 11 pyrethroids and the same testing paradigm used in single-compound assays to test the hypothesis that cumulative neurotoxic effects of pyrethroid mixtures can be predicted using the default dose–addition theory. Methods: Mixing ratios of the 11 pyrethroids in the tested mixture were based on the ED30 (effective dose that produces a 30% decrease in response) of the individual chemical (i.e., the mixture comprised equipotent amounts of each pyrethroid). The highest concentration of each individual chemical in the mixture was less than the threshold for inducing behavioral effects. Adult male rats received acute oral exposure to corn oil (control) or dilutions of the stock mixture solution. The mixture of 11 pyrethroids was administered either simultaneously (2 hr before testing) or after a sequence based on times of peak effect for the individual chemicals (4, 2, and 1 hr before testing). A threshold additivity model was fit to the single-chemical data to predict the theoretical dose–effect relationship for the mixture under the assumption of dose additivity. Results: When subthreshold doses of individual chemicals were combined in the mixtures, we found significant dose-related decreases in motor activity. Further, we found no departure from the predicted dose-additive curve regardless of the mixture dosing protocol used. Conclusion: In this article we present the first in vivo evidence on pyrethroid cumulative effects supporting the default assumption of dose addition. PMID:20019907
Moderators and Mediators in Social Work Research: Toward a More Ecologically Valid Evidence Base for Practice

PubMed Central

Magill, Molly

2012-01-01

Summary: Evidence-based practice involves the consistent and critical consumption of the social work research literature. As methodologies advance, primers to guide such efforts are often needed. In the present work, common statistical methods for testing moderation and mediation are identified and summarized, and corresponding examples, drawn from the substance abuse, domestic violence, and mental health literature, are provided. Findings: While methodologically complex, analyses of these third variable effects can provide an optimal fit for the complexity involved in the provision of evidence-based social work services. While a moderator may identify the trait or state requirement for a causal relationship to occur, a mediator is concerned with the transmission of that relationship. In social work practice, these are questions of “under what conditions and for whom?” and of the “how?” of behavior change. Implications: Implications include a need for greater attention to these methods among practitioners and evaluation researchers. With knowledge gained through the present review, social workers can benefit from a more ecologically valid evidence base for practice. PMID:22833701
Use of an evidence-based algorithm for patients with traumatic hemothorax reduces need for additional interventions.

PubMed

Dennis, Bradley M; Gondek, Stephen P; Guyer, Richard A; Hamblin, Susan E; Gunter, Oliver L; Guillamondegui, Oscar D

2017-04-01

Concerted management of the traumatic hemothorax is ill-defined. Surgical management of specific hemothoraces may be beneficial. A comprehensive strategy to delineate appropriate patients for additional procedures does not exist. We developed an evidence-based algorithm for hemothorax management. We hypothesize that the use of this algorithm will decrease additional interventions. A pre-/post-study was performed on all patients admitted to our trauma service with traumatic hemothorax from August 2010 to September 2013. An evidence-based management algorithm was initiated for the management of retained hemothoraces. Patients with length of stay (LOS) less than 24 hours or admitted during an implementation phase were excluded. Study data included age, Injury Severity Score, Abbreviated Injury Scale chest, mechanism of injury, ventilator days, intensive care unit (ICU) LOS, total hospital LOS, and interventions required. Our primary outcome was number of patients requiring more than 1 intervention. Secondary outcomes were empyema rate, number of patients requiring specific additional interventions, 28-day ventilator-free days, 28-day ICU-free days, hospital LOS, and all-cause 6-month readmission rate. Standard statistical analysis was performed for all data. Six hundred forty-two patients (326 pre and 316 post) met the study criteria. There were no demographic differences between the groups. The number of patients requiring more than 1 intervention was significantly reduced (49 pre vs. 28 post, p = 0.02). Number of patients requiring VATS decreased (27 pre vs. 10 post, p < 0.01). Number of catheters placed by interventional radiology increased (2 pre vs. 10 post, p = 0.02). Intrapleural thrombolytic use, open thoracotomy, empyema, and 6-month readmission rates were unchanged. The "post" group had more ventilator-free days (median, 23.9 vs. 22.5, p = 0.04), but ICU and hospital LOS were unchanged. Using an evidence-based hemothorax algorithm reduced the number of patients

Further Validation of the Multidimensional Fatigue Symptom Inventory-Short Form

PubMed Central

Stein, Kevin D.; Jacobsen, Paul B.; Blanchard, Chris M.; Thors, Christina

2008-01-01

A growing body of evidence is documenting the multidimensional nature of cancer-related fatigue. Although several multidimensional measures of fatigue have been developed, further validation of these scales is needed. To this end, the current study sought to evaluate the factorial and construct validity of the 30-item Multidimensional Fatigue Symptom Inventory-Short Form (MFSI-SF). A heterogeneous sample of 304 cancer patients (mean age 55 years) completed the MFSI-SF, along with several other measures of psychosocial functioning including the MOS-SF-36 and Fatigue Symptom Inventory, following the fourth cycle of chemotherapy treatment. The results of a confirmatory factor analysis indicated the 5-factor model provided a good fit to the data as evidenced by commonly used goodness of fit indices (CFI 0.90 and IFI 0.90). Additional evidence for the validity of the MFSI-SF was provided via correlations with other relevant instruments (range −0.21 to 0.82). In sum, the current study provides support for the MFSI-SF as a valuable tool for the multidimensional assessment of cancer-related fatigue. PMID:14711465
Initial construct validity evidence of a virtual human application for competency assessment in breaking bad news to a cancer patient.

PubMed

Guetterman, Timothy C; Kron, Frederick W; Campbell, Toby C; Scerbo, Mark W; Zelenski, Amy B; Cleary, James F; Fetters, Michael D

2017-01-01

Despite interest in using virtual humans (VHs) for assessing health care communication, evidence of validity is limited. We evaluated the validity of a VH application, MPathic-VR, for assessing performance-based competence in breaking bad news (BBN) to a VH patient. We used a two-group quasi-experimental design, with residents participating in a 3-hour seminar on BBN. Group A (n=15) completed the VH simulation before and after the seminar, and Group B (n=12) completed the VH simulation only after the BBN seminar to avoid the possibility that testing alone affected performance. Pre- and postseminar differences for Group A were analyzed with a paired t-test, and comparisons between Groups A and B were analyzed with an independent t-test. Compared to the preseminar result, Group A's postseminar scores improved significantly, indicating that the VH program was sensitive to differences in assessing performance-based competence in BBN. Postseminar scores of Group A and Group B were not significantly different, indicating that both groups performed similarly on the VH program. Improved pre-post scores demonstrate acquisition of skills in BBN to a VH patient. Pretest sensitization did not appear to influence posttest assessment. These results provide initial construct validity evidence that the VH program is effective for assessing BBN performance-based communication competence.
Is there any evidence for the validity of diagnostic criteria used for accommodative and nonstrabismic binocular dysfunctions?

PubMed

Cacho-Martínez, Pilar; García-Muñoz, Ángel; Ruiz-Cantero, María Teresa

2014-01-01

To analyze the diagnostic criteria used in the scientific literature published in the past 25 years for accommodative and nonstrabismic binocular dysfunctions and to explore if the epidemiological analysis of diagnostic validity has been used to propose which clinical criteria should be used for diagnostic purposes. We carried out a systematic review of papers on accommodative and nonstrabismic binocular disorders published from 1986 to 2012, analysing the MEDLINE, CINAHL, PsycINFO and FRANCIS databases. We admitted original articles about diagnosis of these anomalies in any population. We identified 839 articles, of which 12 studies were included. The quality of included articles was assessed using the QUADAS-2 tool. The review shows a wide range of clinical signs and cut-off points between authors. Only 3 studies (regarding accommodative anomalies) assessed diagnostic accuracy of clinical signs. Their results suggest using the accommodative amplitude and monocular accommodative facility for diagnosing accommodative insufficiency and a high positive relative accommodation for accommodative excess. The remaining 9 articles did not analyze diagnostic accuracy, assessing a diagnosis with the criteria the authors considered. We also found differences between studies in the way of considering patients' symptomatology. Three of the 12 studies analyzed validated a symptom survey used for convergence insufficiency. Scientific literature reveals differences between authors according to diagnostic criteria for accommodative and nonstrabismic binocular dysfunctions. Diagnostic accuracy studies show that there is only certain evidence for accommodative conditions. For binocular anomalies there is only evidence about a validated questionnaire for convergence insufficiency with no data of diagnostic accuracy. Copyright © 2012 Spanish General Council of Optometry. Published by Elsevier España. All rights reserved.
Beliefs about language development: construct validity evidence.

PubMed

Donahue, Mavis L; Fu, Qiong; Smith, Everett V

2012-01-01

Understanding language development is incomplete without recognizing children's sociocultural environments, including adult beliefs about language development. Yet there is a need for data supporting valid inferences to assess these beliefs. The current study investigated the psychometric properties of data from a survey (MODeL) designed to explore beliefs in the popular culture, and their alignment with more formal theories. Support for the content, substantive, structural, generalizability, and external aspects of construct validity of the data were investigated. Subscales representing Behaviorist, Cognitive, Nativist, and Sociolinguistic models were identified as dimensions of beliefs. More than half of the items showed a high degree of consensus, suggesting culturally-transmitted beliefs. Behaviorist ideas were most popular. Bilingualism and ethnicity were related to Cognitive and Sociolinguistic beliefs. Identifying these beliefs may clarify the nature of child-directed speech, and enable the design of language intervention programs that are congruent with family and cultural expectations.
Application of validity theory and methodology to patient-reported outcome measures (PROMs): building an argument for validity.

PubMed

Hawkins, Melanie; Elsworth, Gerald R; Osborne, Richard H

2018-07-01

Data from subjective patient-reported outcome measures (PROMs) are now being used in the health sector to make or support decisions about individuals, groups and populations. Contemporary validity theorists define validity not as a statistical property of the test but as the extent to which empirical evidence supports the interpretation of test scores for an intended use. However, validity testing theory and methodology are rarely evident in the PROM validation literature. Application of this theory and methodology would provide structure for comprehensive validation planning to support improved PROM development and sound arguments for the validity of PROM score interpretation and use in each new context. This paper proposes the application of contemporary validity theory and methodology to PROM validity testing. The validity testing principles will be applied to a hypothetical case study with a focus on the interpretation and use of scores from a translated PROM that measures health literacy (the Health Literacy Questionnaire or HLQ). Although robust psychometric properties of a PROM are a pre-condition to its use, a PROM's validity lies in the sound argument that a network of empirical evidence supports the intended interpretation and use of PROM scores for decision making in a particular context. The health sector is yet to apply contemporary theory and methodology to PROM development and validation. The theoretical and methodological processes in this paper are offered as an advancement of the theory and practice of PROM validity testing in the health sector.
Evidence for single metal two electron oxidative addition and reductive elimination at uranium.

PubMed

Gardner, Benedict M; Kefalidis, Christos E; Lu, Erli; Patel, Dipti; McInnes, Eric J L; Tuna, Floriana; Wooles, Ashley J; Maron, Laurent; Liddle, Stephen T

2017-12-01

Reversible single-metal two-electron oxidative addition and reductive elimination are common fundamental reactions for transition metals that underpin major catalytic transformations. However, these reactions have never been observed together in the f-block because these metals exhibit irreversible one- or multi-electron oxidation or reduction reactions. Here we report that azobenzene oxidises sterically and electronically unsaturated uranium(III) complexes to afford a uranium(V)-imido complex in a reaction that satisfies all criteria of a single-metal two-electron oxidative addition. Thermolysis of this complex promotes extrusion of azobenzene, where H-/D-isotopic labelling finds no isotopomer cross-over and the non-reactivity of a nitrene-trap suggests that nitrenes are not generated and thus a reductive elimination has occurred. Though not optimally balanced in this case, this work presents evidence that classical d-block redox chemistry can be performed reversibly by f-block metals, and that uranium can thus mimic elementary transition metal reactivity, which may lead to the discovery of new f-block catalysis.
Measuring Decision-Making During Thyroidectomy: Validity Evidence for a Web-Based Assessment Tool.

PubMed

Madani, Amin; Gornitsky, Jordan; Watanabe, Yusuke; Benay, Cassandre; Altieri, Maria S; Pucher, Philip H; Tabah, Roger; Mitmaker, Elliot J

2018-02-01

Errors in judgment during thyroidectomy can lead to recurrent laryngeal nerve injury and other complications. Despite the strong link between patient outcomes and intraoperative decision-making, methods to evaluate these complex skills are lacking. The purpose of this study was to develop objective metrics to evaluate advanced cognitive skills during thyroidectomy and to obtain validity evidence for them. An interactive online learning platform was developed ( www.thinklikeasurgeon.com ). Trainees and surgeons from four institutions completed a 33-item assessment, developed based on a cognitive task analysis and expert Delphi consensus. Sixteen items required subjects to make annotations on still frames of thyroidectomy videos, and accuracy scores were calculated based on an algorithm derived from experts' responses ("visual concordance test," VCT). Seven items were short answer (SA), requiring users to type their answers, and scores were automatically calculated based on their similarity to a pre-populated repertoire of correct responses. Test-retest reliability, internal consistency, and correlation of scores with self-reported experience and training level (novice, intermediate, expert) were calculated. Twenty-eight subjects (10 endocrine surgeons and otolaryngologists, 18 trainees) participated. There was high test-retest reliability (intraclass correlation coefficient = 0.96; n = 10) and internal consistency (Cronbach's α = 0.93). The assessment demonstrated significant differences between novices, intermediates, and experts in total score (p < 0.01), VCT score (p < 0.01) and SA score (p < 0.01). There was high correlation between total case number and total score (ρ = 0.95, p < 0.01), between total case number and VCT score (ρ = 0.93, p < 0.01), and between total case number and SA score (ρ = 0.83, p < 0.01). This study describes the development of novel metrics and provides validity evidence for an interactive Web-based platform
Predictive validity evidence for medical education research study quality instrument scores: quality of submissions to JGIM's Medical Education Special Issue.

PubMed

Reed, Darcy A; Beckman, Thomas J; Wright, Scott M; Levine, Rachel B; Kern, David E; Cook, David A

2008-07-01

Deficiencies in medical education research quality are widely acknowledged. Content, internal structure, and criterion validity evidence support the use of the Medical Education Research Study Quality Instrument (MERSQI) to measure education research quality, but predictive validity evidence has not been explored. To describe the quality of manuscripts submitted to the 2008 Journal of General Internal Medicine (JGIM) medical education issue and determine whether MERSQI scores predict editorial decisions. Cross-sectional study of original, quantitative research studies submitted for publication. Study quality measured by MERSQI scores (possible range 5-18). Of 131 submitted manuscripts, 100 met inclusion criteria. The mean (SD) total MERSQI score was 9.6 (2.6), range 5-15.5. Most studies used single-group cross-sectional (54%) or pre-post designs (32%), were conducted at one institution (78%), and reported satisfaction or opinion outcomes (56%). Few (36%) reported validity evidence for evaluation instruments. A one-point increase in MERSQI score was associated with editorial decisions to send manuscripts for peer review versus reject without review (OR 1.31, 95%CI 1.07-1.61, p = 0.009) and to invite revisions after review versus reject after review (OR 1.29, 95%CI 1.05-1.58, p = 0.02). MERSQI scores predicted final acceptance versus rejection (OR 1.32; 95% CI 1.10-1.58, p = 0.003). The mean total MERSQI score of accepted manuscripts was significantly higher than rejected manuscripts (10.7 [2.5] versus 9.0 [2.4], p = 0.003). MERSQI scores predicted editorial decisions and identified areas of methodological strengths and weaknesses in submitted manuscripts. Researchers, reviewers, and editors might use this instrument as a measure of methodological quality.
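The odds ratios in this record are expressed per one-point increase in MERSQI score. Under the usual logistic-regression assumption that the effect is linear on the log-odds scale, a difference of several points multiplies the odds by the per-point odds ratio raised to that power. The sketch below illustrates this arithmetic; the function names and the baseline probability are hypothetical and are not reported in the abstract.

```python
import math


def odds_ratio_for_difference(or_per_point, points):
    """Odds ratio implied by a score difference of `points`.

    Assumes linearity on the log-odds scale, as in a logistic model
    with the score entered as a single continuous predictor.
    """
    return math.exp(math.log(or_per_point) * points)


def apply_odds_ratio(probability, odds_ratio):
    """Convert a baseline probability to the probability after scaling its odds."""
    odds = probability / (1 - probability)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# Per-point odds ratio for being sent to peer review, as reported (OR 1.31).
two_point_or = odds_ratio_for_difference(1.31, 2)

# Hypothetical baseline: a 50% chance of being sent for review.
print(two_point_or)
print(apply_odds_ratio(0.5, two_point_or))
```

An odds ratio is not a ratio of probabilities, so the change in probability depends on the baseline chosen; the 50% figure above is used only to make the conversion concrete.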
Evidence of Validity for the Japanese Version of the Foot and Ankle Ability Measure

PubMed Central

Uematsu, Daisuke; Suzuki, Hidetomo; Sasaki, Shogo; Nagano, Yasuharu; Shinozuka, Nobuyuki; Sunagawa, Norihiko; Fukubayashi, Toru

2015-01-01

Context: The Foot and Ankle Ability Measure (FAAM) is a valid, reliable, and self-reported outcome instrument for the foot and ankle region. Objective: To provide evidence for translation, cross-cultural adaptation, validity, and reliability of the Japanese version of the FAAM (FAAM-J). Design: Cross-sectional study. Setting: Collegiate athletic training/sports medicine clinical setting. Patients or Other Participants: Eighty-three collegiate athletes. Main Outcome Measure(s): All participants completed the Activities of Daily Living and Sports subscales of the FAAM-J and the Physical Functioning and Mental Health subscales of the Japanese version of the Short Form-36v2 (SF-36). Also, 19 participants (23%) whose conditions were expected to be stable completed another FAAM-J 2 to 6 days later for test-retest reliability. We analyzed the scores of those subscales for convergent and divergent validity, internal consistency, and test-retest reliability. Results: The Activities of Daily Living and Sports subscales of the FAAM-J had correlation coefficients of 0.86 and 0.75, respectively, with the Physical Functioning section of the SF-36 for convergent validity. For divergent validity, the correlation coefficients with Mental Health of the SF-36 were 0.29 and 0.27 for each subscale, respectively. Cronbach α for internal consistency was 0.99 for the Activities of Daily Living and 0.98 for the Sports subscale. A 95% confidence interval with a single measure was ±8.1 and ±14.0 points for each subscale. The test-retest reliability measures revealed intraclass correlation coefficient values of 0.87 for the Activities of Daily Living and 0.91 for the Sports subscales with minimal detectable changes of ±6.8 and ±13.7 for the respective subscales. Conclusions: The FAAM was successfully translated for a Japanese version, and the FAAM-J was adapted cross-culturally. Thus, the FAAM-J can be used as a self-reported outcome measure for Japanese-speaking individuals; however
Assessing Procedural Competence: Validity Considerations.

PubMed

Pugh, Debra M; Wood, Timothy J; Boulet, John R

2015-10-01

Simulation-based medical education (SBME) offers opportunities for trainees to learn how to perform procedures and to be assessed in a safe environment. However, SBME research studies often lack robust evidence to support the validity of the interpretation of the results obtained from tools used to assess trainees' skills. The purpose of this paper is to describe how a validity framework can be applied when reporting and interpreting the results of a simulation-based assessment of skills related to performing procedures. The authors discuss various sources of validity evidence as they relate to SBME. A case study is presented.
The Validity of Adding ECG to the Preparticipation Screening of Athletes: An Evidence Based Literature Review

PubMed Central

Alattar, A; Maffulli, N

2015-01-01

Objective: To review the available evidence establishing the validity of adding electrocardiogram to the preparticipation cardiac screening in athletes. Data Sources: MEDLINE and CINAHL databases were searched. Additional references from the bibliographies of retrieved articles were also reviewed and experts in the area were contacted. Selection Criteria: Only original research articles seeking to establish the use of electrocardiography followed by second line investigations in athletes under 36 years of age were reviewed. Search Result and Quality Assessment: The initial literature search identified 226 papers. Of these, 16 original articles (all type II evidence—population-based clinical studies) met the selection criteria and directly related to the use of electrocardiography in athletes' cardiac screening. The methodological qualities of included studies were assessed using the Downs and Black checklist. Conclusion: Screening with electrocardiography represents best clinical practice to prevent or reduce the risk of sudden cardiac death in athletes. It significantly improves the sensitivity of history and physical examination alone; it has reasonable specificity and excellent negative predictive value; and it is cost-effective. Future studies must be large, multicentre, multination, prospective trials powered to determine how different screening options affect the incidence of sudden cardiac death. Efforts should also be targeted toward secondary prevention of sudden cardiac death with pitch side cardiac resuscitation and the immediate use of a defibrillator. PMID:25674543
Educational testing validity and reliability in pharmacy and medical education literature.

PubMed

Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J

2013-12-16

To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals to medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education compared to medical education, validity and reliability reporting were limited in the pharmacy education literature.
First evidence on the validity and reliability of the Safety Organizing Scale-Nursing Home version (SOS-NH).

PubMed

Ausserhofer, Dietmar; Anderson, Ruth A; Colón-Emeric, Cathleen; Schwendimann, René

2013-08-01

The Safety Organizing Scale is a valid and reliable measure on safety behaviors and practices in hospitals. This study aimed to explore the psychometric properties of the Safety Organizing Scale-Nursing Home version (SOS-NH). In a cross-sectional analysis of staff survey data, we examined validity and reliability of the 9-item SOS-NH using American Educational Research Association guidelines. This substudy of a larger trial used baseline survey data collected from staff members (n = 627) in a variety of work roles in 13 nursing homes (NHs) in North Carolina and Virginia. Psychometric evaluation of the SOS-NH revealed good response patterns with low average of missing values across all items (3.05%). Analyses of the SOS-NH's internal structure (eg, comparative fit indices = 0.929, standardized root mean square error of approximation = 0.045) and consistency (composite reliability = 0.94) suggested its 1-dimensionality. Significant between-facility variability, intraclass correlations, within-group agreement, and design effect confirmed appropriateness of the SOS-NH for measurement at the NH level, justifying data aggregation. The SOS-NH showed discriminant validity from one related concept: communication openness. Initial evidence regarding validity and reliability of the SOS-NH supports its utility in measuring safety behaviors and practices among a wide range of NH staff members, including those with low literacy. Further psychometric evaluation should focus on testing concurrent and criterion validity, using resident outcome measures (eg, patient fall rates). Copyright © 2013 American Medical Directors Association, Inc. All rights reserved.
Validity of the SAT® for Predicting First-Year Grades: 2010 SAT Validity Sample. Statistical Report 2013-2

ERIC Educational Resources Information Center

Patterson, Brian F.; Mattern, Krista D.

2013-01-01

The continued accumulation of validity evidence for the core uses of educational assessments is critical to ensure that proper inferences will be made for those core purposes. To that end, the College Board has continued to follow previous cohorts of college students and this report provides updated validity evidence for using the SAT to predict…
Validity and reliability of instruments aimed at measuring Evidence-Based Practice in Physical Therapy: a systematic review of the literature.

PubMed

Fernández-Domínguez, Juan Carlos; Sesé-Abad, Albert; Morales-Asencio, Jose Miguel; Oliva-Pascual-Vaca, Angel; Salinas-Bueno, Iosune; de Pedro-Gómez, Joan Ernest

2014-12-01

Our goal is to compile and analyse the characteristics - especially validity and reliability - of all the existing international tools that have been used to measure evidence-based clinical practice in physiotherapy. A systematic review conducted with data from exclusively quantitative-type studies synthesized in narrative format. An in-depth search of the literature was conducted in two phases: initial, structured, electronic search of databases and also journals with summarized evidence; followed by a residual-directed search in the bibliographical references of the main articles found in the primary search procedure. The studies included were assigned to members of the research team who acted as peer reviewers. Relevant information was extracted from each of the selected articles using a template that included the general characteristics of the instrument as well as an analysis of the quality of the validation processes carried out, by following the criteria of Terwee. Twenty-four instruments were found to comply with the review screening criteria; however, in all cases, they were found to be limited as regards the 'constructs' included. Besides, they can all be seen to be lacking as regards comprehensiveness associated to the validation process of the psychometric tests used. It seems that what constitutes a rigorously developed assessment instrument for EBP in physical therapy continues to be a challenge. © 2014 John Wiley & Sons, Ltd.
Initial construct validity evidence of a virtual human application for competency assessment in breaking bad news to a cancer patient

PubMed Central

Guetterman, Timothy C; Kron, Frederick W; Campbell, Toby C; Scerbo, Mark W; Zelenski, Amy B; Cleary, James F; Fetters, Michael D

2017-01-01

Background: Despite interest in using virtual humans (VHs) for assessing health care communication, evidence of validity is limited. We evaluated the validity of a VH application, MPathic-VR, for assessing performance-based competence in breaking bad news (BBN) to a VH patient. Methods: We used a two-group quasi-experimental design, with residents participating in a 3-hour seminar on BBN. Group A (n=15) completed the VH simulation before and after the seminar, and Group B (n=12) completed the VH simulation only after the BBN seminar to avoid the possibility that testing alone affected performance. Pre- and postseminar differences for Group A were analyzed with a paired t-test, and comparisons between Groups A and B were analyzed with an independent t-test. Results: Compared to the preseminar result, Group A’s postseminar scores improved significantly, indicating that the VH program was sensitive to differences in assessing performance-based competence in BBN. Postseminar scores of Group A and Group B were not significantly different, indicating that both groups performed similarly on the VH program. Conclusion: Improved pre–post scores demonstrate acquisition of skills in BBN to a VH patient. Pretest sensitization did not appear to influence posttest assessment. These results provide initial construct validity evidence that the VH program is effective for assessing BBN performance-based communication competence. PMID:28794664
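The two comparisons named in the abstract above (a paired t-test for pre/post scores within one group, an independent t-test between two groups) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' analysis code; the function names and the pooled-variance (Student) form of the independent test are assumptions, and the sketch returns only the t statistic and degrees of freedom, not a p-value.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired t-test on matched pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(pre, post)]
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def independent_t(a, b):
    """Return (t, df) for a pooled-variance t-test between two groups."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 observations")
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("no variability in either group; t is undefined")
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2
```

The scores passed to these functions would be hypothetical; the abstract does not report raw data, so nothing here reproduces the study's results.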
Validating the Implementation Climate Scale (ICS) in Child Welfare Organizations

PubMed Central

Ehrhart, Mark G.; Torres, Elisa M.; Wright, Lisa A.; Martinez, Sandra Y.; Aarons, Gregory A.

2015-01-01

There is increasing emphasis on the use of evidence-based practices (EBPs) in child welfare settings and growing recognition of the importance of the organizational environment, and the organization’s climate in particular, for how employees perceive and support EBP implementation. Recently, Ehrhart, Aarons, and Farahnak (2014) reported on the development and validation of a measure of EBP implementation climate, the Implementation Climate Scale (ICS), in a sample of mental health clinicians. The ICS consists of 18 items and measures six critical dimensions of implementation climate: focus on EBP, educational support for EBP, recognition for EBP, rewards for EBP, selection for EBP, and selection for openness. The goal of the current study is to extend this work by providing evidence for the factor structure, reliability, and validity of the ICS in a sample of child welfare service providers. Survey data were collected from 215 child welfare providers across three states, 12 organizations, and 43 teams. Confirmatory factor analysis demonstrated good fit to the six-factor model and the alpha reliabilities for the overall measure and its subscales were acceptable. In addition, there was general support for the invariance of the factor structure across the child welfare and mental health sectors. In conclusion, this study provides evidence for the factor structure, reliability, and validity of the ICS measure for use in child welfare service organizations. PMID:26563643
Validating the Implementation Climate Scale (ICS) in child welfare organizations.

PubMed

Ehrhart, Mark G; Torres, Elisa M; Wright, Lisa A; Martinez, Sandra Y; Aarons, Gregory A

2016-03-01

There is increasing emphasis on the use of evidence-based practices (EBPs) in child welfare settings and growing recognition of the importance of the organizational environment, and the organization's climate in particular, for how employees perceive and support EBP implementation. Recently, Ehrhart, Aarons, and Farahnak (2014) reported on the development and validation of a measure of EBP implementation climate, the Implementation Climate Scale (ICS), in a sample of mental health clinicians. The ICS consists of 18 items and measures six critical dimensions of implementation climate: focus on EBP, educational support for EBP, recognition for EBP, rewards for EBP, selection for EBP, and selection for openness. The goal of the current study is to extend this work by providing evidence for the factor structure, reliability, and validity of the ICS in a sample of child welfare service providers. Survey data were collected from 215 child welfare providers across three states, 12 organizations, and 43 teams. Confirmatory factor analysis demonstrated good fit to the six-factor model and the alpha reliabilities for the overall measure and its subscales were acceptable. In addition, there was general support for the invariance of the factor structure across the child welfare and mental health sectors. In conclusion, this study provides evidence for the factor structure, reliability, and validity of the ICS measure for use in child welfare service organizations. Copyright © 2015 Elsevier Ltd. All rights reserved.
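The "alpha reliabilities" mentioned in the abstract above refer to Cronbach's alpha, an internal-consistency coefficient computed from item variances and the variance of the total score. The sketch below shows the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of totals); the function name and the data layout (one list of scores per item, respondents in the same order) are assumptions made for illustration, and no ICS data are reproduced.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of k lists; each inner list holds one item's scores
    for the same n respondents, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least 2 items")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("every item needs the same >= 2 respondents")
    item_vars = [statistics.variance(col) for col in items]  # sample variances
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sums
    total_var = statistics.variance(totals)
    if total_var == 0:
        raise ValueError("total scores have no variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Two perfectly correlated items with equal variance give an alpha of 1, and alpha falls as the items covary less.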
Integrating Validity Theory with Use of Measurement Instruments in Clinical Settings

PubMed Central

Kelly, P Adam; O'Malley, Kimberly J; Kallen, Michael A; Ford, Marvella E

2005-01-01

Objective: To present validity concepts in a conceptual framework useful for research in clinical settings. Principal Findings: We present a three-level decision rubric for validating measurement instruments, to guide health services researchers step-by-step in gathering and evaluating validity evidence within their specific situation. We address construct precision, the capacity of an instrument to measure constructs it purports to measure and differentiate from other, unrelated constructs; quantification precision, the reliability of the instrument; and translation precision, the ability to generalize scores from an instrument across subjects from the same or similar populations. We illustrate with specific examples, such as an approach to validating a measurement instrument for veterans when prior evidence of instrument validity for this population does not exist. Conclusions: Validity should be viewed as a property of the interpretations and uses of scores from an instrument, not of the instrument itself: how scores are used and the consequences of this use are integral to validity. Our advice is to liken validation to building a court case, including discovering evidence, weighing the evidence, and recognizing when the evidence is weak and more evidence is needed. PMID:16178998
Using the Multiple-Choice Procedure to Measure the Relative Reinforcing Efficacy of Gambling: Initial Validity Evidence Among College Students.

PubMed

Butler, Leon H; Irons, Jessica G; Bassett, Drew T; Correia, Christopher J

2018-06-01

The multiple choice procedure (MCP) is used to assess the relative reinforcing value of concurrently available stimuli. The MCP was originally developed to assess the reinforcing value of drugs; the current within-subjects study employed the MCP to assess the reinforcing value of gambling behavior. Participants (N = 323) completed six versions of the MCP that presented hypothetical choices between money to be used while gambling ($10 or $25) versus escalating amounts of guaranteed money available immediately or after delays of either 1 week or 1 month. Results suggest that choices on the MCP are correlated with other measures of gambling behavior, thus providing concurrent validity data for using the MCP to quantify the relative reinforcing value of gambling. The MCP for gambling also displayed sensitivity to reinforcer magnitude and delay effects, which provides evidence of criterion validity. The results are consistent with a behavioral economic model of addiction and suggest that the MCP could be a valid tool for future research on gambling behavior.

Further evidence for the reliability and validity of the Modified Dental Anxiety Scale.

PubMed

Humphris, G M; Freeman, R; Campbell, J; Tuutti, H; D'Souza, V

2000-12-01

To gain further evidence of the psychometric properties of the Modified Dental Anxiety Scale. Dental admission clinics. Consecutive sampling, cross-sectional survey. Patients (n = 800) in four cities (Belfast, Northern Ireland; Helsinki, Finland; Jyväskylä, Finland and Dubai, UAE). Questionnaire booklet handed to patients, attending clinics, for completion following an invitation by the researcher to be included in the study. Modified Dental Anxiety Scale (MDAS), together with further questions concerning dental attendance and nervousness about dental procedures. Overall 9.3 per cent of patients indicated high dental anxiety. MDAS showed high levels of internal consistency, and good construct validity. The relationship of dental anxiety with age was similar to previous reports and showed lowered anxiety levels in older patients. Data from three countries has supported the psychometric properties of this modified and brief dental anxiety scale.
Multi-analyte validation in heterogeneous solution by ELISA.

PubMed

Lakshmipriya, Thangavel; Gopinath, Subash C B; Hashim, Uda; Murugaiyah, Vikneswaran

2017-12-01

Enzyme Linked Immunosorbent Assay (ELISA) is a standard assay that has been used widely to validate the presence of analyte in the solution. With the advancement of ELISA, different strategies have been demonstrated, making it a suitable immunoassay for a wide range of analytes. Herein, we attempted to provide additional evidence with ELISA, to show its suitability for multi-analyte detection. To demonstrate, three clinically relevant targets have been chosen, which include 16kDa protein from Mycobacterium tuberculosis, human blood clotting Factor IXa and a tumour marker Squamous Cell Carcinoma antigen. Indeed, we adapted the routine steps from the conventional ELISA to validate the occurrence of analytes both in homogeneous and heterogeneous solutions. With the homogeneous and heterogeneous solutions, we could attain the sensitivity of 2, 8 and 1nM for the targets 16kDa protein, FIXa and SCC antigen, respectively. Further, the specific multi-analyte validations were evidenced with the similar sensitivities in the presence of human serum. The ELISA in this study has proven its applicability for genuine multiple target validation in heterogeneous solution, and the approach can be followed for other target validations. Copyright © 2017 Elsevier B.V. All rights reserved.
Are the available apathy measures reliable and valid? A review of the psychometric evidence

PubMed Central

Clarke, Diana E.; Ko, Jean Y.; Kuhl, Emily A.; van Reekum, Robert; Salvador, Rocio; Marin, Robert S.

2014-01-01

Objective: Apathy is highly prevalent among neuropsychiatric populations and is associated with greater morbidity and worse functional outcomes. Despite this, it remains understudied and poorly understood, primarily due to lack of consensus definition and clear diagnostic criteria for apathy. Without a gold standard for defining and measuring apathy, the availability of empirically sound measures is imperative. This paper provides a psychometric review of the most commonly used apathy measures and provides recommendations for use and further research. Methods: Pertinent literature databases were searched to identify all available assessment tools for apathy in adults aged 18 and older. Evidence of the reliability and validity of the scales was examined. Alternate variations of scales (e.g., non-English versions) were also evaluated if the validating articles were written in English. Results: Fifteen apathy scales or subscales were examined. The most psychometrically robust measures for assessing apathy across any disease population appear to be the Apathy Evaluation Scale and the apathy subscale of the Neuropsychiatric Inventory based on the criteria set in this review. For assessment in specific populations, the Dementia Apathy Interview and Rating for patients with Alzheimer’s dementia, the Positive and Negative Symptom Scale for schizophrenia populations, and the Frontal System Behavior Scale for patients with fronto-temporal deficits are reliable and valid measures. Conclusion: Clinicians and researchers have numerous apathy scales for use in broad and disease-specific neuropsychiatric populations. Our understanding of apathy would be advanced by research that helps build a consensus as to the definition and diagnosis of apathy, and further refine the psychometric properties of all apathy assessment tools. PMID:21193104
Validity of three clinical performance assessments of internal medicine clerks.

PubMed

Hull, A L; Hodder, S; Berger, B; Ginsberg, D; Lindheim, N; Quan, J; Kleinhenz, M E

1995-06-01

To analyze the construct validity of three methods to assess the clinical performances of internal medicine clerks. A multitrait-multimethod (MTMM) study was conducted at the Case Western Reserve University School of Medicine to determine the convergent and divergent validity of a clinical evaluation form (CEF) completed by faculty and residents, an objective structured clinical examination (OSCE), and the medicine subject test of the National Board of Medical Examiners. Three traits were involved in the analysis: clinical skills, knowledge, and personal characteristics. A correlation matrix was computed for 410 third-year students who completed the clerkship between August 1988 and July 1991. There was a significant (p < .01) convergence of the four correlations that assessed the same traits by using different methods. However, the four convergent correlations were of moderate magnitude (ranging from .29 to .47). Divergent validity was assessed by comparing the magnitudes of the convergence correlations with the magnitudes of correlations among unrelated assessments (i.e., different traits by different methods). Seven of nine possible coefficients were smaller than the convergent coefficients, suggesting evidence of divergent validity. A significant CEF method effect was identified. There was convergent validity and some evidence of divergent validity with a significant method effect. The findings were similar for correlations corrected for attenuation. Four conclusions were reached: (1) the reliability of the OSCE must be improved, (2) the CEF ratings must be redesigned to further discriminate among the specific traits assessed, (3) additional methods to assess personal characteristics must be instituted, and (4) several assessment methods should be used to evaluate individual student performances.
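The abstract above notes that findings were similar for correlations "corrected for attenuation". A minimal sketch of the classical (Spearman) correction is given below: the observed correlation is divided by the square root of the product of the two measures' reliabilities. The function name is an assumption, and the reliability values used in any call would be hypothetical, since the abstract does not report them.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Estimate the correlation between true scores.

    r_xy  : observed correlation between measures X and Y
    rel_x : reliability of X (0 < rel_x <= 1)
    rel_y : reliability of Y (0 < rel_y <= 1)
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)
```

For example, an observed correlation of 0.40 between a measure with reliability 0.64 and a perfectly reliable one corresponds to a corrected value of 0.40 / 0.8. With low reliabilities the corrected estimate can exceed 1, which is usually read as a sign that the reliability estimates are too low rather than as a meaningful correlation.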
Validation of gamma irradiator controls for quality and regulatory compliance

NASA Astrophysics Data System (ADS)

Harding, Rorry B.; Pinteric, Francis J. A.

1995-09-01

Since 1978 the U.S. Food and Drug Administration (FDA) has had both the legal authority and the Current Good Manufacturing Practice (CGMP) regulations in place to require irradiator owners who process medical devices to produce evidence of Irradiation Process Validation. One of the key components of Irradiation Process Validation is the validation of the irradiator controls. However, it is only recently that FDA audits have focused on this component of the process validation. What is Irradiator Control System Validation? What constitutes evidence of control? How do owners obtain evidence? What is the irradiator supplier's role in validation? How does the ISO 9000 Quality Standard relate to the FDA's CGMP requirement for evidence of Control System Validation? This paper presents answers to these questions based on the recent experiences of Nordion's engineering and product management staff who have worked with several US-based irradiator owners. This topic — Validation of Irradiator Controls — is a significant regulatory compliance and operations issue within the irradiator suppliers' and users' community.
Capability beliefs regarding evidence-based practice are associated with application of EBP and research use: validation of a new measure.

PubMed

Wallin, Lars; Boström, Anne-Marie; Gustavsson, J Petter

2012-08-01

Beliefs about capabilities, or self-efficacy, is a construct originating in social cognitive psychology. Capability beliefs have been found to be positively associated with intention and healthcare practice behaviour. A measure of an individual's beliefs about his/her capability to apply the components of evidence-based practice (EBP) has potential to be useful in implementation research. To evaluate the concurrent validity and internal structure of a new scale measuring nurses' capability beliefs regarding EBP. Data were taken from a prospective longitudinal study in Sweden (the Longitudinal Analyses of Nursing Education and Entry in Worklife [LANE]). A cohort of nursing students who graduated in the autumn of 2004 that was followed up 2 years after their graduation was used (n= 1,256). Concurrent validity was tested relating different levels of capability beliefs to extent of research use and application of EBP. An item-response approach was applied in the evaluation of internal structure of the proposed scale (six items). The psychometric analyses indicated that the six items could be summed to reflect a one-dimensional scale. Nurses with the highest level of capability beliefs reported that they used research findings in clinical practice more than twice as often as those with lower levels of capability beliefs. They also participated in the implementation of evidence seven times more often. There is a need for further studies of the construct and predictive validity of the scale. It should also be validated in other groups of health professionals. Learning including mastery experiences, role modelling, social persuasion, and manageable stress could be used in undergraduate education as well as practice development to increase beliefs about capabilities which might open the way to increased application of EBP in healthcare practice. This new measure is well grounded in social cognitive theory, functions as a one-dimensional scale and possesses promising
Stopcocks for Infusion Therapy: Evidence and Experience.

PubMed

Hadaway, Lynn

Stopcocks have been used for decades to deliver infusion therapy in patients of all ages and in all health care settings. During the past 20 years, a growing number of studies have validated concern about the risk of the open lumen allowing intraluminal contamination. Additional studies highlight fluid flow dynamics associated with stopcocks. This integrative literature review and clinician practice survey analyzes the published evidence and reports of actual practices with stopcocks, and raises issues about practice changes that could reduce these risks.
Evidence on the validity and reliability of the German, French and Italian nursing home version of the Basel Extent of Rationing of Nursing Care instrument.

PubMed

Zúñiga, Franziska; Schubert, Maria; Hamers, Jan P H; Simon, Michael; Schwendimann, René; Engberg, Sandra; Ausserhofer, Dietmar

2016-08-01

To develop and test psychometrically the Basel Extent of Rationing of Nursing Care for Nursing Homes instrument, providing initial evidence on the validity and reliability of the German, French and Italian-language versions. In the hospital setting, implicit rationing of nursing care is defined as the withholding of nursing activities due to lack of resources, such as staffing or time. No instrument existed to measure this concept in nursing homes. Cross-sectional study. We developed the instrument in three phases: (1) adaption and translation; (2) content validity testing; and (3) initial validity and reliability testing. For phase 3, we analysed survey data from 4748 care workers collected between May 2012-April 2013 from a randomly selected sample of 162 nursing homes in the German-, French- and Italian-speaking regions of Switzerland to provide evidence from response processes (e.g. missing), internal structure (exploratory factor analysis), inter-item inconsistencies (e.g. Cronbach's alpha) and interscorer differences (e.g. within-group agreement). Exploratory factor analysis revealed a four-factor structure with good fit statistics. Rationing of nursing care was structured in four domains: (1) activities of daily living; (2) caring, rehabilitation and monitoring; (3) documentation; and (4) social care. Items of the social care subscale showed lower content validity and more missing values than items of other subscales. First evidence indicates that the new instrument can be recommended for research and practice to measure implicit rationing of nursing care in nursing homes. Further refinements of single items are needed. © 2016 John Wiley & Sons Ltd.
Development of the Mini-Assisting Hand Assessment: evidence for content and internal scale validity.

PubMed

Greaves, Susan; Imms, Christine; Dodd, Karen; Krumlinde-Sundholm, Lena

2013-11-01

To describe the development of the Mini-Assisting Hand Assessment (Mini-AHA) for children with signs of unilateral cerebral palsy (CP) aged 8 to 18 months, and evaluate aspects of content and internal scale validity. The ability of the video-recorded Mini-AHA play session to provoke bimanual performance in children with unilateral CP and typical development was evaluated. Original AHA test items were examined for their suitability for younger children and possible new items were generated. Data from 108 assessments of children with unilateral CP (86 children, 53 males, 33 females; mean age 13 mo, SD 3 mo, range 8-18 mo) were entered into a Rasch measurement model analysis to evaluate internal scale validity. A Spearman's correlation analysis explored the relationship between age and ability measures for children with unilateral CP. The frequency of maximum scores in 40 children with typical development (22 males, 18 females; mean age 12 mo, SD 3 mo) was examined. The Mini-AHA play session provoked bimanual responses in typically developing children 99% of the time. Person and item fit criteria established 20 items for the scale. The resultant unidimensional scale also demonstrated excellent discriminative features through high separation reliability. The item calibration values covered the range of person ability measures well. Age was not related to the ability measures for children with unilateral CP (rs = 0.178). All children with typical development achieved maximum scores. Accumulated evidence shows that the Mini-AHA validly measures use of the affected hand during bimanual performance for children with unilateral CP aged 8 to 18 months. The Mini-AHA has the potential to be a useful assessment to evaluate functional hand use and the effects of intervention in an age group when potential for change is high. © 2013 Mac Keith Press.
Additivity pretraining and cue competition effects: developmental evidence for a reasoning-based account of causal learning.

PubMed

Simms, Victoria; McCormack, Teresa; Beckers, Tom

2012-04-01

The effect of additivity pretraining on blocking has been taken as evidence for a reasoning account of human and animal causal learning. If inferential reasoning underpins this effect, then developmental differences in the magnitude of this effect in children would be expected. Experiment 1 examined cue competition effects in children's (4- to 5-year-olds and 6- to 7-year-olds) causal learning using a new paradigm analogous to the food allergy task used in studies of human adult causal learning. Blocking was stronger in the older than the younger children, and additivity pretraining only affected blocking in the older group. Unovershadowing was not affected by age or by pretraining. In experiment 2, levels of blocking were found to be correlated with the ability to answer questions that required children to reason about additivity. Our results support an inferential reasoning explanation of cue competition effects. (c) 2012 APA, all rights reserved.
Validity evidence for the adaptation of the State Mindfulness Scale for Physical Activity (SMS-PA) in Spanish youth.

PubMed

Ullrich-French, Sarah; González Hernández, Juan; Hidalgo Montesinos, María D

2017-02-01

Mindfulness is an increasingly popular construct with promise in enhancing multiple positive health outcomes. Physical activity is an important behavior for enhancing overall health, but no Spanish language scale exists to test how mindfulness during physical activity may facilitate physical activity motivation or behavior. This study examined the validity of a Spanish adaption of a new scale, the State Mindfulness Scale for Physical Activity, to assess mindfulness during a specific experience of physical activity. Spanish youths (N = 502) completed a cross-sectional survey of state mindfulness during physical activity and physical activity motivation regulations based on Self-Determination Theory. A high-order model fit the data well and supports the use of one general state mindfulness factor or the use of separate subscales of mindfulness of mental (e.g., thoughts, emotions) and body (physical movement, muscles) aspects of the experience. Internal consistency reliability was good for the general scale and both sub-scales. The pattern of correlations with motivation regulations provides further support for construct validity with significant and positive correlations with self-determined forms of motivation and significant and negative correlations with external regulation and amotivation. Initial validity evidence is promising for the use of the adapted measure.
Evidence-based hypnotherapy for depression.

PubMed

Alladin, Assen

2010-04-01

Cognitive hypnotherapy (CH) is a comprehensive evidence-based hypnotherapy for clinical depression. This article describes the major components of CH, which integrate hypnosis with cognitive-behavior therapy as the latter provides an effective host theory for the assimilation of empirically supported treatment techniques derived from various theoretical models of psychotherapy and psychopathology. CH meets criteria for an assimilative model of psychotherapy, which is considered to be an efficacious model of psychotherapy integration. The major components of CH for depression are described in sufficient detail to allow replication, verification, and validation of the techniques delineated. CH for depression provides a template that clinicians and investigators can utilize to study the additive effects of hypnosis in the management of other psychological or medical disorders. Evidence-based hypnotherapy and research are encouraged; such a movement is necessary if clinical hypnosis is to integrate into mainstream psychotherapy.
Model testing for reliability and validity of the Outcome Expectations for Exercise Scale.

PubMed

Resnick, B; Zimmerman, S; Orwig, D; Furstenberg, A L; Magaziner, J

2001-01-01

Development of a reliable and valid measure of outcome expectations for exercise appropriate for older adults will help establish the relationship between outcome expectations and exercise. Once established, this measure can be used to facilitate the development of interventions to strengthen outcome expectations and improve adherence to regular exercise in older adults. Building on initial psychometrics of the Outcome Expectation for Exercise (OEE) Scale, the purpose of the current study was to use structural equation modeling to provide additional support for the reliability and validity of this measure. The OEE scale is a 9-item measure specifically focusing on the perceived consequences of exercise for older adults. The OEE scale was given to 191 residents in a continuing care retirement community. The mean age of the participants was 85 ± 6.1 and the majority were female (76%), White (99%), and unmarried (76%). Using structural equation modeling, reliability was based on R² values, and validity was based on a confirmatory factor analysis and path coefficients. There was continued evidence for reliability of the OEE based on R² values ranging from .42 to .77, and validity with path coefficients ranging from .69 to .87, and evidence of model fit (χ² of 69, df = 27, p < .05, NFI = .98, RMSEA = .07). The evidence of reliability and validity of this measure has important implications for clinical work and research. The OEE scale can be used to identify older adults who have low outcome expectations for exercise, and interventions can then be implemented to strengthen these expectations and thereby improve exercise behavior.
[Evidence and Evidence Gaps - an Introduction].

PubMed

Dreier, G; Löhler, J

2016-04-01

Treating patients requires the inclusion of existing evidence in any health care decision, to be able to choose the best diagnosis or treatment measure or to make valid prognosis statements for a particular patient in consideration of the physician's own expertise. The basis is clinical trials, the results of which are ideally gathered in systematic reviews, rated, summarized and published. In addition to the GCP (Good Clinical Practice)-compliant planning, conducting and analysis of clinical studies, it is essential that all study results are made publicly available, in order to avoid publication bias. This includes the public registration of planned and discontinued trials. In the last 25 years, evidence-based medicine (EbM) has increasingly found its way into clinical practice and research. Here EbM is closely associated with the names Archibald Cochrane and David Sackett. In Germany, both the German Cochrane Centre (DCZ) and the network of evidence-based medicine (DNEbM) were established approximately 15 years ago. In the international Cochrane Collaboration, clinicians and other scientists, such as statisticians, work side by side across disciplines to develop the methods of evidence-based medicine and to address the topics of evidence generation and processing as well as the transfer of knowledge. Challenge: Existing evidence primarily serves doctors to support their decision-making, but is also the basis for providing scientific proof for a health care intervention's benefit to patients and ultimately payers/health insurances. The closure of existing evidence gaps requires substantial human and financial resources, a complex organizational structure and can only succeed with the involvement of clinical and methodological expertise and specific knowledge in the field of clinical research. In addition, the knowledge must be transferred into practice, using journals, guidelines, conferences, databases, information portals with processed evidence and not least the
Pediatric bipolar disorder: validity, phenomenology, and recommendations for diagnosis

PubMed Central

Youngstrom, Eric A; Birmaher, Boris; Findling, Robert L

2013-01-01

Objective: To find, review, and critically evaluate evidence pertaining to the phenomenology of pediatric bipolar disorder and its validity as a diagnosis. Methods: The present qualitative review summarizes and synthesizes available evidence about the phenomenology of bipolar disorder (BD) in youths, including description of the diagnostic sensitivity and specificity of symptoms, clarification about rates of cycling and mixed states, and discussion about chronic versus episodic presentations of mood dysregulation. The validity of the diagnosis of BD in youths is also evaluated based on traditional criteria including associated demographic characteristics, family environmental features, genetic bases, longitudinal studies of youths at risk of developing BD as well as youths already manifesting symptoms on the bipolar spectrum, treatment studies and pharmacologic dissection, neurobiological findings (including morphological and functional data), and other related laboratory findings. Additional sections review impairment and quality of life, personality and temperamental correlates, the clinical utility of a bipolar diagnosis in youths, and the dimensional versus categorical distinction as it applies to mood disorder in youths. Results: A schema for diagnosis of BD in youths is developed, including a review of different operational definitions of 'bipolar not otherwise specified.' Principal areas of disagreement appear to include the relative role of elated versus irritable mood in assessment, and also the limits of the extent of the bipolar spectrum – when do definitions become so broad that they are no longer describing 'bipolar' cases? Conclusions: In spite of these areas of disagreement, considerable evidence has amassed supporting the validity of the bipolar diagnosis in children and adolescents. PMID:18199237
Anger Rumination Scale: Validation in Mexico.

PubMed

Ortega Andrade, Norma; Alcázar-Olán, Raúl; Matías, Oscar Mariano; Rivera Guerrero, Ana; Domínguez Espinosa, Alejandra

2017-01-18

The aim of the study was to assess the validity of the Anger Rumination Scale (ARS; Sukhodolsky, Golub, & Cromwell, 2001) in a Mexican sample (n = 700, M age = 38.6, SD = 12.42). Through confirmatory factor analysis and using modification indices, the four-factor structure of the original scale was replicated: angry afterthoughts, thoughts of revenge, angry memories, and understanding of causes. In addition, the four-factor model had better goodness of fit indices than rival models with three and two factors. Alpha reliabilities were acceptable (.72-.89). ARS results correlated with measures of state anger, trait anger, anger expression, and anger control (negatively); correlations were significant (ps < .001). ARS outcomes also correlated (ps < .001) with physical and verbal aggression, hostility, anger, and emotion suppression, suggesting convergent validity. Men reported more thoughts of revenge than women (p < .001; Eta squared = .026), but there was no evidence of gender differences on the other anger rumination scales, or in total scores.
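The alpha reliabilities reported in this abstract are Cronbach's alpha coefficients. A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is given below. The function name and the response matrix are hypothetical and are not drawn from the study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Population variances are used for both the items and the totals; the
    choice of population vs. sample variance cancels out in the ratio.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer all k items")
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two items (illustration only).
responses = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(responses)
```

By common rules of thumb, values around .70 or higher are usually read as acceptable internal consistency, which is in line with the range the abstract describes.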
Assessing Students' Understanding of Macroevolution: Concerns regarding the validity of the MUM

NASA Astrophysics Data System (ADS)

Novick, Laura R.; Catley, Kefyn M.

2012-11-01

In a recent article, Nadelson and Southerland (2010. Development and preliminary evaluation of the Measure of Understanding of Macroevolution: Introducing the MUM. The Journal of Experimental Education, 78, 151-190) reported on their development of a multiple-choice concept inventory intended to assess college students' understanding of macroevolutionary concepts, the Measure of Understanding Macroevolution (MUM). Given that the only existing evolution inventories assess understanding of natural selection, a microevolutionary concept, a valid assessment of students' understanding of macroevolution would be a welcome and necessary addition to the field of science education. Although the conceptual framework underlying Nadelson and Southerland's test is promising, we believe the test has serious shortcomings with respect to validity evidence for the construct being tested. We argue and provide evidence that these problems are serious enough that the MUM should not be used in its current form to measure students' understanding of macroevolution.
Validity evidence and reliability of a simulated patient feedback instrument

PubMed Central

2012-01-01

Background In the training of healthcare professionals, one of the advantages of communication training with simulated patients (SPs) is the SP's ability to provide direct feedback to students after a simulated clinical encounter. The quality of SP feedback must be monitored, especially because it is well known that feedback can have a profound effect on student performance. Due to the current lack of valid and reliable instruments to assess the quality of SP feedback, our study examined the validity and reliability of one potential instrument, the 'modified Quality of Simulated Patient Feedback Form' (mQSF). Methods Content validity of the mQSF was assessed by inviting experts in the area of simulated clinical encounters to rate the importance of the mQSF items. Moreover, generalizability theory was used to examine the reliability of the mQSF. Our data came from videotapes of clinical encounters between six simulated patients and six students and the ensuing feedback from the SPs to the students. Ten faculty members judged the SP feedback according to the items on the mQSF. Three weeks later, this procedure was repeated with the same faculty members and recordings. Results All but two items of the mQSF received importance ratings of > 2.5 on a four-point rating scale. A generalizability coefficient of 0.77 was established with two judges observing one encounter. Conclusions The findings for content validity and reliability with two judges suggest that the mQSF is a valid and reliable instrument to assess the quality of feedback provided by simulated patients. PMID:22284898
Validity evidence and reliability of a simulated patient feedback instrument.

PubMed

Schlegel, Claudia; Woermann, Ulrich; Rethans, Jan-Joost; van der Vleuten, Cees

2012-01-27

In the training of healthcare professionals, one of the advantages of communication training with simulated patients (SPs) is the SP's ability to provide direct feedback to students after a simulated clinical encounter. The quality of SP feedback must be monitored, especially because it is well known that feedback can have a profound effect on student performance. Due to the current lack of valid and reliable instruments to assess the quality of SP feedback, our study examined the validity and reliability of one potential instrument, the 'modified Quality of Simulated Patient Feedback Form' (mQSF). Content validity of the mQSF was assessed by inviting experts in the area of simulated clinical encounters to rate the importance of the mQSF items. Moreover, generalizability theory was used to examine the reliability of the mQSF. Our data came from videotapes of clinical encounters between six simulated patients and six students and the ensuing feedback from the SPs to the students. Ten faculty members judged the SP feedback according to the items on the mQSF. Three weeks later, this procedure was repeated with the same faculty members and recordings. All but two items of the mQSF received importance ratings of > 2.5 on a four-point rating scale. A generalizability coefficient of 0.77 was established with two judges observing one encounter. The findings for content validity and reliability with two judges suggest that the mQSF is a valid and reliable instrument to assess the quality of feedback provided by simulated patients.
Financial decision-making abilities and financial exploitation in older African Americans: Preliminary validity evidence for the Lichtenberg Financial Decision Rating Scale (LFDRS).

PubMed

Lichtenberg, Peter A; Ficker, Lisa J; Rahman-Filipiak, Annalise

2016-01-01

This study examines preliminary evidence for the Lichtenberg Financial Decision Rating Scale (LFDRS), a new person-centered approach to assessing capacity to make financial decisions, and its relationship to self-reported cases of financial exploitation in 69 older African Americans. More than one third of individuals reporting financial exploitation also had questionable decisional abilities. Overall, decisional ability score and current decision total were significantly associated with cognitive screening test and financial ability scores, demonstrating good criterion validity. Study findings suggest that impaired decisional abilities may render older adults more vulnerable to financial exploitation, and that the LFDRS is a valid tool.

Reliability and validity: Part II.

PubMed

Davis, Debora Winders

2004-01-01

Determining measurement reliability and validity involves complex processes. There is usually room for argument about most instruments. It is important that the researcher clearly describes the processes upon which she made the decision to use a particular instrument, and presents the evidence available showing that the instrument is reliable and valid for the current purposes. In some cases, the researcher may need to conduct pilot studies to obtain evidence upon which to decide whether the instrument is valid for a new population or a different setting. In all cases, the researcher must present a clear and complete explanation for the choices she has made regarding reliability and validity. The consumer must then judge the degree to which the researcher has provided adequate and theoretically sound rationale. Although I have tried to touch on most of the important concepts related to measurement reliability and validity, it is beyond the scope of this column to be exhaustive. There are textbooks devoted entirely to specific measurement issues if readers require more in-depth knowledge.
Pressure ulcer prevention algorithm content validation: a mixed-methods, quantitative study.

PubMed

van Rijswijk, Lia; Beitz, Janice M

2015-04-01

Translating pressure ulcer prevention (PUP) evidence-based recommendations into practice remains challenging for a variety of reasons, including the perceived quality, validity, and usability of the research or the guideline itself. Following the development and face validation testing of an evidence-based PUP algorithm, additional stakeholder input and testing were needed. Using convenience sampling methods, wound care experts attending a national wound care conference and a regional wound ostomy continence nursing (WOCN) conference and/or graduates of a WOCN program were invited to participate in an Institutional Review Board-approved, mixed-methods quantitative survey with qualitative components to examine algorithm content validity. After participants provided written informed consent, demographic variables were collected and participants were asked to comment on and rate the relevance and appropriateness of each of the 26 algorithm decision points/steps using standard content validation study procedures. All responses were anonymous. Descriptive summary statistics, mean relevance/appropriateness scores, and the content validity index (CVI) were calculated. Qualitative comments were transcribed and thematically analyzed. Of the 553 wound care experts invited, 79 (average age 52.9 years, SD 10.1; range 23-73) consented to participate and completed the study (a response rate of 14%). Most (67, 85%) were female, registered (49, 62%) or advanced practice (12, 15%) nurses, and had > 10 years of health care experience (88, 92%). Other health disciplines included medical doctors, physical therapists, nurse practitioners, and certified nurse specialists. Almost all had received formal wound care education (75, 95%). On a Likert-type scale of 1 (not relevant/appropriate) to 4 (very relevant and appropriate), the average score for the entire algorithm/all decision points (N = 1,912) was 3.72 with an overall CVI of 0.94 (out of 1). The only decision point/step recommendation
English Placement Testing, Multiple Measures, and Disproportionate Impact: An Analysis of the Criterion- and Content-Related Validity Evidence for the Reading & Writing Placement Tests in the San Diego Community College District.

ERIC Educational Resources Information Center

Armstrong, William B.

As part of an effort to statistically validate the placement tests used in California's San Diego Community College District (SDCCD), a study was undertaken to review the criterion- and content-related validity of the Assessment and Placement Services (APS) reading and writing tests. Evidence of criterion and content validity was gathered from…
Development and testing of a DVT risk assessment tool: providing evidence of validity and reliability.

PubMed

McCaffrey, Ruth; Bishop, Mary; Adonis-Rizzo, Marie; Williamson, Ellen; McPherson, Melanie; Cruikshank, Alice; Carrier, Vicki Jo; Sands, Simone; Pigano, Diane; Girard, Patricia; Lauzon, Cathy

2007-01-01

Hospital-acquired deep vein thrombosis (DVT) and pulmonary embolisms (PE) are preventable problems that can increase mortality. Early assessment and recognition of risk as well as initiating appropriate prevention measures can prevent DVT or PE. The purpose of this research project was to develop a DVT risk assessment tool and test the tool for validity and reliability. Three phases were undertaken in developing and testing the JFK Medical Center DVT risk assessment tool. First, risk and predisposing factors for DVT were investigated and clarified using the literature, expert nursing knowledge, and medical staff input. Second, item development and weighting were undertaken. Third, parametric testing for content validity measured the differences in mean assessment tool scores between a group of patients who developed DVT in the hospital and a demographically similar group who did not develop DVT. Interrater reliability was measured by having three different nurses score each patient and compare the differences in scores among the three. The DVT group had significantly higher scores on the JFK DVT assessment scale than did those who did not experience DVT. Interrater reliability showed a strong correlation among the scores of the three nurses (.98). Providing a valid and reliable tool for measuring the risk for DVT or PE in hospitalized patients will enable nurses to intervene early in patients at risk. Basing DVT risk assessment on the evidence provided in this study will assist nurses in becoming more confident in recognizing the necessity for interventions in hospitalized patients and decreasing risk. Nurses can now evaluate patients at risk for DVT or PE using the JFK Medical Center's risk assessment tool.
Culture Training: Validation Evidence for the Culture Assimilator.

ERIC Educational Resources Information Center

Mitchell, Terence R.; And Others

The culture assimilator, a programed self-instructional approach to culture training, is described and a series of laboratory experiments and field studies validating the culture assimilator are reviewed. These studies show that the culture assimilator is an effective method of decreasing some of the stress experienced when one works with people…
Ethical leadership: meta-analytic evidence of criterion-related and incremental validity.

PubMed

Ng, Thomas W H; Feldman, Daniel C

2015-05-01

This study examines the criterion-related and incremental validity of ethical leadership (EL) with meta-analytic data. Across 101 samples published over the last 15 years (N = 29,620), we observed that EL demonstrated acceptable criterion-related validity with variables that tap followers' job attitudes, job performance, and evaluations of their leaders. Further, followers' trust in the leader mediated the relationships of EL with job attitudes and performance. In terms of incremental validity, we found that EL significantly, albeit weakly in some cases, predicted task performance, citizenship behavior, and counterproductive work behavior, even after controlling for the effects of such variables as transformational leadership, use of contingent rewards, management by exception, interactional fairness, and destructive leadership. The article concludes with a discussion of ways to strengthen the incremental validity of EL. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
Implicit attentional bias for facial emotion in dissociative seizures: Additional evidence.

PubMed

Pick, Susannah; Mellers, John D C; Goldstein, Laura H

2018-03-01

This study sought to extend knowledge about the previously reported preconscious attentional bias (AB) for facial emotion in patients with dissociative seizures (DS) by exploring whether the finding could be replicated, while controlling for concurrent anxiety, depression, and potentially relevant cognitive impairments. Patients diagnosed with DS (n=38) were compared with healthy controls (n=43) on a pictorial emotional Stroop test, in which backwardly masked emotional faces (angry, happy, neutral) were processed implicitly. The group with DS displayed a significantly greater AB to facial emotion relative to controls; however, the bias was not specific to negative or positive emotions. The group effect could not be explained by performance on standardized cognitive tests or self-reported depression/anxiety. The study provides additional evidence of a disproportionate and automatic allocation of attention to facial affect in patients with DS, including both positive and negative facial expressions. Such a tendency could act as a predisposing factor for developing DS initially, or may contribute to triggering individuals' seizures on an ongoing basis. Psychological interventions such as Cognitive Behavioral Therapy (CBT) or AB modification might be suitable approaches to target this bias in clinical practice. Copyright © 2018 Elsevier Inc. All rights reserved.
Exploring the validity and reliability of a questionnaire for evaluating veterinary clinical teachers' supervisory skills during clinical rotations.

PubMed

Boerboom, T B B; Dolmans, D H J M; Jaarsma, A D C; Muijtjens, A M M; Van Beukelen, P; Scherpbier, A J J A

2011-01-01

Feedback to aid teachers in improving their teaching requires validated evaluation instruments. When implementing an evaluation instrument in a different context, it is important to collect validity evidence from multiple sources. We examined the validity and reliability of the Maastricht Clinical Teaching Questionnaire (MCTQ) as an instrument to evaluate individual clinical teachers during short clinical rotations in veterinary education. We examined four sources of validity evidence: (1) Content was examined based on theory of effective learning. (2) Response process was explored in a pilot study. (3) Internal structure was assessed by confirmatory factor analysis using 1086 student evaluations and reliability was examined utilizing generalizability analysis. (4) Relations with other relevant variables were examined by comparing factor scores with other outcomes. Content validity was supported by theory underlying the cognitive apprenticeship model on which the instrument is based. The pilot study resulted in an additional question about supervision time. A five-factor model showed a good fit with the data. Acceptable reliability was achievable with 10-12 questionnaires per teacher. Correlations between the factors and overall teacher judgement were strong. The MCTQ appears to be a valid and reliable instrument to evaluate clinical teachers' performance during short rotations.
Agreeing on Validity Arguments

ERIC Educational Resources Information Center

Sireci, Stephen G.

2013-01-01

Kane (this issue) presents a comprehensive review of validity theory and reminds us that the focus of validation is on test score interpretations and use. In reacting to his article, I support the argument-based approach to validity and all of the major points regarding validation made by Dr. Kane. In addition, I call for a simpler, three-step…
Cluster analysis of novel isometric strength measures produces a valid and evidence-based classification structure for wheelchair track racing.

PubMed

Connick, Mark J; Beckman, Emma; Vanlandewijck, Yves; Malone, Laurie A; Blomqvist, Sven; Tweedy, Sean M

2017-11-25

The Para athletics wheelchair-racing classification system employs best practice to ensure that classes comprise athletes whose impairments cause a comparable degree of activity limitation. However, decision-making is largely subjective and scientific evidence which reduces this subjectivity is required. To evaluate whether isometric strength tests were valid for the purposes of classifying wheelchair racers and whether cluster analysis of the strength measures produced a valid classification structure. Thirty-two international level, male wheelchair racers from classes T51-54 completed six isometric strength tests evaluating elbow extensors, shoulder flexors, trunk flexors and forearm pronators and two wheelchair performance tests: Top-Speed (0-15 m) and Top-Speed (absolute). Strength tests significantly correlated with wheelchair performance were included in a cluster analysis and the validity of the resulting clusters was assessed. All six strength tests correlated with performance (r=0.54-0.88). Cluster analysis yielded four clusters with reasonable overall structure (mean silhouette coefficient=0.58) and large intercluster strength differences. Six athletes (19%) were allocated to clusters that did not align with their current class. While the mean wheelchair racing performance of the resulting clusters was unequivocally hierarchical, the mean performance of current classes was not, with no difference between current classes T53 and T54. Cluster analysis of isometric strength tests produced classes comprising athletes who experienced a similar degree of activity limitation. The strength tests reported can provide the basis for a new, more transparent, less subjective wheelchair racing classification system, pending replication of these findings in a larger, representative sample. This paper also provides guidance for development of evidence-based systems in other Para sports. © Article author(s) (or their employer(s) unless otherwise stated in the text of
Validity evidence for Surgical Improvement of Clinical Knowledge Ops: a novel gaming platform to assess surgical decision making.

PubMed

Lin, Dana T; Park, Julia; Liebert, Cara A; Lau, James N

2015-01-01

Current surgical education curricula focus mainly on the acquisition of technical skill rather than clinical and operative judgment. SICKO (Surgical Improvement of Clinical Knowledge Ops) is a novel gaming platform developed to address this critical need. A pilot study was performed to collect validity evidence for SICKO as an assessment for surgical decision making. Forty-nine subjects stratified into 4 levels of expertise were recruited to play SICKO. Later, players were surveyed regarding the realism of the gaming platform as well as the clinical competencies required of them while playing SICKO. Each group of increasing expertise outperformed the less experienced groups. Mean total game scores for the novice, junior resident, senior resident, and expert groups were 5,461, 8,519, 11,404, and 13,913, respectively (P = .001). Survey results revealed high scores for realism and content. SICKO holds the potential to be not only an engaging and immersive educational tool, but also a valid assessment in the armamentarium of surgical educators. Published by Elsevier Inc.
Validity evidence for procedural competency in virtual reality robotic simulation, establishing a credible pass/fail standard for the vaginal cuff closure procedure.

PubMed

Hovgaard, Lisette Hvid; Andersen, Steven Arild Wuyts; Konge, Lars; Dalsgaard, Torur; Larsen, Christian Rifbjerg

2018-03-30

The use of robotic surgery for minimally invasive procedures has increased considerably over the last decade. Robotic surgery has potential advantages compared to laparoscopic surgery but also requires new skills. Using virtual reality (VR) simulation to facilitate the acquisition of these new skills could potentially benefit training of robotic surgical skills and also be a crucial step in developing a robotic surgical training curriculum. The study's objective was to establish validity evidence for a simulation-based test for procedural competency for the vaginal cuff closure procedure that can be used in a future simulation-based, mastery learning training curriculum. Eleven novice gynaecological surgeons without prior robotic experience and 11 experienced gynaecological robotic surgeons (> 30 robotic procedures) were recruited. After familiarization with the VR simulator, participants completed the module 'Guided Vaginal Cuff Closure' six times. Validity evidence was investigated for 18 preselected simulator metrics. The internal consistency was assessed using Cronbach's alpha and a composite score was calculated based on metrics with significant discriminative ability between the two groups. Finally, a pass/fail standard was established using the contrasting groups' method. The experienced surgeons significantly outperformed the novice surgeons on 6 of the 18 metrics. The internal consistency was 0.58 (Cronbach's alpha). The experienced surgeons' mean composite score for all six repetitions was significantly better than the novice surgeons' (76.1 vs. 63.0, respectively, p < 0.001). A pass/fail standard of 75/100 was established. Four novice surgeons passed this standard (false positives) and three experienced surgeons failed (false negatives). Our study has gathered validity evidence for a simulation-based test for procedural robotic surgical competency in the vaginal cuff closure procedure and established a credible pass/fail standard for future
Reliability and Validity Evidence of Scores on the Achievement Goal Tendencies Questionnaire in a Sample of Spanish Students of Compulsory Secondary Education

ERIC Educational Resources Information Center

Ingles, Candido J.; Garcia-Fernandez, Jose M.; Castejon, Juan L.; Valle, Antonio; Delgado, Beatriz; Marzo, Juan C.

2009-01-01

This study examined the reliability and validity evidence drawn from the scores of the Spanish version of the Achievement Goal Tendencies Questionnaire (AGTQ) using a sample of 2,022 (51.1% boys) Spanish students from grades 7 to 10. Confirmatory factor analysis replicated the correlated three-factor structure of the AGTQ in this sample: Learning…
Contemporary Test Validity in Theory and Practice: A Primer for Discipline-Based Education Researchers

PubMed Central

Reeves, Todd D.; Marbach-Ad, Gili

2016-01-01

Most discipline-based education researchers (DBERs) were formally trained in the methods of scientific disciplines such as biology, chemistry, and physics, rather than social science disciplines such as psychology and education. As a result, DBERs may have never taken specific courses in the social science research methodology—either quantitative or qualitative—on which their scholarship often relies so heavily. One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process. The instructional piece explores the concepts of test validity, the validation process, validity evidence, and key threats to validity. The essay also includes an in-depth example of a validity argument and validation approach for a test of student argument analysis. In addition to DBERs, this essay should benefit practitioners (e.g., lab directors, faculty members) in the development, evaluation, and/or selection of instruments for their work assessing students or evaluating pedagogical innovations. PMID:26903498
Validating the Proposed Structure of the Relationships among Test Anxiety and Its Predictors Based on Control-Value Theory: Evidence for Gender-Specific Patterns

ERIC Educational Resources Information Center

Ringeisen, Tobias; Raufelder, Diana; Schnell, Kerstin; Rohrmann, Sonja

2016-01-01

Control-value theory (CVT) proposes a framework for the structure of the relationships between the various predictors of achievement-related emotions, particularly anxiety. Despite existing evidence for the role of anxiety predictors, research has not yet justified their proposed structure. Hence, the current study validated the structure of test…
Validity Evidence and Scoring Guidelines for Standardized Patient Encounters and Patient Notes From a Multisite Study of Clinical Performance Examinations in Seven Medical Schools.

PubMed

Park, Yoon Soo; Hyderi, Abbas; Heine, Nancy; May, Win; Nevins, Andrew; Lee, Ming; Bordage, Georges; Yudkowsky, Rachel

2017-11-01

To examine validity evidence of local graduation competency examination scores from seven medical schools using shared cases and to provide rater training protocols and guidelines for scoring patient notes (PNs). Between May and August 2016, clinical cases were developed, shared, and administered across seven medical schools (990 students participated). Raters were calibrated using training protocols, and guidelines were developed collaboratively across sites to standardize scoring. Data included scores from standardized patient encounters for history taking, physical examination, and PNs. Descriptive statistics were used to examine scores from the different assessment components. Generalizability studies (G-studies) using variance components were conducted to estimate reliability for composite scores. Validity evidence was collected for response process (rater perception), internal structure (variance components, reliability), relations to other variables (interassessment correlations), and consequences (composite score). Student performance varied by case and task. In the PNs, justification of differential diagnosis was the most discriminating task. G-studies showed that schools accounted for less than 1% of total variance; however, for the PNs, there were differences in scores for varying cases and tasks across schools, indicating a school effect. Composite score reliability was maximized when the PN was weighted between 30% and 40%. Raters preferred using case-specific scoring guidelines with clear point-scoring systems. This multisite study presents validity evidence for PN scores based on scoring rubric and case-specific scoring guidelines that offer rigor and feedback for learners. Variability in PN scores across participating sites may signal different approaches to teaching clinical reasoning among medical schools.
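The dependence of composite score reliability on the patient note weight can be sketched with Mosier's formula for the reliability of a weighted composite. Everything numeric below (component reliabilities, standard deviations, and the correlation between components) is a hypothetical placeholder rather than a value from the study, and the study itself used generalizability analysis, so this is an approximation of the idea rather than a reproduction of the method.

```python
def composite_reliability(weights, sds, reliabilities, corr):
    """Mosier's formula: 1 - (weighted error variance) / (composite variance)."""
    n = len(weights)
    # Composite variance; corr[i][i] is 1, so the diagonal gives w_i^2 * sd_i^2.
    composite_var = sum(
        weights[i] * weights[j] * sds[i] * sds[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    # Error variance contributed by each component.
    error_var = sum(
        (weights[i] * sds[i]) ** 2 * (1 - reliabilities[i]) for i in range(n)
    )
    return 1 - error_var / composite_var


# Hypothetical inputs: component 0 = standardized patient encounter, component 1 = patient note.
SDS = [1.0, 1.0]                   # standardized scores (assumption)
RELIABILITIES = [0.80, 0.60]       # assumed component reliabilities
CORR = [[1.0, 0.5], [0.5, 1.0]]    # assumed correlation between components


def reliability_for_pn_weight(pn_weight):
    """Composite reliability when the patient note gets pn_weight and the encounter the rest."""
    return composite_reliability([1 - pn_weight, pn_weight], SDS, RELIABILITIES, CORR)


# Sweep the patient note weight from 0% to 100% in steps of 10%.
best_tenths = max(range(11), key=lambda t: reliability_for_pn_weight(t / 10))
for t in range(11):
    print(f"PN weight {t * 10:3d}%: reliability {reliability_for_pn_weight(t / 10):.3f}")
print(f"highest reliability on this grid at PN weight {best_tenths * 10}%")
```

Where the maximum falls depends entirely on the assumed inputs, so the output says nothing about the study's own 30% to 40% finding.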
Evidence-Based Communication Practices for Children with Visual Impairments and Additional Disabilities: An Examination of Single-Subject Design Studies

ERIC Educational Resources Information Center

Parker, Amy T.; Grimmett, Eric S.; Summers, Sharon

2008-01-01

This review examines practices for building effective communication strategies for children with visual impairments, including those with additional disabilities, that have been tested by single-subject design methodology. The authors found 30 studies that met the search criteria and grouped intervention strategies to align any evidence of the…
Highlights from the 1998-2000 SHADOZ (Southern Hemisphere Additional Ozonesondes) Satellite Validation Project

NASA Technical Reports Server (NTRS)

Witte, J. C.; Thompson, A. M.; Fortuin, P.; Einsudi, Franco (Technical Monitor)

2001-01-01

There are three years of data (more than 1000 individual ozone profiles) available from a network of 10 southern hemisphere tropical and subtropical stations, designated the Southern Hemisphere ADditional OZonesondes (SHADOZ) project. Since late 1999, a tropical station in the northern hemisphere (Paramaribo, Surinam; lat/long) joined SHADOZ, providing coordinated weekly ozone and radiosonde data from the surface to approx. 7 hPa for satellite validation, process studies, and model evaluation. Profiles are also collected at: Ascension Island; Nairobi, Kenya; Irene, South Africa; Réunion Island; Watukosek, Java; Fiji; Tahiti; American Samoa; San Cristobal, Galapagos; Natal, Brazil. The archive, station characteristics and photos are available at http://code916.gsfc.nasa.gov/Data_services/shadoz. SHADOZ ozone time-series and profiles in 1998-2000 display highly variable tropospheric ozone, a zonal wave-one pattern in total (and tropospheric) column ozone, and signatures of the Quasi-Biennial Oscillation (QBO) in stratospheric ozone. Total, stratospheric and tropospheric column ozone amounts peak between August and November and are lowest between March and May. Integrated total ozone column amounts from the sondes are lower than independent measurements from a ground-based network and from the TOMS (Total Ozone Mapping Spectrometer) satellite (version 7 data).
Limitations in conduct and reporting of Cochrane reviews rarely inhibit the determination of the validity of evidence for clinical decision-making.

PubMed

Alper, Brian S; Fedorowicz, Zbys; van Zuuren, Esther J

2015-08-01

To determine how often clinical conclusions derived from Cochrane Reviews have uncertain validity due to review conduct and reporting deficiencies. We evaluated 5142 clinical conclusions in DynaMed (an evidence-based point-of-care clinical reference) based on 4743 Cochrane Reviews. Clinical conclusions with level 2 evidence due to shortcomings in the review's conduct or reporting (rather than deficiencies in the underlying evidence) were confirmed by a DynaMed editor and two Cochrane Review authors. Thirty-one Cochrane Reviews (0.65%) had confirmed deficiencies in conduct and reporting as the reason for classifying 37 assessed clinical conclusions (0.72%) as level 2 evidence. In all cases, it was not feasible for the assessors to specify a clear criticism of the studies included in the reviews. The deficiencies were specific to not accounting for dropouts (2) or inadequate assessment and reporting of allocation concealment (11), other specific trial quality criteria (14), or all trial quality criteria (4). Cochrane Reviews provide high-quality assessment and synthesis of evidence, with fewer than 1% of Cochrane Reviews having limitations which hinder the summary of best current evidence for clinical decision-making. We expect this will further decrease following recent Cochrane quality initiatives. © 2015 Chinese Cochrane Center, West China Hospital of Sichuan University and Wiley Publishing Asia Pty Ltd.
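The proportions reported in this record can be checked directly from the counts it gives. The short sketch below only recomputes figures stated in the abstract; the variable names are illustrative.

```python
# Counts as stated in the abstract.
reviews_total = 4743
conclusions_total = 5142
deficient_reviews = 31
deficient_conclusions = 37

# Breakdown of the deficient reviews by type of deficiency, as listed in the abstract.
deficiency_breakdown = {
    "not accounting for dropouts": 2,
    "allocation concealment": 11,
    "other specific trial quality criteria": 14,
    "all trial quality criteria": 4,
}

pct_reviews = round(100 * deficient_reviews / reviews_total, 2)
pct_conclusions = round(100 * deficient_conclusions / conclusions_total, 2)
breakdown_total = sum(deficiency_breakdown.values())

print(f"{pct_reviews}% of reviews, {pct_conclusions}% of conclusions")
print(f"breakdown sums to {breakdown_total} reviews")
```

Both percentages fall below 1%, and the four deficiency categories account for all 31 reviews.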
A contemporary approach to validity arguments: a practical guide to Kane's framework.

PubMed

Cook, David A; Brydges, Ryan; Ginsburg, Shiphra; Hatala, Rose

2015-06-01

Assessment is central to medical education and the validation of assessments is vital to their use. Earlier validity frameworks suffer from a multiplicity of types of validity or failure to prioritise among sources of validity evidence. Kane's framework addresses both concerns by emphasising key inferences as the assessment progresses from a single observation to a final decision. Evidence evaluating these inferences is planned and presented as a validity argument. We aim to offer a practical introduction to the key concepts of Kane's framework that educators will find accessible and applicable to a wide range of assessment tools and activities. All assessments are ultimately intended to facilitate a defensible decision about the person being assessed. Validation is the process of collecting and interpreting evidence to support that decision. Rigorous validation involves articulating the claims and assumptions associated with the proposed decision (the interpretation/use argument), empirically testing these assumptions, and organising evidence into a coherent validity argument. Kane identifies four inferences in the validity argument: Scoring (translating an observation into one or more scores); Generalisation (using the score[s] as a reflection of performance in a test setting); Extrapolation (using the score[s] as a reflection of real-world performance), and Implications (applying the score[s] to inform a decision or action). Evidence should be collected to support each of these inferences and should focus on the most questionable assumptions in the chain of inference. Key assumptions (and needed evidence) vary depending on the assessment's intended use or associated decision. Kane's framework applies to quantitative and qualitative assessments, and to individual tests and programmes of assessment. Validation focuses on evaluating the key claims, assumptions and inferences that link assessment scores with their intended interpretations and uses. The Implications

Identifying and Evaluating External Validity Evidence for Passing Scores

ERIC Educational Resources Information Center

Davis-Becker, Susan L.; Buckendahl, Chad W.

2013-01-01

A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1)…
Evidence-Based Review of Subjective Pediatric Sleep Measures

PubMed Central

Toliver-Sokol, Marisol; Palermo, Tonya M.

2011-01-01

Objective: This manuscript provides an evidence-based psychometric review of parent and child-report pediatric sleep measures using criteria developed by the American Psychological Association (APA) Division 54 Evidence-Based Assessment (EBA) Task Force. Methods: Twenty-one measures were reviewed: four measures of daytime sleepiness, four measures of sleep habits/hygiene, two measures assessing sleep-related attitudes/cognitions, five measures of sleep initiation/maintenance, and six multidimensional sleep measures. Results: Six of the 21 measures met “well-established” evidence-based assessment criteria. An additional eight measures were rated as “approaching well-established” and seven were rated as “promising.” Conclusions: Overall, the multidimensional sleep measures received the highest ratings. Strengths and weaknesses of the measures are described. Recommendations for future pediatric sleep assessment are presented including further validation of measures, use of multiple informants, and stability of sleep measures over time. PMID:21227912
Validation of the Child Sport Cohesion Questionnaire

ERIC Educational Resources Information Center

Martin, Luc J.; Carron, Albert V.; Eys, Mark A.; Loughead, Todd

2013-01-01

The purpose of the present study was to test the validity evidence of the Child Sport Cohesion Questionnaire (CSCQ). To accomplish this task, convergent, discriminant, and known-group difference validity were examined, along with factorial validity via confirmatory factor analysis (CFA). Child athletes (N = 290, M[subscript age] = 10.73 plus or…
Validity for What? The Peril of Overclarifying

ERIC Educational Resources Information Center

Murphy, Kevin R.

2012-01-01

As Paul Newton so ably demonstrates, the concept of validity is both important and problematic. Over the last several decades, a consensus definition of validity has emerged; the current edition of "Standards for Educational and Psychological Testing" notes, "Validity refers to the degree to which evidence and theory support the interpretations of…
Therapist self-report of evidence-based practices in usual care for adolescent behavior problems: factor and construct validity.

PubMed

Hogue, Aaron; Dauber, Sarah; Henderson, Craig E

2014-01-01

This study introduces a therapist-report measure of evidence-based practices for adolescent conduct and substance use problems. The Inventory of Therapy Techniques-Adolescent Behavior Problems (ITT-ABP) is a post-session measure of 27 techniques representing four approaches: cognitive-behavioral therapy (CBT), family therapy (FT), motivational interviewing (MI), and drug counseling (DC). A total of 822 protocols were collected from 32 therapists treating 71 adolescents in six usual care sites. Factor analyses identified three clinically coherent scales with strong internal consistency across the full sample: FT (8 items; α = .79), MI/CBT (8 items; α = .87), and DC (9 items, α = .90). The scales discriminated between therapists working in a family-oriented site versus other sites and showed moderate convergent validity with therapist reports of allegiance and skill in each approach. The ITT-ABP holds promise as a cost-efficient quality assurance tool for supporting high-fidelity delivery of evidence-based practices in usual care.
On-the-Job Evidence-Based Medicine Training for Clinician-Scientists of the Next Generation.

PubMed

Leung, Elaine YL; Malick, Sadia M; Khan, Khalid S

2013-08-01

Clinical scientists are at the unique interface between laboratory science and frontline clinical practice for supporting clinical partnerships for evidence-based practice. In an era of molecular diagnostics and personalised medicine, evidence-based laboratory practice (EBLP) is also crucial in aiding clinical scientists to keep up-to-date with this expanding knowledge base. However, there are recognised barriers to the implementation of EBLP and its training. The aim of this review is to provide a practical summary of potential strategies for training clinician-scientists of the next generation. Current evidence suggests that clinically integrated evidence-based medicine (EBM) training is effective. Tailored e-learning EBM packages and evidence-based journal clubs have been shown to improve knowledge and skills of EBM. Moreover, e-learning is no longer restricted to computer-assisted learning packages. For example, social media platforms such as Twitter have been used to complement existing journal clubs and provide additional post-publication appraisal information for journals. In addition, the delivery of an EBLP curriculum has influence on its success. Although e-learning of EBM skills is effective, having EBM trained teachers available locally promotes the implementation of EBM training. Training courses, such as Training the Trainers, are now available to help trainers identify and make use of EBM training opportunities in clinical practice. On the other hand, peer-assisted learning and trainee-led support networks can strengthen self-directed learning of EBM and research participation among clinical scientists in training. Finally, we emphasise the need to evaluate any EBLP training programme using validated assessment tools to help identify the most crucial ingredients of effective EBLP training. In summary, we recommend on-the-job training of EBM with additional focus on overcoming barriers to its implementation. In addition, future studies evaluating the
Developing Validity Evidence for the Written Pediatric History and Physical Exam Evaluation Rubric.

PubMed

King, Marta A; Phillipi, Carrie A; Buchanan, Paula M; Lewin, Linda O

The written history and physical examination (H&P) is an underutilized source of medical trainee assessment. The authors describe development and validity evidence for the Pediatric History and Physical Exam Evaluation (P-HAPEE) rubric: a novel tool for evaluating written H&Ps. Using an iterative process, the authors drafted, revised, and implemented the 10-item rubric at 3 academic institutions in 2014. Eighteen attending physicians and 5 senior residents each scored 10 third-year medical student H&Ps. Inter-rater reliability (IRR) was determined using intraclass correlation coefficients. Cronbach α was used to report consistency and Spearman rank-order correlations to determine relationships between rubric items. Raters provided a global assessment, recorded time to review and score each H&P, and completed a rubric utility survey. Overall intraclass correlation was 0.85, indicating adequate IRR. Global assessment IRR was 0.89. IRR for low- and high-quality H&Ps was significantly greater than for medium-quality ones but did not differ on the basis of rater category (attending physician vs. senior resident), note format (electronic health record vs. nonelectronic), or student diagnostic accuracy. Cronbach α was 0.93. The highest correlation between an individual item and total score was for assessment (0.84); the highest interitem correlation was between assessment and differential diagnosis (0.78). Mean time to review and score an H&P was 16.3 minutes; residents took significantly longer than attending physicians. All raters described rubric utility as "good" or "very good" and endorsed continued use. The P-HAPEE rubric offers a novel, practical, reliable, and valid method for supervising physicians to assess pediatric written H&Ps. Copyright © 2016 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
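For readers unfamiliar with the consistency statistic reported here, Cronbach's α can be computed from a table of item scores as k/(k - 1) multiplied by (1 minus the sum of the item variances divided by the variance of the total scores), where k is the number of items. The sketch below applies that standard formula to a small made-up table; the ratings are hypothetical and have nothing to do with the P-HAPEE data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for rows of item scores (one row per scored note)."""
    k = len(ratings[0])  # number of rubric items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across the scored notes.
    item_variances = [variance(column) for column in zip(*ratings)]
    # Sample variance of each note's total score.
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: 5 notes rated on 3 items (illustrative only).
demo_ratings = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 2],
    [5, 4, 5],
]

alpha_demo = cronbach_alpha(demo_ratings)
print(f"Cronbach's alpha = {alpha_demo:.3f}")
```

Alpha rises as items covary more strongly relative to their individual variances, which is why a rubric whose items all track overall note quality tends to produce a high value.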
Hardiness scales in Iranian managers: evidence of incremental validity in relationships with the five factor model and with organizational and psychological adjustment.

PubMed

Ghorbani, Nima; Watson, P J

2005-06-01

This study examined the incremental validity of Hardiness scales in a sample of Iranian managers. Along with measures of the Five Factor Model and of Organizational and Psychological Adjustment, Hardiness scales were administered to 159 male managers (M age = 39.9, SD = 7.5) who had worked in their organizations for 7.9 yr. (SD = 5.4). Hardiness predicted greater Job Satisfaction, higher Organization-based Self-esteem, and perceptions of the work environment as being less stressful and constraining. Hardiness also correlated positively with Assertiveness, Emotional Stability, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness and negatively with Depression, Anxiety, Perceived Stress, Chance External Control, and Powerful Others External Control. Evidence of incremental validity was obtained when the Hardiness scales supplemented the Five Factor Model in predicting organizational and psychological adjustment. These data documented the incremental validity of the Hardiness scales in a non-Western sample and thus confirmed once again that Hardiness has a relevance that extends beyond the culture in which it was developed.
Do College Student Surveys Have Any Validity?

ERIC Educational Resources Information Center

Porter, Stephen R.

2011-01-01

Using standards established for validation research, I review the theory and evidence underlying the validity argument of the National Survey of Student Engagement (NSSE). I use the NSSE because it is the preeminent survey of college students, arguing that if it lacks validity, then so do almost all other college student surveys. I find that it…
Validation and Trustworthiness of Multiscale Models of Cardiac Electrophysiology

PubMed Central

Pathmanathan, Pras; Gray, Richard A.

2018-01-01

Computational models of cardiac electrophysiology have a long history in basic science applications and device design and evaluation, but have significant potential for clinical applications in all areas of cardiovascular medicine, including functional imaging and mapping, drug safety evaluation, disease diagnosis, patient selection, and therapy optimisation or personalisation. For all stakeholders to be confident in model-based clinical decisions, cardiac electrophysiological (CEP) models must be demonstrated to be trustworthy and reliable. Credibility, that is, the belief in the predictive capability, of a computational model is primarily established by performing validation, in which model predictions are compared to experimental or clinical data. However, there are numerous challenges to performing validation for highly complex multi-scale physiological models such as CEP models. As a result, credibility of CEP model predictions is usually founded upon a wide range of distinct factors, including various types of validation results, underlying theory, evidence supporting model assumptions, evidence from model calibration, all at a variety of scales from ion channel to cell to organ. Consequently, it is often unclear, or a matter for debate, the extent to which a CEP model can be trusted for a given application. The aim of this article is to clarify potential rationale for the trustworthiness of CEP models by reviewing evidence that has been (or could be) presented to support their credibility. We specifically address the complexity and multi-scale nature of CEP models which makes traditional model evaluation difficult. In addition, we make explicit some of the credibility justification that we believe is implicitly embedded in the CEP modeling literature. Overall, we provide a fresh perspective to CEP model credibility, and build a depiction and categorisation of the wide-ranging body of credibility evidence for CEP models. This paper also represents a step
Construct Validation in Counseling Psychology Research

ERIC Educational Resources Information Center

Hoyt, William T.; Warbasse, Rosalia E.; Chu, Erica Y.

2006-01-01

Counseling psychology researchers devote little attention to theory-based measurement validation, as evidenced by cursory mention of validity issues in the method and discussion sections of published research reports. In particular, many researchers appear unaware of the limitations of correlations between pairs of self-report measures as evidence of…
Health system context and implementation of evidence-based practices-development and validation of the Context Assessment for Community Health (COACH) tool for low- and middle-income settings.

PubMed

Bergström, Anna; Skeen, Sarah; Duc, Duong M; Blandon, Elmer Zelaya; Estabrooks, Carole; Gustavsson, Petter; Hoa, Dinh Thi Phuong; Källestål, Carina; Målqvist, Mats; Nga, Nguyen Thu; Persson, Lars-Åke; Pervin, Jesmin; Peterson, Stefan; Rahman, Anisur; Selling, Katarina; Squires, Janet E; Tomlinson, Mark; Waiswa, Peter; Wallin, Lars

2015-08-15

The gap between what is known and what is practiced results in health service users not benefitting from advances in healthcare, and in unnecessary costs. A supportive context is considered a key element for successful implementation of evidence-based practices (EBP). There were no tools available for the systematic mapping of aspects of organizational context influencing the implementation of EBPs in low- and middle-income countries (LMICs). Thus, this project aimed to develop and psychometrically validate a tool for this purpose. The development of the Context Assessment for Community Health (COACH) tool was premised on the context dimension in the Promoting Action on Research Implementation in Health Services framework, and is a derivative product of the Alberta Context Tool. Its development was undertaken in Bangladesh, Vietnam, Uganda, South Africa and Nicaragua in six phases: (1) defining dimensions and draft tool development, (2) content validity amongst in-country expert panels, (3) content validity amongst international experts, (4) response process validity, (5) translation and (6) evaluation of psychometric properties amongst 690 health workers in the five countries. The tool was validated for use amongst physicians, nurse/midwives and community health workers. The six phases of development resulted in a good fit between the theoretical dimensions of the COACH tool and its psychometric properties. The tool has 49 items measuring eight aspects of context: Resources, Community engagement, Commitment to work, Informal payment, Leadership, Work culture, Monitoring services for action and Sources of knowledge. Aspects of organizational context that were identified as influencing the implementation of EBPs in high-income settings were also found to be relevant in LMICs. However, there were additional aspects of context of relevance in LMICs specifically Resources, Community engagement, Commitment to work and Informal payment. Use of the COACH tool will allow
Evidence of Construct Validity in the Assessment of Hebephilia.

PubMed

Stephens, Skye; Seto, Michael C; Goodwill, Alasdair M; Cantor, James M

2017-01-01

Hebephilia refers to a persistent intense sexual interest in pubescent children. Although not as widely studied as pedophilia, studies of hebephilia have indicated convergence in self-report and sexual arousal. The present study expanded on previous work by examining convergent and divergent validity across indicators of hebephilia that included self-report, sexual behavior, and sexual arousal in a sample of 2238 men who had sexually offended. We included men who denied such interest and specifically examined the overlap between hebephilia and pedophilia and examined pedohebephilia (i.e., sexual interests in both prepubescent and pubescent children). Results indicated that there was considerable convergence across indicators of hebephilia. The results suggested poor divergent validity between hebephilia and pedophilia, as there was substantial overlap between the two constructs across analyses. Finally, a distinct pattern of sexual arousal was found in offenders with pedohebephilia. The results of the present study were discussed with a focus on implications for the assessment of sexual interest in children and the conceptualization of pedohebephilia.
A systematic review of validated sinus surgery simulators.

PubMed

Stew, B; Kao, S S-T; Dharmawardana, N; Ooi, E H

2018-06-01

Simulation provides a safe and effective opportunity to develop surgical skills. A variety of endoscopic sinus surgery (ESS) simulators has been described in the literature. Validation of these simulators allows for effective utilisation in training. To conduct a systematic review of the published literature to analyse the evidence for validated ESS simulation. PubMed, Embase, Cochrane and CINAHL were searched from inception of the databases to 11 January 2017. Twelve thousand five hundred and sixteen articles were retrieved of which 10 112 were screened following the removal of duplicates. Thirty-eight full-text articles were reviewed after meeting search criteria. Evidence of face, content, construct, discriminant and predictive validity was extracted. Twenty articles were included in the analysis describing 12 ESS simulators. Eleven of these simulators had undergone validation: 3 virtual reality, 7 physical bench models and 1 cadaveric simulator. Seven of the simulators were shown to have face validity, 7 had construct validity and 1 had predictive validity. None of the simulators demonstrated discriminant validity. This systematic review demonstrates that a number of ESS simulators have been comprehensively validated. Many of the validation processes, however, lack standardisation in outcome reporting, thus limiting a meta-analysis comparison between simulators. © 2017 John Wiley & Sons Ltd.
Validation of Evidence-Based Fall Prevention Programs for Adults with Intellectual and/or Developmental Disorders: A Modified Otago Exercise Program.

PubMed

Renfro, Mindy; Bainbridge, Donna B; Smith, Matthew Lee

2016-01-01

Evidence-based fall prevention (EBFP) programs significantly decrease fall risk, falls, and fall-related injuries in community-dwelling older adults. To date, EBFP programs are only validated for use among people with normal cognition and, therefore, are not evidence-based for adults with intellectual and/or developmental disorders (IDD) such as Alzheimer's disease and related dementias, cerebral vascular accident, or traumatic brain injury. Adults with IDD experience not only a higher rate of falls than their community-dwelling, cognitively intact peers but also higher rates and earlier onset of chronic diseases, also known to increase fall risk. Adults with IDD experience many barriers to health care and health promotion programs. As the lifespan for people with IDD continues to increase, issues of aging (including falls with associated injury) are on the rise and require effective and efficient prevention. A modified group-based version of the Otago Exercise Program (OEP) was developed and implemented at a worksite employing adults with IDD in Montana. Participants were tested pre- and post-intervention using the Centers for Disease Control and Prevention's (CDC) Stopping Elderly Accidents, Deaths, and Injuries (STEADI) tool kit. Participants took part in progressive once-weekly, 1-h group exercise classes and home programs over a 7-week period. Discharge planning with consumers and caregivers included home exercise, walking, and an optional home assessment. Despite the limited number of participants (n = 15) and short length of participation, improvements were observed in the 30-s Chair Stand Test, 4-Stage Balance Test, and 2-Minute Walk Test. Additionally, three individuals experienced an improvement in ambulation independence. Participants reported no falls during the study period. Promising results of this preliminary project underline the need for further study of this modified OEP among adults with IDD. Future multicenter study should include more
Evaluation of advanced laparoscopic skills tasks for validity evidence.

PubMed

Nepomnayshy, Dmitry; Whitledge, James; Birkett, Richard; Delmonico, Theodore; Ruthazer, Robin; Sillin, Lelan; Seymour, Neal E

2015-02-01

Since fundamentals of laparoscopic surgery (FLS) represents a minimum proficiency standard for laparoscopic surgery, more advanced proficiency standards are required to address the needs of current surgical training. We wanted to evaluate the acceptance and discriminative ability of a novel set of skills building on the FLS model that could represent a more advanced proficiency standard: advanced laparoscopic surgery (ALS). Qualitative and quantitative analyses were employed. Quantitative analysis involved comparison of expert (PGY 5+), intermediate (PGY 3-4) and novice (PGY 1-2) surgeons on FLS and ALS tasks. Composite scores included time and errors. Standard FLS errors were added to task time to create the composite score. Qualitative analysis involved thematic review of open-ended questions provided to experts participating in the study. Out of 48 participants, there were 15 (31%) attendings, 3 (6%) fellows and 30 (63%) residents. By specialty, 54% were general/MIS/bariatric/colorectal (GMBC) and 46% were other (urology and gynecology). There was no difference between experience level and performance on FLS and ALS tasks for the entire cohort. However, looking at the GMBC subgroup, experts performed better than novices (p = 0.012) and intermediates performed better than novices (p = 0.057) on ALS tasks. There was no difference for the same group in FLS performance. Also, the GMBC subgroup performed significantly better on FLS (p = 0.0035) and ALS (p = 0.0027) than the other subgroup. Thematic analysis revealed that the majority of experts felt that ALS was more realistic, challenging and clinically relevant for specific situations compared to FLS. For GMBC surgeons, we were able to show evidence of validity for a series of advanced laparoscopic tasks and their relationship to surgeon skill level. This study may represent the first step in the development of an advanced laparoscopic skills curriculum. Given the high degree of specialization in surgery, different
Content Validation and Evaluation of an Endovascular Teamwork Assessment Tool.

PubMed

Hull, L; Bicknell, C; Patel, K; Vyas, R; Van Herzeele, I; Sevdalis, N; Rudarakanchana, N

2016-07-01

To modify, content validate, and evaluate a teamwork assessment tool for use in endovascular surgery. A multistage, multimethod study was conducted. Stage 1 included expert review and modification of the existing Observational Teamwork Assessment for Surgery (OTAS) tool. Stage 2 included identification of additional exemplar behaviours contributing to effective teamwork and enhanced patient safety in endovascular surgery (using real-time observation, focus groups, and semistructured interviews of multidisciplinary teams). Stage 3 included content validation of exemplar behaviours using expert consensus according to established psychometric recommendations and evaluation of structure, content, feasibility, and usability of the Endovascular Observational Teamwork Assessment Tool (Endo-OTAS) by an expert multidisciplinary panel. Stage 4 included final team expert review of exemplars. OTAS core team behaviours were maintained (communication, coordination, cooperation, leadership, team monitoring). Of the 114 OTAS behavioural exemplars, 19 were modified, four removed, and 39 additional endovascular-specific behaviours identified. Content validation of these 153 exemplar behaviours showed that 113/153 (73.9%) reached the predetermined Item-Content Validity Index rating for teamwork and/or patient safety. After expert team review, 140/153 (91.5%) exemplars were deemed to warrant inclusion in the tool. More than 90% of the expert panel agreed that Endo-OTAS is an appropriate teamwork assessment tool with observable behaviours. Some concerns were noted about the time required to conduct observations and provide performance feedback. Endo-OTAS is a novel teamwork assessment tool, with evidence for content validity and relevance to endovascular teams. Endo-OTAS enables systematic objective assessment of the quality of team performance during endovascular procedures. Copyright © 2016. Published by Elsevier Ltd.
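As a rough illustration of the item-level content validity index mentioned in this abstract, the sketch below assumes a common convention that the abstract itself does not state: each expert rates an exemplar on a 4-point relevance scale, the index is the share of experts giving a 3 or 4, and 0.78 is the cutoff. The function names, the cutoff, and the panel ratings are all hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Proportion of experts whose rating is at least `relevant_min`."""
    if not ratings:
        raise ValueError("need at least one expert rating")
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def items_meeting_cutoff(ratings_by_item, cutoff=0.78):
    """Names of the items whose index reaches the (assumed) cutoff."""
    return [
        name
        for name, ratings in ratings_by_item.items()
        if item_cvi(ratings) >= cutoff
    ]


# Hypothetical ratings from a five-member panel (not data from the study).
panel = {
    "confirms wire position aloud": [4, 4, 3, 4, 3],
    "adjusts table height": [2, 3, 1, 4, 2],
}

if __name__ == "__main__":
    for name, ratings in panel.items():
        print(name, item_cvi(ratings))
    print(items_meeting_cutoff(panel))
```

The reported 113/153 figure would correspond to the length of the list returned by such a filter divided by the number of exemplars rated.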
Observational Evidence for Atoms.

ERIC Educational Resources Information Center

Jones, Edwin R., Jr.; Childers, Richard L.

1984-01-01

Discusses the development of the concept of atomicity and some of the many kinds of evidence which can be used to establish its validity. Chemical evidence, evidence from crystals, Faraday's law of electrolysis, and Avogadro's number are among the areas which show how the concept originally developed from a purely philosophical idea. (JN)
A conservative method of testing whether combination analgesics produce additive or synergistic effects using evidence from acute pain and migraine.

PubMed

Moore, R A; Derry, C J; Derry, S; Straube, S; McQuay, H J

2012-04-01

Fixed-dose combination analgesics are used widely, and available both on prescription and over-the-counter. Combination drugs should provide more analgesia than with any single drug in the combination, but there is no evidence in humans about whether oral combinations have just additive effects, or are synergistic or even subadditive. We suggest that the measured result for the combination would be the summation of the absolute benefit increase (effect of active drug minus effect of placebo) of each component of a combination if effects were (merely) additive, and greater than the sum of the absolute benefits if they were synergistic. We tested measured effects of combination analgesics against the sum of the absolute benefits in acute pain and migraine using meta-analysis where individual components and combinations were tested against placebo in the same trials, and verified the result with meta-analyses where individual components and combinations were tested against placebo in different trials. Results showed that expected numbers needed to treat (NNT) for additive effects were generally within the 95% confidence interval of measured NNTs. This was true for combinations of paracetamol plus ibuprofen and paracetamol plus opioids in acute pain, and naproxen plus sumatriptan in migraine, but not where efficacy was very low or very high, nor combinations of paracetamol plus dextropropoxyphene. There was no evidence of synergy, defined as supra-additive effects. © 2011 European Federation of International Association for the Study of Pain Chapters.
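The additivity test described in this abstract can be sketched numerically. The sketch assumes only what the abstract states (absolute benefit is the response rate on active drug minus the rate on placebo, and additive effects mean the combination's benefit equals the sum of the component benefits) plus the standard relation NNT = 1 / absolute benefit. The function names and all response rates are hypothetical.

```python
def absolute_benefit(p_active, p_placebo):
    """Response rate on active drug minus response rate on placebo."""
    return p_active - p_placebo


def expected_additive_nnt(components):
    """Expected NNT for a combination if component effects merely add.

    `components` is a list of (p_active, p_placebo) pairs, one per drug.
    """
    total = sum(absolute_benefit(a, p) for a, p in components)
    if total <= 0:
        raise ValueError("summed absolute benefit must be positive")
    return 1.0 / total


def consistent_with_additivity(expected_nnt, ci_low, ci_high):
    """True if the expected NNT lies inside the measured 95% CI."""
    return ci_low <= expected_nnt <= ci_high


if __name__ == "__main__":
    # Hypothetical rates: drug A 45% vs placebo 20%, drug B 35% vs placebo 20%.
    nnt = expected_additive_nnt([(0.45, 0.20), (0.35, 0.20)])
    print(round(nnt, 2))
    # Hypothetical measured CI for the combination's NNT.
    print(consistent_with_additivity(nnt, 2.0, 3.4))
```

A measured NNT whose interval lies entirely below the expected value would point toward synergy under this reading; the abstract reports no such case.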

Validity of Cognitive Load Measures in Simulation-Based Training: A Systematic Review.

PubMed

Naismith, Laura M; Cavalcanti, Rodrigo B

2015-11-01

Cognitive load theory (CLT) provides a rich framework to inform instructional design. Despite the applicability of CLT to simulation-based medical training, findings from multimedia learning have not been consistently replicated in this context. This lack of transferability may be related to issues in measuring cognitive load (CL) during simulation. The authors conducted a review of CLT studies across simulation training contexts to assess the validity evidence for different CL measures. PRISMA standards were followed. For 48 studies selected from a search of MEDLINE, EMBASE, PsycInfo, CINAHL, and ERIC databases, information was extracted about study aims, methods, validity evidence of measures, and findings. Studies were categorized on the basis of findings and prevalence of validity evidence collected, and statistical comparisons between measurement types and research domains were pursued. CL during simulation training has been measured in diverse populations including medical trainees, pilots, and university students. Most studies (71%; 34) used self-report measures; others included secondary task performance, physiological indices, and observer ratings. Correlations between CL and learning varied from positive to negative. Overall validity evidence for CL measures was low (mean score 1.55/5). Studies reporting greater validity evidence were more likely to report that high CL impaired learning. The authors found evidence that inconsistent correlations between CL and learning may be related to issues of validity in CL measures. Further research would benefit from rigorous documentation of validity and from triangulating measures of CL. This can better inform CLT instructional design for simulation-based medical training.
Validity of Childhood Career Development Scale Scores in South Africa

ERIC Educational Resources Information Center

Stead, Graham B.; Schultheiss, Donna E. Palladino

2010-01-01

The purpose of this study was to provide evidence of the construct and concurrent validity of the Childhood Career Development Scale's (CCDS) scores among South African primary school children. Using a sample of 808 children in grades four through seven, evidence for the CCDS's construct validity was provided using confirmatory factor analysis,…
Clarifying the Consensus Definition of Validity

ERIC Educational Resources Information Center

Newton, Paul E.

2012-01-01

The 1999 "Standards for Educational and Psychological Testing" defines validity as the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. Although quite explicit, there are ways in which this definition lacks precision, consistency, and clarity. The history of validity has taught us…
Evaluating Test Validity: Reprise and Progress

ERIC Educational Resources Information Center

Shepard, Lorrie A.

2016-01-01

The AERA, APA, NCME Standards define validity as "the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests". A century of disagreement about validity does not mean that there has not been substantial progress. This consensus definition brings together interpretations and use so that it…
Construct validity evidence for the Male Role Norms Inventory-Short Form: A structural equation modeling approach using the bifactor model.

PubMed

Levant, Ronald F; Hall, Rosalie J; Weigold, Ingrid K; McCurdy, Eric R

2016-10-01

The construct validity of the Male Role Norms Inventory-Short Form (MRNI-SF) was assessed using a latent variable approach implemented with structural equation modeling (SEM). The MRNI-SF was specified as having a bifactor structure, and validation scales were also specified as latent variables. The latent variable approach had the advantages of separating effects of general and specific factors and controlling for some sources of measurement error. Data (N = 484) were from a diverse sample (38.8% men of color, 22.3% men of diverse sexualities) of community-dwelling and college men who responded to an online survey. The construct validity of the MRNI-SF General Traditional Masculinity Ideology factor was supported for all 4 of the proposed latent correlations with: (a) Male Role Attitudes Scale; (b) general factor of Conformity to Masculine Norms Inventory-46; (c) higher-order factor of Gender Role Conflict Scale; and (d) Personal Attributes Questionnaire-Masculinity Scale. Significant correlations with relevant other latent factors provided concurrent validity evidence for the MRNI-SF specific factors of Negativity toward Sexual Minorities, Importance of Sex, Restrictive Emotionality, and Toughness, with all 8 of the hypothesized relationships supported. However, 3 relationships concerning Dominance were not supported. (The construct validity of the remaining 2 MRNI-SF specific factors, Avoidance of Femininity and Self-Reliance through Mechanical Skills, was not assessed.) Comparisons were made, and meaningful differences noted, between the latent correlations emphasized in this study and their raw variable counterparts. Results are discussed in terms of the advantages of an SEM approach and the unique characteristics of the bifactor model. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Antiandrogenic activity of phthalate mixtures: Validity of concentration addition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Christen, Verena; Crettaz, Pierre; Oberli-Schrämmli, Aurelia

2012-03-01

Phthalates and bisphenol A have very widespread use leading to significant exposure of humans. They are suspected to interfere with the endocrine system, including the androgen, estrogen and the thyroid hormone system. Here we analyzed the antiandrogenic activity of six binary, and one ternary mixture of phthalates exhibiting complete antiandrogenic dose–response curves, and binary mixtures of phthalates and bisphenol A at equi-effective concentrations of EC10, EC25 and EC50 in MDA-kb2 cells. Mixture activity followed the concentration addition (CA) model with a tendency to synergism at high and antagonism at low concentrations. Isoboles and the toxic unit approach (TUA) confirmed the additive to synergistic activity of the binary mixtures BBP + DBP, DBP + DEP and DEP + BPA at high concentrations. Both methods indicate a tendency to antagonism for the EC10 mixtures BBP + DBP, BBP + DEP and DBP + DEP, and the EC25 mixture of DBP + BPA. A ternary mixture revealed synergism at the EC50, and weak antagonistic activity at the EC25 level by the TUA. A mixture of five phthalates representing a human urine composition and reflecting exposure to corresponding parent compounds showed no antiandrogenic activity. Our study demonstrates that CA is an appropriate concept to account for mixture effects of antiandrogenic phthalates and bisphenol A. The interaction indicates a departure from additivity to antagonism at low concentrations, probably due to interaction with the androgen receptor and/or cofactors. This study emphasizes that a risk assessment of phthalates should account for mixture effects by applying the CA concept. Highlights: Antiandrogenic activity of mixtures of 2 and 3 phthalates is assessed in MDA-kb2 cells. Mixture activities followed the concentration addition model. A tendency to synergism at high and antagonism at low levels occurred.
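The concentration addition model and the toxic unit approach named in this abstract can be illustrated with a short sketch. It uses the standard textbook form of both ideas (a mixture producing effect level x has toxic units summing to 1 under concentration addition); the tolerance band, the function names, and every concentration below are hypothetical and are not taken from the study.

```python
def toxic_unit_sum(concentrations, ecx_values):
    """Sum of c_i / ECx_i over the mixture components."""
    if len(concentrations) != len(ecx_values):
        raise ValueError("one ECx value is needed per component")
    return sum(c / ec for c, ec in zip(concentrations, ecx_values))


def ca_predicted_ecx(fractions, ecx_values):
    """Mixture ECx predicted by concentration addition.

    `fractions` are the components' shares of the total concentration
    and must sum to 1.
    """
    if len(fractions) != len(ecx_values):
        raise ValueError("one ECx value is needed per component")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return 1.0 / sum(p / ec for p, ec in zip(fractions, ecx_values))


def classify(tu_sum, tolerance=0.2):
    """Label a mixture that produced effect x; the band is an assumption."""
    if tu_sum < 1.0 - tolerance:
        return "synergism"
    if tu_sum > 1.0 + tolerance:
        return "antagonism"
    return "additive"


if __name__ == "__main__":
    # Hypothetical binary mixture: component EC50 values of 10 and 20 units.
    print(toxic_unit_sum([5.0, 10.0], [10.0, 20.0]))
    print(ca_predicted_ecx([0.5, 0.5], [10.0, 20.0]))
    print(classify(0.6))
```

A toxic unit sum well below 1 means the mixture reached the effect level with less material than additivity predicts, which is how synergism is read in this approach.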
The Development and Validation of the Rational and Intuitive Decision Styles Scale.

PubMed

Hamilton, Katherine; Shih, Shin-I; Mohammed, Susan

2016-01-01

Decision styles reflect the typical manner by which individuals make decisions. The purpose of this research was to develop and validate a decision style scale that addresses conceptual and psychometric problems with current measures. The resulting 10-item scale captures a broad range of the rational and intuitive styles construct domain. Results from 5 independent samples provide initial support for the dimensionality and reliability of the new scale, as demonstrated by a clear factor structure and high internal consistency. In addition, our results show evidence of convergent and discriminant validity through expected patterns of correlations across decision-making individual differences and the International Personality Item Pool (IPIP) Big Five traits. Research domains that would benefit from incorporating the concept of decision styles are discussed.
The scoring of arousal in sleep: reliability, validity, and alternatives.

PubMed

Bonnet, Michael H; Doghramji, Karl; Roehrs, Timothy; Stepanski, Edward J; Sheldon, Stephen H; Walters, Arthur S; Wise, Merrill; Chesson, Andrew L

2007-03-15

The reliability and validity of EEG arousals and other types of arousal are reviewed. Brief arousals during sleep had been observed for many years, but the evolution of sleep medicine in the 1980s directed new attention to these events. Early studies at that time in animals and humans linked brief EEG arousals and associated fragmentation of sleep to daytime sleepiness and degraded performance. Increasing interest in scoring of EEG arousals led the ASDA to publish a scoring manual in 1992. The current review summarizes numerous studies that have examined scoring reliability for these EEG arousals. Validity of EEG arousals was explored by review of studies that empirically varied arousals and found deficits similar to those found after total sleep deprivation depending upon the rate and extent of sleep fragmentation. Additional data from patients with clinical sleep disorders prior to and after effective treatment has also shown a continuing relationship between reduction in pathology-related arousals and improved sleep and daytime function. Finally, many suggestions have been made to refine arousal scoring to include additional elements (e.g., CAP), change the time frame, or focus on other physiological responses such as heart rate or blood pressure changes. Evidence to support the reliability and validity of these measures is presented. It was concluded that the scoring of EEG arousals has added much to our understanding of the sleep process but that significant work on the neurophysiology of arousal needs to be done. Additional refinement of arousal scoring will provide improved insight into sleep pathology and recovery.
Symptom validity issues in the psychological consultative examination for social security disability.

PubMed

Chafetz, Michael D

2010-08-01

This article is about Social Security Administration (SSA) policy with regard to the Psychological Consultative Examination (PCE) for Social Security Disability, particularly with respect to validation of the responses and findings. First, the nature of the consultation and the importance of understanding the boundaries and ethics of the psychologist's role are described. Issues particular to working with low-functioning claimants usually form a large part of these examinations. The psychologist must understand various forms of non-credible behavior during the PCE, and how malingering might be considered among other non-credible presentations. Issues pertaining to symptom validity testing in low-functioning claimants are further explored. SSA policy with respect to symptom validity testing is carefully examined, with an attempt to answer specific concerns and show how psychological science can be of assistance, particularly with evidence-based practice. Additionally, the nature and importance of techniques to avoid the mislabeling of claimants as malingerers are examined. SSA requires the use of accepted diagnostic techniques with which to establish impairment, and this article describes the implementation of that requirement, particularly with respect to validating the findings.
A non-additive repulsive contribution in an equation of state: The development for homonuclear square well chains equation of state validated against Monte Carlo simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Trinh, Thi-Kim-Hoang; Laboratoire de Science des Procédés et des Matériaux; Passarello, Jean-Philippe

This work consists of the adaptation of a non-additive hard sphere theory inspired by Malakhov and Volkov [Polym. Sci., Ser. A 49(6), 745–756 (2007)] to a square-well chain. Using the thermodynamic perturbation theory, an additional term is proposed that describes the effect of perturbing the chain of square well spheres by a non-additive parameter. In order to validate this development, NPT Monte Carlo simulations of thermodynamic and structural properties of the non-additive square well for a pure chain and a binary mixture of chains are performed. Good agreement is observed between the compressibility factors originating from the theory and those from molecular simulations.
Interpreting Variance Components as Evidence for Reliability and Validity.

ERIC Educational Resources Information Center

Kane, Michael T.

The reliability and validity of measurement is analyzed by a sampling model based on generalizability theory. A model for the relationship between a measurement procedure and an attribute is developed from an analysis of how measurements are used and interpreted in science. The model provides a basis for analyzing the concept of an error of…
Cloud computing and validation of expandable in silico livers

PubMed Central

2010-01-01

Background: In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. Results: The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. Conclusions: The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide
Cloud computing and validation of expandable in silico livers.

PubMed

Ropella, Glen E P; Hunt, C Anthony

2010-12-03

In Silico Livers (ISLs) are works in progress. They are used to challenge multilevel, multi-attribute, mechanistic hypotheses about the hepatic disposition of xenobiotics coupled with hepatic responses. To enhance ISL-to-liver mappings, we added discrete time metabolism, biliary elimination, and bolus dosing features to a previously validated ISL and initiated re-validated experiments that required scaling experiments to use more simulated lobules than previously, more than could be achieved using the local cluster technology. Rather than dramatically increasing the size of our local cluster we undertook the re-validation experiments using the Amazon EC2 cloud platform. So doing required demonstrating the efficacy of scaling a simulation to use more cluster nodes and assessing the scientific equivalence of local cluster validation experiments with those executed using the cloud platform. The local cluster technology was duplicated in the Amazon EC2 cloud platform. Synthetic modeling protocols were followed to identify a successful parameterization. Experiment sample sizes (number of simulated lobules) on both platforms were 49, 70, 84, and 152 (cloud only). Experimental indistinguishability was demonstrated for ISL outflow profiles of diltiazem using both platforms for experiments consisting of 84 or more samples. The process was analogous to demonstration of results equivalency from two different wet-labs. The results provide additional evidence that disposition simulations using ISLs can cover the behavior space of liver experiments in distinct experimental contexts (there is in silico-to-wet-lab phenotype similarity). The scientific value of experimenting with multiscale biomedical models has been limited to research groups with access to computer clusters. The availability of cloud technology coupled with the evidence of scientific equivalency has lowered the barrier and will greatly facilitate model sharing as well as provide straightforward tools for scaling
Validation of the Impostor Phenomenon among Managers

PubMed Central

Rohrmann, Sonja; Bechtoldt, Myriam N.; Leonhardt, Mona

2016-01-01

Following up on earlier investigations, the present research aims to validate the construct of the impostor phenomenon by taking other personality correlates into account and to examine whether the impostor phenomenon is a construct in its own right. In addition, gender effects as well as associations with dispositional working styles and strain are examined. In an online study we surveyed a sample of N = 242 individuals occupying leadership positions in different sectors. Confirmatory factor analyses provide empirical evidence for the discriminant validity of the impostor phenomenon. In accord with earlier studies we show that the impostor phenomenon is accompanied by higher levels of anxiety, dysphoric moods, emotional instability, a generally negative self-evaluation, and perfectionism. The study does not reveal any gender differences concerning the impostor phenomenon. With respect to working styles, persons with an impostor self-concept tend to show perfectionist as well as procrastinating behaviors. Moreover, they report being more stressed and strained by their work. In sum, the findings show that the impostor phenomenon constitutes a dysfunctional personality style. Practical implications are discussed. PMID:27313554
The evidence for clinically significant bias in plasma glucose between liquid and lyophilized citrate buffer additive.

PubMed

Juricic, Gordana; Saracevic, Andrea; Kopcinovic, Lara Milevoj; Bakliza, Ana; Simundic, Ana-Maria

2016-12-01

Citrate buffer additive has been suggested to be of supreme performance in inhibiting glycolysis. However, there is little evidence in the literature regarding the comparability of glucose concentrations in liquid and lyophilized citrate buffer containing tubes. The aim of this study was to compare glucose concentrations in tubes containing liquid (Glucomedics) and lyophilized citrate buffer (Terumo VENOSAFE™ Glycemia) additive, measured immediately after centrifugation. Blood was collected from forty volunteers into both Glucomedics and Venosafe Glycemia tubes. Blood was centrifuged within 15 min from venipuncture and glucose concentration was measured immediately after centrifugation, on the Abbott Architect analyzer. Differences between glucose concentrations in Glucomedics and Terumo tubes were tested using the paired t-test. Mean bias was calculated and compared to recommended quality specification for glucose (i.e. 2.2%). Glucose concentration in Terumo tubes was 3.4% lower than in Glucomedics tubes (P<0.001). The mean bias was clinically significant. There is a clinically significant difference between glucose concentrations in liquid and lyophilized citrate buffer additive tubes (Glucomedics vs. Terumo tubes) measured immediately after centrifugation. This difference may affect the patient outcome due to the misclassification of diabetes. Copyright © 2016 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
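The comparison described in this abstract (paired differences between two tube types, and a mean bias checked against the 2.2% quality specification) can be sketched with the standard library alone. How the authors computed mean bias is not stated, so averaging the per-sample percentage differences is an assumption; the function names and the glucose values are hypothetical, and no p-value is computed here.

```python
import math
from statistics import mean, stdev


def mean_percent_bias(reference, comparison):
    """Mean of per-sample percentage differences, comparison vs reference."""
    return mean(100.0 * (c - r) / r for r, c in zip(reference, comparison))


def paired_t_statistic(reference, comparison):
    """t statistic of the paired differences (comparison minus reference)."""
    diffs = [c - r for r, c in zip(reference, comparison)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def exceeds_specification(bias_percent, spec_percent=2.2):
    """True if the absolute bias is larger than the quality specification."""
    return abs(bias_percent) > spec_percent


if __name__ == "__main__":
    # Hypothetical glucose values in mmol/L for three volunteers.
    liquid = [5.0, 6.0, 4.0]
    lyophilized = [4.8, 5.8, 3.9]
    bias = mean_percent_bias(liquid, lyophilized)
    print(round(bias, 2), exceeds_specification(bias))
    print(round(paired_t_statistic(liquid, lyophilized), 2))
```

Under this reading, the reported bias of 3.4% exceeds the 2.2% specification, which is what the abstract labels clinically significant.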
20 CFR 220.14 - Weighing of evidence.

Code of Federal Regulations, 2010 CFR

2010-04-01

... capacity evaluation is based upon functional objective tests with high validity and reliability; (2) The... consists of objective findings of exams that have poor reliability or validity; (7) The evidence consists...
Performance of the Tariff Method: validation of a simple additive algorithm for analysis of verbal autopsies

PubMed Central

2011-01-01

Background: Verbal autopsies provide valuable information for studying mortality patterns in populations that lack reliable vital registration data. Methods for transforming verbal autopsy results into meaningful information for health workers and policymakers, however, are often costly or complicated to use. We present a simple additive algorithm, the Tariff Method (termed Tariff), which can be used for assigning individual cause of death and for determining cause-specific mortality fractions (CSMFs) from verbal autopsy data. Methods: Tariff calculates a score, or "tariff," for each cause, for each sign/symptom, across a pool of validated verbal autopsy data. The tariffs are summed for a given response pattern in a verbal autopsy, and this sum (score) provides the basis for predicting the cause of death in a dataset. We implemented this algorithm and evaluated the method's predictive ability, both in terms of chance-corrected concordance at the individual cause assignment level and in terms of CSMF accuracy at the population level. The analysis was conducted separately for adult, child, and neonatal verbal autopsies across 500 pairs of train-test validation verbal autopsy data. Results: Tariff is capable of outperforming physician-certified verbal autopsy in most cases. In terms of chance-corrected concordance, the method achieves 44.5% in adults, 39% in children, and 23.9% in neonates. CSMF accuracy was 0.745 in adults, 0.709 in children, and 0.679 in neonates. Conclusions: Verbal autopsies can be an efficient means of obtaining cause of death data, and Tariff provides an intuitive, reliable method for generating individual cause assignment and CSMFs. The method is transparent and flexible and can be readily implemented by users without training in statistics or computer science. PMID:21816107
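The additive scoring step described in this abstract can be sketched as follows. This is a simplified illustration and not the published algorithm: it takes the tariffs as given (the abstract says they are derived from validated data but not how), and it assigns the cause with the highest summed score, which is an assumption. The function names, causes, symptoms, and tariff values are hypothetical.

```python
from collections import Counter


def tariff_scores(tariffs, symptoms_present):
    """Sum each cause's tariffs over the symptoms reported in one autopsy.

    `tariffs` maps cause -> {symptom: tariff}; unknown symptoms count as 0.
    """
    return {
        cause: sum(by_symptom.get(s, 0.0) for s in symptoms_present)
        for cause, by_symptom in tariffs.items()
    }


def predict_cause(tariffs, symptoms_present):
    """Assign the cause with the highest summed score (an assumed rule)."""
    scores = tariff_scores(tariffs, symptoms_present)
    return max(scores, key=scores.get)


def cause_fractions(predicted_causes):
    """Cause-specific mortality fractions from individual assignments."""
    counts = Counter(predicted_causes)
    total = sum(counts.values())
    return {cause: n / total for cause, n in counts.items()}


# Hypothetical tariffs; real ones would come from validated training data.
example_tariffs = {
    "pneumonia": {"cough": 4.0, "fever": 2.0},
    "injury": {"bleeding": 6.0, "fever": -1.0},
}

if __name__ == "__main__":
    print(predict_cause(example_tariffs, {"cough", "fever"}))
    print(cause_fractions(["pneumonia", "pneumonia", "injury", "injury"]))
```

Because the score is a plain sum, the contribution of every symptom to an assignment can be read off directly, which fits the abstract's description of the method as transparent.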
Construction and Initial Validation of the Multiracial Experiences Measure (MEM)

PubMed Central

Yoo, Hyung Chol; Jackson, Kelly; Guevarra, Rudy P.; Miller, Matthew J.; Harrington, Blair

2015-01-01

This article describes the development and validation of the Multiracial Experiences Measure (MEM): a new measure that assesses uniquely racialized risks and resiliencies experienced by individuals of mixed racial heritage. Across two studies, there was evidence for the validation of the 25-item MEM with 5 subscales including Shifting Expressions, Perceived Racial Ambiguity, Creating Third Space, Multicultural Engagement, and Multiracial Discrimination. The 5-subscale structure of the MEM was supported by a combination of exploratory and confirmatory factor analyses. Evidence of criterion-related validity was partially supported with MEM subscales correlating with measures of racial diversity in one’s social network, color-blind racial attitude, psychological distress, and identity conflict. Evidence of discriminant validity was supported with MEM subscales not correlating with impression management. Implications for future research and suggestions for utilization of the MEM in clinical practice with multiracial adults are discussed. PMID:26460977
Construction and initial validation of the Multiracial Experiences Measure (MEM).

PubMed

Yoo, Hyung Chol; Jackson, Kelly F; Guevarra, Rudy P; Miller, Matthew J; Harrington, Blair

2016-03-01

This article describes the development and validation of the Multiracial Experiences Measure (MEM): a new measure that assesses uniquely racialized risks and resiliencies experienced by individuals of mixed racial heritage. Across 2 studies, there was evidence for the validation of the 25-item MEM with 5 subscales including Shifting Expressions, Perceived Racial Ambiguity, Creating Third Space, Multicultural Engagement, and Multiracial Discrimination. The 5-subscale structure of the MEM was supported by a combination of exploratory and confirmatory factor analyses. Evidence of criterion-related validity was partially supported with MEM subscales correlating with measures of racial diversity in one's social network, color-blind racial attitude, psychological distress, and identity conflict. Evidence of discriminant validity was supported with MEM subscales not correlating with impression management. Implications for future research and suggestions for utilization of the MEM in clinical practice with multiracial adults are discussed. (c) 2016 APA, all rights reserved.
Automating Electronic Clinical Data Capture for Quality Improvement and Research: The CERTAIN Validation Project of Real World Evidence.

PubMed

Devine, Emily Beth; Van Eaton, Erik; Zadworny, Megan E; Symons, Rebecca; Devlin, Allison; Yanez, David; Yetisgen, Meliha; Keyloun, Katelyn R; Capurro, Daniel; Alfonso-Cristancho, Rafael; Flum, David R; Tarczy-Hornoch, Peter

2018-05-22

The availability of high fidelity electronic health record (EHR) data is a hallmark of the learning health care system. Washington State's Surgical Care Outcomes and Assessment Program (SCOAP) is a network of hospitals participating in quality improvement (QI) registries wherein data are manually abstracted from EHRs. To create the Comparative Effectiveness Research and Translation Network (CERTAIN), we semi-automated SCOAP data abstraction using a centralized federated data model, created a central data repository (CDR), and assessed whether these data could be used as real world evidence for QI and research. Describe the validation processes and complexities involved and lessons learned. Investigators installed a commercial CDR to retrieve and store data from disparate EHRs. Manual and automated abstraction systems were conducted in parallel (10/2012-7/2013) and validated in three phases using the EHR as the gold standard: 1) ingestion, 2) standardization, and 3) concordance of automated versus manually abstracted cases. Information retrieval statistics were calculated. Four unaffiliated health systems provided data. Between 6 and 15 percent of data elements were abstracted: 51 to 86 percent from structured data; the remainder using natural language processing (NLP). In phase 1, data ingestion from 12 out of 20 feeds reached 95 percent accuracy. In phase 2, 55 percent of structured data elements performed with 96 to 100 percent accuracy; NLP with 89 to 91 percent accuracy. In phase 3, concordance ranged from 69 to 89 percent. Information retrieval statistics were consistently above 90 percent. Semi-automated data abstraction may be useful, although raw data collected as a byproduct of health care delivery is not immediately available for use as real world evidence. New approaches to gathering and analyzing extant data are required.
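The abstract above does not say which information retrieval statistics were calculated. As a minimal sketch, assuming they were the usual precision, recall, and F-measure, automated abstraction can be scored against a gold standard as follows. The function name and the (case, data element) pairs are hypothetical and are not taken from the CERTAIN project.

```python
def retrieval_stats(automated, gold):
    """Score automated abstraction against a gold standard.

    Both arguments are iterables of hashable items, e.g. hypothetical
    (case_id, data_element) pairs. The gold standard is treated as truth.
    """
    automated, gold = set(automated), set(gold)
    tp = len(automated & gold)   # found by automation and present in gold
    fp = len(automated - gold)   # found by automation but absent from gold
    fn = len(gold - automated)   # present in gold but missed by automation

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Invented example data, for illustration only.
    gold = [("case1", "smoker"), ("case1", "diabetes"),
            ("case2", "smoker"), ("case3", "asthma")]
    automated = [("case1", "smoker"), ("case1", "diabetes"),
                 ("case2", "smoker"), ("case3", "copd")]
    print(retrieval_stats(automated, gold))
```

Representing both sources as sets of pairs keeps the comparison independent of record order. A real validation would also need rules for partial or near matches, which this sketch ignores.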

Changing paradigms from a historical DSM-III and DSM-IV view toward an evidence-based definition of premature ejaculation. Part I--validity of DSM-IV-TR.

PubMed

Waldinger, Marcel D; Schweitzer, Dave H

2006-07-01

In former days, information obtained from randomized well-controlled clinical trials and epidemiological studies on premature ejaculation (PE) was not available, thereby hampering the efforts of the consecutive DSM Work Groups on Sexual Disorders to formulate an evidence-based definition of PE. The current DSM-IV-TR definition of PE is still nonevidence based. In addition, the requirement that persistent self-perceived PE, distress, and interpersonal difficulties, in absence of a quantified ejaculation time, are necessary to establish the diagnosis remains disputable. To investigate the validity and reliability of DSM and ICD diagnosis of premature ejaculation. The historical development of DSM and ICD classification of mental disorders is critically reviewed, and two studies using the DSM-IV-TR definition of PE are critically reanalyzed. Reanalysis of two studies using the DSM-IV-TR definition of PE has shown that DSM-diagnosed PE can be accompanied by long intravaginal ejaculation latency time (IELT) values. The reanalysis revealed a low positive predictive value for the DSM-IV-TR definition when used as a diagnostic test. A similar situation pertains to the American Urological Association (AUA) definition of PE, which is practically a copy of the DSM-IV-TR definition. It should be emphasized that any evidence-based definition of PE needs objectively collected patient-reported outcome (PRO) data from epidemiological studies, as well as reproducible quantifications of the IELT.
Valid and Reliable Science Content Assessments for Science Teachers

NASA Astrophysics Data System (ADS)

Tretter, Thomas R.; Brown, Sherri L.; Bush, William S.; Saderholm, Jon C.; Holmes, Vicki-Lynn

2013-03-01

Science teachers' content knowledge is an important influence on student learning, highlighting an ongoing need for programs, and assessments of those programs, designed to support teacher learning of science. Valid and reliable assessments of teacher science knowledge are needed for direct measurement of this crucial variable. This paper describes multiple sources of validity and reliability (Cronbach's alpha greater than 0.8) evidence for physical, life, and earth/space science assessments—part of the Diagnostic Teacher Assessments of Mathematics and Science (DTAMS) project. Validity was strengthened by systematic synthesis of relevant documents, extensive use of external reviewers, and field tests with 900 teachers during assessment development process. Subsequent results from 4,400 teachers, analyzed with Rasch IRT modeling techniques, offer construct and concurrent validity evidence.
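The reliability threshold quoted above (Cronbach's alpha greater than 0.8) refers to a standard internal-consistency coefficient. As a minimal sketch of the textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), the calculation could look like this. The function name and the response matrix are illustrative assumptions, not DTAMS data or code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, each row holding the
    k item scores for that respondent (k >= 2, at least 2 respondents).
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")

    items = list(zip(*scores))                      # one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Invented responses: 3 respondents, 2 items.
    responses = [[1, 1], [2, 2], [3, 4]]
    print(round(cronbach_alpha(responses), 3))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of validity.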
A Validation of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors: Patterns in Rural and Urban Elementary Schools

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Menzies, Holly M.; Oakes, Wendy P.; Lambert, Warren; Cox, Meredith; Hankins, Katy

2012-01-01

We report findings of two studies, one conducted in a rural school district (N = 982) and a second conducted in an urban district (N = 1,079), offering additional evidence of the reliability and validity of a revised instrument, the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE), to accurately detect internalizing and…
The 2018 Definition of Periprosthetic Hip and Knee Infection: An Evidence-Based and Validated Criteria.

PubMed

Parvizi, Javad; Tan, Timothy L; Goswami, Karan; Higuera, Carlos; Della Valle, Craig; Chen, Antonia F; Shohat, Noam

2018-05-01

The introduction of the Musculoskeletal Infection Society (MSIS) criteria for periprosthetic joint infection (PJI) in 2011 resulted in improvements in diagnostic confidence and research collaboration. The emergence of new diagnostic tests and the lessons we have learned from the past 7 years using the MSIS definition, prompted us to develop an evidence-based and validated updated version of the criteria. This multi-institutional study of patients undergoing revision total joint arthroplasty was conducted at 3 academic centers. For the development of the new diagnostic criteria, PJI and aseptic patient cohorts were stringently defined: PJI cases were defined using only major criteria from the MSIS definition (n = 684) and aseptic cases underwent one-stage revision for a noninfective indication and did not fail within 2 years (n = 820). Serum C-reactive protein (CRP), D-dimer, erythrocyte sedimentation rate were investigated, as well as synovial white blood cell count, polymorphonuclear percentage, leukocyte esterase, alpha-defensin, and synovial CRP. Intraoperative findings included frozen section, presence of purulence, and isolation of a pathogen by culture. A stepwise approach using random forest analysis and multivariate regression was used to generate relative weights for each diagnostic marker. Preoperative and intraoperative definitions were created based on beta coefficients. The new definition was then validated on an external cohort of 222 patients with PJI who subsequently failed with reinfection and 200 aseptic patients. The performance of the new criteria was compared to the established MSIS and the prior International Consensus Meeting definitions. Two positive cultures or the presence of a sinus tract were considered as major criteria and diagnostic of PJI. The calculated weights of an elevated serum CRP (>1 mg/dL), D-dimer (>860 ng/mL), and erythrocyte sedimentation rate (>30 mm/h) were 2, 2, and 1 points, respectively. Furthermore, elevated
Evidence-based radiology: how to quickly assess the validity and strength of publications in the diagnostic radiology literature.

PubMed

Dodd, Jonathan D; MacEneaney, Peter M; Malone, Dermot E

2004-05-01

The aim of this study was to show how evidence-based medicine (EBM) techniques can be applied to the appraisal of diagnostic radiology publications. A clinical scenario is described: a gastroenterologist has questioned the diagnostic performance of magnetic resonance cholangiopancreatography (MRCP) in a patient who may have common bile duct (CBD) stones. His opinion was based on an article on MRCP published in "Gut." The principles of EBM are described and then applied to the critical appraisal of this paper. Another paper on the same subject was obtained from the radiology literature and was also critically appraised using explicit EBM criteria. The principles for assessing the validity and strength of both studies are outlined. All statistical parameters were generated quickly using a spreadsheet in Excel format. The results of EBM assessment of both papers are presented. The calculation and application of confidence intervals (CIs) and likelihood ratios (LRs) for both studies are described. These statistical results are applied to individual patient scenarios using graphs of conditional probability (GCP). Basic EBM principles are described and additional points relevant to radiologists discussed. Online resources for EBR practice are identified. The principles of EBM and their application to radiology are discussed. It is emphasized that sensitivity and specificity are point estimates of the "true" characteristics of a test in clinical practice. A spreadsheet can be used to quickly calculate CIs, LRs and GCPs. These give the radiologist a better understanding of the meaning of diagnostic test results in any patient or population of patients.
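The abstract above says that confidence intervals, likelihood ratios, and post-test probabilities were calculated in a spreadsheet. The same quantities can be sketched in a few lines of Python, assuming a standard 2x2 table of test result against reference standard. The function names and counts are hypothetical, and the Wilson score interval is one common choice of CI; the abstract does not state which method the authors used.

```python
import math


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (about 95% when z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def diagnostic_summary(tp, fp, fn, tn):
    """Point estimates, CIs, and likelihood ratios from a 2x2 table.

    Assumes specificity < 1 and sensitivity > 0 so that both
    likelihood ratios are finite.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "sens_ci": wilson_ci(tp, tp + fn),
        "spec_ci": wilson_ci(tn, tn + fp),
        "lr_pos": sens / (1 - spec),      # LR+ = sens / (1 - spec)
        "lr_neg": (1 - sens) / spec,      # LR- = (1 - sens) / spec
    }


def post_test_probability(pre_test_prob, lr):
    """Convert a pre-test probability to post-test via odds and an LR."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


if __name__ == "__main__":
    # Invented counts, for illustration only.
    summary = diagnostic_summary(tp=45, fp=10, fn=5, tn=90)
    print(summary)
    print(post_test_probability(0.30, summary["lr_pos"]))
```

Reporting the intervals alongside sensitivity and specificity reflects the abstract's point that these figures are point estimates. Evaluating `post_test_probability` over a range of pre-test probabilities gives the curve that a graph of conditional probability displays.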
Evidence on existing caries risk assessment systems: are they predictive of future caries?

PubMed

Tellez, M; Gomez, J; Pretty, I; Ellwood, R; Ismail, A I

2013-02-01

To critically appraise evidence for the prediction of caries using four caries risk assessment (CRA) systems/guidelines (Cariogram, Caries Management by Risk Assessment (CAMBRA), American Dental Association (ADA), and American Academy of Pediatric Dentistry (AAPD)). This review focused on prospective cohort studies or randomized controlled trials. A systematic search strategy was developed to locate papers published in Medline Ovid and Cochrane databases. The search identified 539 scientific reports, and after title and abstract review, 137 were selected for full review and 14 met the following inclusion criteria: (i) used as validating criterion caries incidence/increment, (ii) involved human subjects and natural carious lesions, and (iii) published in peer-reviewed journals. In addition, papers were excluded if they met one or more of the following criteria: (i) incomplete description of sample selection, outcomes, or small sample size and (ii) not meeting the criteria for best evidence under the prognosis category of the Oxford Centre for Evidence-Based Medicine. There are wide variations among the systems in terms of definitions of caries risk categories, type and number of risk factors/markers, and disease indicators. The Cariogram combined sensitivity and specificity for predicting caries in permanent dentition ranges from 110 to 139 and is the only system for which prospective studies have been conducted to assess its validity. The Cariogram had limited prediction utility in preschool children, and a moderate to good performance for sorting out elderly individuals into caries risk groups. One retrospective analysis on CAMBRA's CRA reported higher incidence of cavitated lesions among those assessed as extreme-risk patients when compared with those at low risk. The evidence on the validity for existing systems for CRA is limited. It is unknown if the identification of high-risk individuals can lead to more effective long-term patient management that prevents
The Alcohol Sensitivity Questionnaire: Evidence for Construct Validity

PubMed Central

Fleming, Kimberly A.; Bartholow, Bruce D.; Hilgard, Joseph B.; McCarthy, Denis M.; O’Neill, Susan E.; Steinley, Douglas; Sher, Kenneth J.

2016-01-01

Background Variability in sensitivity to the acute effects of alcohol is an important risk factor for the development of alcohol use disorder (AUD). The most commonly used retrospective self-report measure of sensitivity, the Self-Rating of the Effects of Alcohol form (SRE), queries a limited number of alcohol effects and relies on respondents’ ability to recall experiences that might have occurred in the distant past. Here, we investigated the construct validity of an alternative measure that queries a larger number of alcohol effects, the Alcohol Sensitivity Questionnaire (ASQ), and compared it to the SRE in predicting momentary subjective responses to an acute dose of alcohol. Method Healthy young adults (N = 423) completed the SRE and the ASQ and then were randomly assigned to consume either alcohol or a placebo beverage (between-subjects manipulation). Stimulation and sedation (Biphasic Alcohol Effects Scale) and subjective intoxication were measured multiple times after drinking. Results Hierarchical linear models showed that the ASQ reliably predicted each of these outcomes following alcohol but not placebo consumption, provided unique prediction beyond that associated with differences in recent alcohol involvement, and was preferred over the SRE (in terms of model fit) in direct model comparisons of stimulation and sedation. Conclusions The ASQ compared favorably with the better-known SRE in predicting increased stimulation and reduced sedation following an acute alcohol challenge. The ASQ appears to be a valid self-report measure of alcohol sensitivity and therefore holds promise for identifying individuals at-risk for AUD and related problems. PMID:27012527
Apology in the criminal justice setting: evidence for including apology as an additional component in the legal system.

PubMed

Petrucci, Carrie J

2002-01-01

The criminal justice system has reached unprecedented scope in the United States, with over 6.4 million people under some type of supervision. Remedies that have the potential to reduce this number are continually being sought. This article analyzes an innovative strategy currently being reconsidered in criminal justice: the apology. Despite a legal system that only sporadically acknowledges it, evidence for the use of apology is supported by social science research, current criminal justice theories, case law, and empirical studies. Social psychological, sociological and socio-legal studies pinpoint the elements and function of apology, what makes apologies effective, and concerns about apology if it were implemented in the criminal justice system. Theoretical evidence is examined (including restorative justice, therapeutic jurisprudence, crime, shame, and reintegration) to explore the process of apology in the criminal justice context. Attribution theory and social conduct theory are used to explain the apology process specifically for victims and offenders. A brief examination of case law reveals that though apology has no formal place in criminal law, it has surfaced recently under the federal sentencing guidelines. Finally, empirical evidence in criminal justice settings reveals that offenders want to apologize and victims desire an apology. Moreover, by directly addressing the harmful act, apology may be the link to reduced recidivism for offenders, as well as empowerment for victims. This evidence combined suggests that apology is worthy of further study as a potentially valuable addition to the criminal justice process. Copyright 2002 John Wiley & Sons, Ltd.
The 2014 Sandia Verification and Validation Challenge: Problem statement

DOE PAGES

Hu, Kenneth; Orient, George

2016-01-18

This paper presents a case study in utilizing information from experiments, models, and verification and validation (V&V) to support a decision. It consists of a simple system with data and models provided, plus a safety requirement to assess. The goal is to pose a problem that is flexible enough to allow challengers to demonstrate a variety of approaches, but constrained enough to focus attention on a theme. This was accomplished by providing a good deal of background information in addition to the data, models, and code, but directing the participants' activities with specific deliverables. In this challenge, the theme is how to gather and present evidence about the quality of model predictions, in order to support a decision. This case study formed the basis of the 2014 Sandia V&V Challenge Workshop and this resulting special edition of the ASME Journal of Verification, Validation, and Uncertainty Quantification.
Validity of Willingness to Pay Measures under Preference Uncertainty.

PubMed

Braun, Carola; Rehdanz, Katrin; Schmidt, Ulrich

2016-01-01

Recent studies in the marketing literature developed a new method for eliciting willingness to pay (WTP) with an open-ended elicitation format: the Range-WTP method. In contrast to the traditional approach of eliciting WTP as a single value (Point-WTP), Range-WTP explicitly allows for preference uncertainty in responses. The aim of this paper is to apply Range-WTP to the domain of contingent valuation and to test for its theoretical validity and robustness in comparison to the Point-WTP. Using data from two novel large-scale surveys on the perception of solar radiation management (SRM), a little-known technique for counteracting climate change, we compare the performance of both methods in the field. In addition to the theoretical validity (i.e. the degree to which WTP values are consistent with theoretical expectations), we analyse the test-retest reliability and stability of our results over time. Our evidence suggests that the Range-WTP method clearly outperforms the Point-WTP method.
Validity of Willingness to Pay Measures under Preference Uncertainty

PubMed Central

Braun, Carola; Rehdanz, Katrin; Schmidt, Ulrich

2016-01-01

Recent studies in the marketing literature developed a new method for eliciting willingness to pay (WTP) with an open-ended elicitation format: the Range-WTP method. In contrast to the traditional approach of eliciting WTP as a single value (Point-WTP), Range-WTP explicitly allows for preference uncertainty in responses. The aim of this paper is to apply Range-WTP to the domain of contingent valuation and to test for its theoretical validity and robustness in comparison to the Point-WTP. Using data from two novel large-scale surveys on the perception of solar radiation management (SRM), a little-known technique for counteracting climate change, we compare the performance of both methods in the field. In addition to the theoretical validity (i.e. the degree to which WTP values are consistent with theoretical expectations), we analyse the test-retest reliability and stability of our results over time. Our evidence suggests that the Range-WTP method clearly outperforms the Point-WTP method. PMID:27096163
20 CFR 219.33 - Evidence of a deemed valid marriage.

Code of Federal Regulations, 2010 CFR

2010-04-01

... marriage was valid; or if the employee is dead, the widow or widower's signed statement to that effect; (3... household when the employee applied for payments; or, if the employee is dead, when he or she died. See...
20 CFR 219.33 - Evidence of a deemed valid marriage.

Code of Federal Regulations, 2014 CFR

2014-04-01

... marriage was valid; or if the employee is dead, the widow or widower's signed statement to that effect; (3... household when the employee applied for payments; or, if the employee is dead, when he or she died. See...
20 CFR 219.33 - Evidence of a deemed valid marriage.

Code of Federal Regulations, 2012 CFR

2012-04-01

... marriage was valid; or if the employee is dead, the widow or widower's signed statement to that effect; (3... household when the employee applied for payments; or, if the employee is dead, when he or she died. See...
20 CFR 219.33 - Evidence of a deemed valid marriage.

Code of Federal Regulations, 2011 CFR

2011-04-01

... marriage was valid; or if the employee is dead, the widow or widower's signed statement to that effect; (3... household when the employee applied for payments; or, if the employee is dead, when he or she died. See...
20 CFR 219.33 - Evidence of a deemed valid marriage.

Code of Federal Regulations, 2013 CFR

2013-04-01

... marriage was valid; or if the employee is dead, the widow or widower's signed statement to that effect; (3... household when the employee applied for payments; or, if the employee is dead, when he or she died. See...
An Additional Baurusuchid from the Cretaceous of Brazil with Evidence of Interspecific Predation among Crocodyliformes

PubMed Central

Godoy, Pedro L.; Montefeltro, Felipe C.; Norell, Mark A.; Langer, Max C.

2014-01-01

A new Baurusuchidae (Crocodyliformes, Mesoeucrocodylia), Aplestosuchus sordidus, is described based on a nearly complete skeleton collected in deposits of the Adamantina Formation (Bauru Group, Late Cretaceous) of Brazil. The nesting of the new taxon within Baurusuchidae can be ensured based on several exclusive skull features of this clade, such as the quadrate depression, medial approximation of the prefrontals, rostral extension of palatines (not reaching the level of the rostral margin of suborbital fenestrae), cylindrical dorsal portion of palatine bar, ridge on the ectopterygoid-jugal articulation, and supraoccipital with restricted thin transversal exposure in the caudalmost part of the skull roof. A newly proposed phylogeny of Baurusuchidae encompasses A. sordidus and recently described forms, suggesting its sister-taxon relationship to Baurusuchus albertoi, within Baurusuchinae. Additionally, the remains of a sphagesaurid crocodyliform were preserved in the abdominal cavity of the new baurusuchid. Direct fossil evidence of behavioral interaction among fossil crocodyliforms is rare and mostly restricted to bite marks resulting from predation, as well as possible conspecific male-to-male aggression. This is the first time that direct and unmistakable evidence of predation between different taxa of this group has been recorded in fossils. This discovery confirms that baurusuchids were top predators of their time, with sphagesaurids occupying a lower trophic position, possibly with a more generalist diet. PMID:24809508
Development and validation of a rapid multi-class method for the confirmation of fourteen prohibited medicinal additives in pig and poultry compound feed by liquid chromatography-tandem mass spectrometry.

PubMed

Cronly, Mark; Behan, P; Foley, B; Malone, E; Earley, S; Gallagher, M; Shearan, P; Regan, L

2010-12-01

A confirmatory method has been developed to allow for the analysis of fourteen prohibited medicinal additives in pig and poultry compound feed. These compounds are prohibited for use as feed additives although some are still authorised for use in medicated feed. Feed samples are extracted by acetonitrile with addition of sodium sulfate. The extracts undergo a hexane wash to aid with sample purification. The extracts are then evaporated to dryness and reconstituted in initial mobile phase. The samples undergo an ultracentrifugation step prior to injection onto the LC-MS/MS system and are analysed in a run time of 26 min. The LC-MS/MS system is run in MRM mode with both positive and negative electrospray ionisation. The method was validated over three days and is capable of quantitatively analysing for metronidazole, dimetridazole, ronidazole, ipronidazole, chloramphenicol, sulfamethazine, dinitolimide, ethopabate, carbadox and clopidol. The method is also capable of qualitatively analysing for sulfadiazine, tylosin, virginiamycin and avilamycin. A level of 100 µg kg⁻¹ was used for validation purposes and the method is capable of analysing to this level for all the compounds. Validation criteria of trueness, precision, repeatability and reproducibility along with measurement uncertainty are calculated for all analytes. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Development and Validation of a Multimedia-based Assessment of Scientific Inquiry Abilities

NASA Astrophysics Data System (ADS)

Kuo, Che-Yu; Wu, Hsin-Kai; Jen, Tsung-Hau; Hsu, Ying-Shao

2015-09-01

The potential of computer-based assessments for capturing complex learning outcomes has been discussed; however, relatively little is understood about how to leverage such potential for summative and accountability purposes. The aim of this study is to develop and validate a multimedia-based assessment of scientific inquiry abilities (MASIA) to cover a more comprehensive construct of inquiry abilities and target secondary school students in different grades while this potential is leveraged. We implemented five steps derived from the construct modeling approach to design MASIA. During the implementation, multiple sources of evidence were collected in the steps of pilot testing and Rasch modeling to support the validity of MASIA. Particularly, through the participation of 1,066 8th and 11th graders, MASIA showed satisfactory psychometric properties to discriminate students with different levels of inquiry abilities in 101 items in 29 tasks when Rasch models were applied. Additionally, the Wright map indicated that MASIA offered accurate information about students' inquiry abilities because of the comparability of the distributions of student abilities and item difficulties. The analysis results also suggested that MASIA offered precise measures of inquiry abilities when the components (questioning, experimenting, analyzing, and explaining) were regarded as a coherent construct. Finally, the increased mean difficulty thresholds of item responses along with three performance levels across all sub-abilities supported the alignment between our scoring rubrics and our inquiry framework. Together with other sources of validity in the pilot testing, the results offered evidence to support the validity of MASIA.
Validity of Secondary Retail Food Outlet Data

PubMed Central

Fleischhacker, Sheila E.; Evenson, Kelly R.; Sharkey, Joseph; Pitts, Stephanie B.J.; Rodriguez, Daniel A.

2013-01-01

Context Improving access to healthy foods is a promising strategy to prevent nutrition-related chronic diseases. To characterize retail food environments and identify areas with limited retail access, researchers, government programs, and community advocates have primarily used secondary retail food outlet data sources (e.g., InfoUSA or government food registries). To advance the state of the science on measuring retail food environments, this systematic review examined the evidence for validity reported for secondary retail food outlet data sources for characterizing retail food environments. Evidence acquisition A literature search was conducted through December 31, 2012 to identify peer-reviewed published literature that compared secondary retail food outlet data sources to primary data sources (i.e., field observations) for accuracy of identifying the type and location of retail food outlets. Data were analyzed in 2013. Evidence synthesis Nineteen studies met the inclusion criteria. The evidence for validity reported varied by secondary data sources examined, primary data–gathering approaches, retail food outlets examined, and geographic and sociodemographic characteristics. More than half of the studies (53%) did not report evidence for validity by type of food outlet examined and by a particular secondary data source. Conclusions Researchers should strive to gather primary data but if relying on secondary data sources, InfoUSA and government food registries had higher levels of agreement than reported by other secondary data sources and may provide sufficient accuracy for exploring these associations in large study areas. PMID:24050423

Validating workplace performance assessments in health sciences students: a case study from speech pathology.

PubMed

McAllister, Sue; Lincoln, Michelle; Ferguson, Allison; McAllister, Lindy

2013-01-01

Valid assessment of health science students' ability to perform in the real world of workplace practice is critical for promoting quality learning and ultimately certifying students as fit to enter the world of professional practice. Current practice in performance assessment in the health sciences field has been hampered by multiple issues regarding assessment content and process. Evidence for the validity of scores derived from assessment tools is usually evaluated against traditional validity categories with reliability evidence privileged over validity, resulting in the paradoxical effect of compromising the assessment validity and learning processes the assessments seek to promote. Furthermore, the dominant statistical approaches used to validate scores from these assessments fall under the umbrella of classical test theory approaches. This paper reports on the successful national development and validation of measures derived from an assessment of Australian speech pathology students' performance in the workplace. Validation of these measures considered each of Messick's interrelated validity evidence categories and included using evidence generated through Rasch analyses to support score interpretation and related action. This research demonstrated that it is possible to develop an assessment of real, complex, work-based performance of speech pathology students that generates valid measures without compromising the learning processes the assessment seeks to promote. The process described provides a model for other health professional education programs to trial.
Examining the ecological validity of the Talent Development Environment Questionnaire.

PubMed

Martindale, Russell J J; Collins, Dave; Douglas, Carl; Whike, Ally

2013-01-01

It is clear that high-class expertise and effective practice exist within many talent development environments across the world. However, there is also a general consensus that widespread evidence-based policy and practice is lacking. As such, it is crucial to develop solutions which can facilitate effective dissemination of knowledge and promotion of evidence-based talent development systems. While the Talent Development Environment Questionnaire (Martindale et al., 2010) provides a method through which this could be facilitated, its ecological validity has remained untested. As such, this study aimed to investigate the real world applicability of the questionnaire through discriminant function analysis. Athletes across ten distinct regional squads and academies were identified and separated into two broad levels, 'higher quality' (n = 48) and 'lower quality' (n = 51) environments, based on their process quality and productivity. Results revealed that the Talent Development Environment Questionnaire was able to discriminate with 77.8% accuracy. Furthermore, in addition to the questionnaire as a whole, two individual features, 'quality preparation' (P < 0.01) and 'understanding the athlete' (P < 0.01), were found to be significant discriminators. In conclusion, the results indicate robust structural properties and sound ecological validity, allowing the questionnaire to be used with more confidence in applied and research settings.
The validity and clinical utility of purging disorder.

PubMed

Keel, Pamela K; Striegel-Moore, Ruth H

2009-12-01

To review evidence of the validity and clinical utility of Purging Disorder and examine options for the Diagnostic and Statistical Manual of Mental Disorders fifth edition (DSM-V). Articles were identified by computerized and manual searches and reviewed to address five questions about Purging Disorder: Is there "ample" literature? Is the syndrome clearly defined? Can it be measured and diagnosed reliably? Can it be differentiated from other eating disorders? Is there evidence of syndrome validity? Although empirical classification and concurrent validity studies provide emerging support for the distinctiveness of Purging Disorder, questions remain about definition, diagnostic reliability in clinical settings, and clinical utility (i.e., prognostic validity). We discuss strengths and weaknesses associated with various options for the status of Purging Disorder in the DSM-V ranging from making no changes from DSM-IV to designating Purging Disorder a diagnosis on equal footing with Anorexia Nervosa and Bulimia Nervosa.
The validity of upper-limb neurodynamic tests for detecting peripheral neuropathic pain.

PubMed

Nee, Robert J; Jull, Gwendolen A; Vicenzino, Bill; Coppieters, Michel W

2012-05-01

The validity of upper-limb neurodynamic tests (ULNTs) for detecting peripheral neuropathic pain (PNP) was assessed by reviewing the evidence on plausibility, the definition of a positive test, reliability, and concurrent validity. Evidence was identified by a structured search for peer-reviewed articles published in English before May 2011. The quality of concurrent validity studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies tool, where appropriate. Biomechanical and experimental pain data support the plausibility of ULNTs. Evidence suggests that a positive ULNT should at least partially reproduce the patient's symptoms and that structural differentiation should change these symptoms. Data indicate that this definition of a positive ULNT is reliable when used clinically. Limited evidence suggests that the median nerve test, but not the radial nerve test, helps determine whether a patient has cervical radiculopathy. The median nerve test does not help diagnose carpal tunnel syndrome. These findings should be interpreted cautiously, because diagnostic accuracy might have been distorted by the investigators' definitions of a positive ULNT. Furthermore, patients with PNP who presented with increased nerve mechanosensitivity rather than conduction loss might have been incorrectly classified by electrophysiological reference standards as not having PNP. The only evidence for concurrent validity of the ulnar nerve test was a case study on cubital tunnel syndrome. We recommend that researchers develop more comprehensive reference standards for PNP to accurately assess the concurrent validity of ULNTs and continue investigating the predictive validity of ULNTs for prognosis or treatment response.
[Validity evidence of the Health-Related Quality of Life for Drug Abusers Test based on the Biaxial Model of Addiction].

PubMed

Lozano, Oscar M; Rojas, Antonio J; Pérez, Cristino; González-Sáiz, Francisco; Ballesta, Rosario; Izaskun, Bilbao

2008-05-01

The aim of this work is to show evidence of the validity of the Health-Related Quality of Life for Drug Abusers Test (HRQoLDA Test). This test was developed to measure specific HRQoL for drugs abusers, within the theoretical addiction framework of the biaxial model. The sample comprised 138 patients diagnosed with opiate drug dependence. In this study, the following constructs and variables of the biaxial model were measured: severity of dependence, physical health status, psychological adjustment and substance consumption. Results indicate that the HRQoLDA Test scores are related to dependency and consumption-related problems. Multiple regression analysis reveals that HRQoL can be predicted from drug dependence, physical health status and psychological adjustment. These results contribute empirical evidence of the theoretical relationships established between HRQoL and the biaxial model, and they support the interpretation of the HRQoLDA Test to measure HRQoL in drug abusers, thus providing a test to measure this specific construct in this population.
Reliability and concurrent and construct validity of the Strategies for Weight Management measure for adults.

PubMed

Kolodziejczyk, Julia K; Norman, Gregory J; Rock, Cheryl L; Arredondo, Elva M; Roesch, Scott C; Madanat, Hala; Patrick, Kevin

2016-01-01

This study evaluates the reliability and validity of the strategies for weight management (SWM) measure, a questionnaire that assesses weight management strategies for adults. The SWM includes 20 items that are categorized within the following subscales: (1) energy intake, (2) energy expenditure, (3) self-monitoring, and (4) self-regulation. Baseline and 6-month data were collected from 404 overweight/obese adults (mean age=22±3.8 years, 68% ethnic minority) enrolled in a randomized controlled trial aiming to reduce weight by improving diet and physical activity behaviours. Reliability and validity were assessed for each subscale separately. Cronbach's alpha was calculated to assess reliability. Concurrent, construct I (sensitivity to the study treatment condition), and construct II (relationship to the outcomes) validity were assessed using linear regressions with the following outcome measures: weight, self-reported diet, and weekly energy expenditure. All subscales showed strong internal consistency. The strength of the validity evidence depended on subscale and validity type. The strongest validity evidence was concurrent validity of the energy intake and energy expenditure subscales; construct I validity of the energy intake and self-monitoring subscales; and construct II validity of the energy intake, energy expenditure, and self-regulation subscales. Results indicate that the SWM can be used to assess weight management strategies among an ethnically diverse sample of adults as each subscale showed evidence of reliability and select types of validity. As validity is an accumulation of evidence over multiple studies, this study provides initial reliability and validity evidence in one population segment. Copyright © 2015 Asia Oceania Association for the Study of Obesity. Published by Elsevier Ltd. All rights reserved.
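The abstract above assesses subscale reliability with Cronbach's alpha. As a minimal sketch of how that coefficient is commonly computed, the Python snippet below applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to a small set of responses. The data are hypothetical and are not taken from the SWM study. The function name `cronbach_alpha` is likewise an illustrative choice.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for the items and for the totals.
    """
    k = len(rows[0])                      # number of items in the subscale
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together. When every item is a copy of the others, the formula gives exactly 1.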
Job Embeddedness Demonstrates Incremental Validity When Predicting Turnover Intentions for Australian University Employees

PubMed Central

Heritage, Brody; Gilbert, Jessica M.; Roberts, Lynne D.

2016-01-01

Job embeddedness is a construct that describes the manner in which employees can be enmeshed in their jobs, reducing their turnover intentions. Recent questions regarding the properties of quantitative job embeddedness measures, and their predictive utility, have been raised. Our study compared two competing reflective measures of job embeddedness, examining their convergent, criterion, and incremental validity, as a means of addressing these questions. Cross-sectional quantitative data from 246 Australian university employees (146 academic; 100 professional) was gathered. Our findings indicated that the two compared measures of job embeddedness were convergent when total scale scores were examined. Additionally, job embeddedness was capable of demonstrating criterion and incremental validity, predicting unique variance in turnover intention. However, this finding was not readily apparent with one of the compared job embeddedness measures, which demonstrated comparatively weaker evidence of validity. We discuss the theoretical and applied implications of these findings, noting that job embeddedness has a complementary place among established determinants of turnover intention. PMID:27199817
The role of evidence based medicine in neurotrauma.

PubMed

Honeybul, S; Ho, K M

2015-04-01

The introduction of evidence based medicine de-emphasised clinical experience and so-called "background information" and stressed the importance of evidence gained from clinical research when making clinical decisions. For many years randomised controlled trials have been seen to be the only way to advance clinical practice; however, applying this methodology in the context of severe trauma can be problematic. In addition, it is increasingly recognised that considerable clinical experience is required in order to critically evaluate the quality of the evidence and the validity of the conclusions as presented. A contemporary example is seen when considering the role of decompressive craniectomy in the management of neurotrauma. Although there is a considerable amount of evidence available attesting to the efficacy of the procedure, considerable clinical expertise is required in order to properly interpret the results of these studies and the implications for clinical practice. Given these limitations the time may have come for a redesign of the traditional pyramid of evidence, to a model that re-emphasises the importance of "background information" such as pathophysiology and acknowledges the role of clinical experience such that the evidence can be critically evaluated in its appropriate context and the subsequent implications for clinical practice be clearly and objectively defined. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.
Validating for Use and Interpretation: A Mixed Methods Contribution Illustrated

ERIC Educational Resources Information Center

Morell, Linda; Tan, Rachael Jin Bee

2009-01-01

Researchers in the areas of psychology and education strive to understand the intersections among validity, educational measurement, and cognitive theory. Guided by a mixed model conceptual framework, this study investigates how respondents' opinions inform the validation argument. Validity evidence for a science assessment was collected through…
Initial Evidence for the Reliability and Validity of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors at the Elementary Level

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy P.; Harris, Pamela J.; Menzies, Holly Mariah; Cox, Meredith; Lambert, Warren

2012-01-01

We report findings of an exploratory validation study of a revised instrument: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE). The SRSS-IE was modified to include seven additional items reflecting characteristics of internalizing behaviors, with proposed items generated from the current literature base, review of…
The Trojan Lifetime Champions Health Survey: development, validity, and reliability.

PubMed

Sorenson, Shawn C; Romano, Russell; Scholefield, Robin M; Schroeder, E Todd; Azen, Stanley P; Salem, George J

2015-04-01

Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Descriptive laboratory study. A large National Collegiate Athletic Association Division I university. A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. We found strong evidence of content validity, convergent construct validity with the Short-Form 12 Version 2 HRQL instrument, and feasibility of administration in an elite
Validity of the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire.

PubMed

João, Thaís Moreira São; Rodrigues, Roberta Cunha Matheus; Gallani, Maria Cecília Bueno Jayme; Miura, Cinthya Tamie Passos; Domingues, Gabriela de Barros Leite; Amireault, Steve; Godin, Gaston

2015-09-01

This study provides evidence of construct validity for the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire (GSLTPAQ), a 1-item instrument used among 236 participants referred for cardiopulmonary exercise testing. The Baecke Habitual Physical Activity Questionnaire (Baecke-HPA) was used to evaluate convergent and divergent validity. The self-reported measure of walking (QCAF) evaluated the convergent validity. Cardiorespiratory fitness assessed convergent validity by the Veterans Specific Activity Questionnaire (VSAQ), peak measured (VO2peak) and maximum predicted (VO2pred) oxygen uptake. Partial adjusted correlation coefficients between the GSLTPAQ, Baecke-HPA, QCAF, VO2pred and VSAQ provided evidence for convergent validity; while divergent validity was supported by the absence of correlations between the GSLTPAQ and the Occupational Physical Activity domain (Baecke-HPA). The GSLTPAQ presents level 3 of evidence of construct validity and may be useful to assess leisure-time physical activity among patients with cardiovascular disease and healthy individuals.
Validation of the Juhnke-Balkin Life Balance Inventory

ERIC Educational Resources Information Center

Davis, R. J.; Balkin, Richard S.; Juhnke, Gerald A.

2014-01-01

Life balance is an important construct within the counseling profession. A validation study utilizing exploratory factor analysis and multiple regression was conducted on the Juhnke-Balkin Life Balance Inventory. Results from the study serve as evidence of validity for an assessment instrument designed to measure life balance.
The predictive validity of selection for entry into postgraduate training in general practice: evidence from three longitudinal studies.

PubMed

Patterson, Fiona; Lievens, Filip; Kerrin, Máire; Munro, Neil; Irish, Bill

2013-11-01

The selection methodology for UK general practice is designed to accommodate several thousand applicants per year and targets six core attributes identified in a multi-method job-analysis study. To evaluate the predictive validity of selection methods for entry into postgraduate training, comprising a clinical problem-solving test, a situational judgement test, and a selection centre. A three-part longitudinal predictive validity study of selection into training for UK general practice. In sample 1, participants were junior doctors applying for training in general practice (n = 6824). In sample 2, participants were GP registrars 1 year into training (n = 196). In sample 3, participants were GP registrars sitting the licensing examination after 3 years, at the end of training (n = 2292). The outcome measures include: assessor ratings of performance in a selection centre comprising job simulation exercises (sample 1); supervisor ratings of trainee job performance 1 year into training (sample 2); and licensing examination results, including an applied knowledge examination and a 12-station clinical skills objective structured clinical examination (OSCE; sample 3). Performance ratings at selection predicted subsequent supervisor ratings of job performance 1 year later. Selection results also significantly predicted performance on both the clinical skills OSCE and applied knowledge examination for licensing at the end of training. In combination, these longitudinal findings provide good evidence of the predictive validity of the selection methods, and are the first reported for entry into postgraduate training. Results show that the best predictor of work performance and training outcomes is a combination of a clinical problem-solving test, a situational judgement test, and a selection centre. Implications for selection methods for all postgraduate specialties are considered.
Validity in work-based assessment: expanding our horizons.

PubMed

Govaerts, Marjan; van der Vleuten, Cees P M

2013-12-01

Although work-based assessments (WBA) may come closest to assessing habitual performance, their use for summative purposes is not undisputed. Most criticism of WBA stems from approaches to validity consistent with the quantitative psychometric framework. However, there is increasing research evidence that indicates that the assumptions underlying the predictive, deterministic framework of psychometrics may no longer hold. In this discussion paper we argue that meaningfulness and appropriateness of current validity evidence can be called into question and that we need alternative strategies to assessment and validity inquiry that build on current theories of learning and performance in complex and dynamic workplace settings. Drawing from research in various professional fields we outline key issues within the mechanisms of learning, competence and performance in the context of complex social environments and illustrate their relevance to WBA. In reviewing recent socio-cultural learning theory and research on performance and performance interpretations in work settings, we demonstrate that learning, competence (as inferred from performance) as well as performance interpretations are to be seen as inherently contextualised, and can only be understood 'in situ'. Assessment in the context of work settings may, therefore, be more usefully viewed as a socially situated interpretive act. We propose constructivist-interpretivist approaches towards WBA in order to capture and understand contextualised learning and performance in work settings. Theoretical assumptions underlying interpretivist assessment approaches call for a validity theory that provides the theoretical framework and conceptual tools to guide the validation process in the qualitative assessment inquiry. Basic principles of rigour specific to qualitative research have been established, and they can and should be used to determine validity in interpretivist assessment approaches. If used properly, these
Development of measurable indicators to enhance public health evidence-informed policy-making.

PubMed

Tudisca, Valentina; Valente, Adriana; Castellani, Tommaso; Stahl, Timo; Sandu, Petru; Dulf, Diana; Spitters, Hilde; Van de Goor, Ien; Radl-Karimi, Christina; Syed, Mohamed Ahmed; Loncarevic, Natasa; Lau, Cathrine Juel; Roelofs, Susan; Bertram, Maja; Edwards, Nancy; Aro, Arja R

2018-05-31

Ensuring health policies are informed by evidence still remains a challenge despite efforts devoted to this aim. Several tools and approaches aimed at fostering evidence-informed policy-making (EIPM) have been developed, yet there is a lack of availability of indicators specifically devoted to assess and support EIPM. The present study aims to overcome this by building a set of measurable indicators for EIPM intended to infer if and to what extent health-related policies are, or are expected to be, evidence-informed for the purposes of policy planning as well as formative and summative evaluations. The indicators for EIPM were developed and validated at international level by means of a two-round internet-based Delphi study conducted within the European project 'REsearch into POlicy to enhance Physical Activity' (REPOPA). A total of 82 researchers and policy-makers from the six European countries (Denmark, Finland, Italy, the Netherlands, Romania, the United Kingdom) involved in the project and international organisations were asked to evaluate the relevance and feasibility of an initial set of 23 indicators developed by REPOPA researchers on the basis of literature and knowledge gathered from the previous phases of the project, and to propose new indicators. The first Delphi round led to the validation of 14 initial indicators and to the development of 8 additional indicators based on panellists' suggestions; the second round led to the validation of a further 11 indicators, including 6 proposed by panellists, and to the rejection of 6 indicators. A total of 25 indicators were validated, covering EIPM issues related to human resources, documentation, participation and monitoring, and stressing different levels of knowledge exchange and involvement of researchers and other stakeholders in policy development and evaluation. The study overcame the lack of availability of indicators to assess if and to what extent policies are realised in an evidence-informed manner
Reliability and Validity of Instruments for Assessing Perinatal Depression in African Settings: Systematic Review and Meta-Analysis

PubMed Central

Tsai, Alexander C.; Scott, Jennifer A.; Hung, Kristin J.; Zhu, Jennifer Q.; Matthews, Lynn T.; Psaros, Christina; Tomlinson, Mark

2013-01-01

Background: A major barrier to improving perinatal mental health in Africa is the lack of locally validated tools for identifying probable cases of perinatal depression or for measuring changes in depression symptom severity. We systematically reviewed the evidence on the reliability and validity of instruments to assess perinatal depression in African settings. Methods and Findings: Of 1,027 records identified through searching 7 electronic databases, we reviewed 126 full-text reports. We included 25 unique studies, which were disseminated in 26 journal articles and 1 doctoral dissertation. These enrolled 12,544 women living in nine different North and sub-Saharan African countries. Only three studies (12%) used instruments developed specifically for use in a given cultural setting. Most studies provided evidence of criterion-related validity (20 [80%]) or reliability (15 [60%]), while fewer studies provided evidence of construct validity, content validity, or internal structure. The Edinburgh postnatal depression scale (EPDS), assessed in 16 studies (64%), was the most frequently used instrument in our sample. Ten studies estimated the internal consistency of the EPDS (median estimated coefficient alpha, 0.84; interquartile range, 0.71-0.87). For the 14 studies that estimated sensitivity and specificity for the EPDS, we constructed 2 x 2 tables for each cut-off score. Using a bivariate random-effects model, we estimated a pooled sensitivity of 0.94 (95% confidence interval [CI], 0.68-0.99) and a pooled specificity of 0.77 (95% CI, 0.59-0.88) at a cut-off score of ≥9, with higher cut-off scores yielding greater specificity at the cost of lower sensitivity. Conclusions: The EPDS can reliably and validly measure perinatal depression symptom severity or screen for probable postnatal depression in African countries, but more validation studies on other instruments are needed. In addition, more qualitative research is needed to adequately characterize local
Evidence for the Criterion Validity and Clinical Utility of the Pathological Narcissism Inventory

ERIC Educational Resources Information Center

Thomas, Katherine M.; Wright, Aidan G. C.; Lukowitsky, Mark R.; Donnellan, M. Brent; Hopwood, Christopher J.

2012-01-01

In this study, the authors evaluated aspects of criterion validity and clinical utility of the grandiosity and vulnerability components of the Pathological Narcissism Inventory (PNI) using two undergraduate samples (N = 299 and 500). Criterion validity was assessed by evaluating the correlations of narcissistic grandiosity and narcissistic…
Electronic self-monitoring of mood using IT platforms in adult patients with bipolar disorder: A systematic review of the validity and evidence.

PubMed

Faurholt-Jepsen, Maria; Munkholm, Klaus; Frost, Mads; Bardram, Jakob E; Kessing, Lars Vedel

2016-01-15

Various paper-based mood charting instruments are used in the monitoring of symptoms in bipolar disorder. During recent years an increasing number of electronic self-monitoring tools have been developed. The objectives of this systematic review were 1) to evaluate the validity of electronic self-monitoring tools as a method of evaluating mood compared to clinical rating scales for depression and mania and 2) to investigate the effect of electronic self-monitoring tools on clinically relevant outcomes in bipolar disorder. A systematic review of the scientific literature, reported according to the Preferred Reporting items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines was conducted. MEDLINE, Embase, PsycINFO and The Cochrane Library were searched and supplemented by hand search of reference lists. Databases were searched for 1) studies on electronic self-monitoring tools in patients with bipolar disorder reporting on validity of electronically self-reported mood ratings compared to clinical rating scales for depression and mania and 2) randomized controlled trials (RCT) evaluating electronic mood self-monitoring tools in patients with bipolar disorder. A total of 13 published articles were included. Seven articles were RCTs and six were longitudinal studies. Electronic self-monitoring of mood was considered valid compared to clinical rating scales for depression in six out of six studies, and in two out of seven studies compared to clinical rating scales for mania. The included RCTs primarily investigated the effect of heterogeneous electronically delivered interventions; none of the RCTs investigated the sole effect of electronic mood self-monitoring tools. Methodological issues with risk of bias at different levels limited the evidence in the majority of studies. Electronic self-monitoring of mood in depression appears to be a valid measure of mood in contrast to self-monitoring of mood in mania. There are yet few studies on the effect of electronic
Why the Evidence-Based Paradigm in Early Childhood Education and Care Is Anything but Evident

ERIC Educational Resources Information Center

Vandenbroeck, Michel; Roets, Griet; Roose, Rudi

2012-01-01

Praxeological research is a necessary contribution to the research field in early childhood education and care, which is currently dominated by an evidence-based paradigm that tends to consider the measurement of predefined outcomes as the most valid form of research. We analyse the history of the evidence-based paradigm in the field of medicine…

Construct validity of the MMPI-2 College Maladjustment (Mt) Scale.

PubMed

Barthlow, Deanna L; Graham, John R; Ben-Porath, Yossef S; McNulty, John L

2004-09-01

The construct validity of the MMPI-2 (Minnesota Multiphasic Personality Inventory-2) College Maladjustment (Mt) Scale was examined using 376 student clients at a university psychological clinic. A principal components analysis and correlations of Mt scale scores with clients' and therapists' ratings of symptoms and functioning showed that the Mt scale identifies the presence of maladjustment as defined in terms of depressive and anxious symptoms. There is no evidence to show that the scale is specific to college students or that it is sensitive to severe psychological disturbance. The Mt scale does not inform the clinician as to why a person is distressed. In addition, there is no evidence from this study to suggest the superiority of the Mt scale over other MMPI-2 maladjustment measures. Therapists should use the entire MMPI-2 profile, not just the Mt scale, to gain the most comprehensive and specific understanding of clients.
Synthesizing Quantitative Evidence for Evidence-based Nursing: Systematic Review.

PubMed

Oh, Eui Geum

2016-06-01

As evidence-based practice has become an important issue in healthcare settings, the educational needs for knowledge and skills for the generation and utilization of healthcare evidence are increasing. Systematic review (SR), a way of evidence generation, is a synthesis of primary scientific evidence, which summarizes the best evidence on a specific clinical question using a transparent, a priori protocol driven approach. SR methodology requires a critical appraisal of primary studies, data extraction in a reliable and repeatable way, and examination for validity of the results. SRs are considered hierarchically as the highest form of evidence as they are a systematic search, identification, and summarization of the available evidence to answer a focused clinical question with particular attention to the methodological quality of studies or the credibility of opinion and text. The purpose of this paper is to introduce an overview of the fundamental knowledge, principles and processes in SR. The focus of this paper is on SR especially for the synthesis of quantitative data from primary research studies that examine the effectiveness of healthcare interventions. To activate evidence-based nursing care in various healthcare settings, the best available scientific evidence is an essential component. This paper will include some examples to promote understanding. Copyright © 2016. Published by Elsevier B.V.
Financial Decision-making Abilities and Financial Exploitation in Older African Americans: Preliminary Validity Evidence for the Lichtenberg Financial Decision Rating Scale (LFDRS)

PubMed Central

Ficker, Lisa J.; Rahman-Filipiak, Annalise

2015-01-01

This study examines preliminary evidence for the Lichtenberg Financial Decision Rating Scale (LFDRS), a new person-centered approach to assessing capacity to make financial decisions, and its relationship to self-reported cases of financial exploitation in 69 older African Americans. More than one third of individuals reporting financial exploitation also had questionable decisional abilities. Overall, decisional ability score and current decision total were significantly associated with cognitive screening test and financial ability scores, demonstrating good criterion validity. Financially exploited individuals, and non-exploited individuals, showed mean group differences on the Mini Mental State Exam, Financial Situational Awareness, Psychological Vulnerability, Current Decisional Ability, and Susceptibility to undue influence subscales, and Total Lichtenberg Financial Decision Rating Scale Score. Study findings suggest that impaired decisional abilities may render older adults more vulnerable to financial exploitation, and that the LFDRS is a valid tool for measuring both decisional abilities and financial exploitation. PMID:26285038
Utilization of Titanium Particle Impact Location to Validate a 3D Multicomponent Model for Cold Spray Additive Manufacturing

NASA Astrophysics Data System (ADS)

Faizan-Ur-Rab, M.; Zahiri, S. H.; King, P. C.; Busch, C.; Masood, S. H.; Jahedi, M.; Nagarajah, R.; Gulizia, S.

2017-12-01

Cold spray is a solid-state rapid deposition technology in which metal powder is accelerated to supersonic speeds within a de Laval nozzle and then impacts onto the surface of a substrate. It is possible for cold spray to build thick structures, thus providing an opportunity for melt-less additive manufacturing. Image analysis of particle impact location and focused ion beam dissection of individual particles were utilized to validate a 3D multicomponent model of cold spray. Impact locations obtained using the 3D model were found to be in close agreement with the empirical data. Moreover, the 3D model revealed the particles' velocity and temperature just before impact—parameters which are paramount for developing a full understanding of the deposition process. Further, it was found that the temperature and velocity variations in large-size particles before impact were far less than for the small-size particles. Therefore, an optimal particle temperature and velocity were identified, which gave the highest deformation after impact. The trajectory of the particles from the injection point to the moment of deposition in relation to propellant gas is visualized. This detailed information is expected to assist with the optimization of the deposition process, contributing to improved mechanical properties for additively manufactured cold spray titanium parts.
Measuring Long-Distance Romantic Relationships: A Validity Study

ERIC Educational Resources Information Center

Pistole, M. Carole; Roberts, Amber

2011-01-01

This study investigated aspects of construct validity for the scores of a new long-distance romantic relationship measure. A single-factor structure of the long-distance romantic relationship index emerged, with convergent and discriminant evidence of external validity, high internal consistency reliability, and applied utility of the scores.…
Validating Grammaticality Judgment Tests: Evidence from Two New Psycholinguistic Measures

ERIC Educational Resources Information Center

Vafaee, Payman; Suzuki, Yuichi; Kachinske, Ilina

2017-01-01

Several previous factor-analytic studies on the construct validity of grammaticality judgment tests (GJTs) concluded that untimed GJTs measure explicit knowledge (EK) and timed GJTs measure implicit knowledge (IK) (Bowles, 2011; R. Ellis, 2005; R. Ellis & Loewen, 2007). It has also been shown that, irrespective of the time condition chosen,…
Validating the cross-cultural factor structure and invariance property of the Insomnia Severity Index: evidence based on ordinal EFA and CFA.

PubMed

Chen, Po-Yi; Yang, Chien-Ming; Morin, Charles M

2015-05-01

The purpose of this study is to examine the factor structure of the Insomnia Severity Index (ISI) across samples recruited from different countries. We tried to identify the most appropriate factor model for the ISI and further examined the measurement invariance property of the ISI across samples from different countries. Our analyses included one data set collected from a Taiwanese sample and two data sets obtained from samples in Hong Kong and Canada. The data set collected in Taiwan was analyzed with ordinal exploratory factor analysis (EFA) to obtain the appropriate factor model for the ISI. After that, we applied a series of confirmatory factor analyses (CFAs), a special case of the structural equation model (SEM) that concerns the parameters in the measurement model, to the data collected in Canada and Hong Kong. The purposes of these CFAs were to cross-validate the result obtained from the EFA and to further examine the cross-cultural measurement invariance of the ISI. The three-factor model outperforms other models in terms of global fit indices in Taiwan's population. Its external validity is also supported by confirmatory factor analyses. Furthermore, the measurement invariance analyses show that the strong invariance property between the samples from different cultures holds, providing evidence that the ISI results obtained in different cultures are comparable. The factorial validity of the ISI is stable in different populations. More importantly, its invariance property across cultures suggests that the ISI is a valid measure of the insomnia severity construct across countries. Copyright © 2014 Elsevier B.V. All rights reserved.
Therapeutic Misconception in Research Subjects: Development and Validation of a Measure

PubMed Central

Appelbaum, Paul S.; Anatchkova, Milena; Albert, Karen; Dunn, Laura B.; Lidz, Charles W.

2013-01-01

... “gold standard” clinical interview is modest, although similar to other instruments based on self-report assessing states of mind rather than discrete symptoms. Thus, although the scale can offer evidence of which subjects are at risk for distortions in their decisions and to what degree, it will not allow researchers to conclude definitively that TM is present in a given subject. Conclusions The development of a reliable and valid TM scale, even with modest predictive power, should permit investigators in clinical trials to identify subjects with tendencies to misinterpret the nature of the situation and to provide additional information to them. It should also stimulate research on how best to decrease TM and facilitate meaningful informed consent to clinical research. PMID:22942217
Clinical Evidence: a useful tool for promoting evidence-based practice?

PubMed

Formoso, Giulio; Moja, Lorenzo; Nonino, Francesco; Dri, Pietro; Addis, Antonio; Martini, Nello; Liberati, Alessandro

2003-12-23

Research has shown that many healthcare professionals have problems with guidelines, as they would prefer to be given all information relevant to decision-making rather than being told what they should do. This study assesses doctors' judgement of the validity, relevance, clarity and usability of the Italian translation of Clinical Evidence (CE) after its free distribution launched by the Italian Ministry of Health. Opinions were elicited using a standardised questionnaire delivered either by mail or during educational or professional meetings. Twenty percent of doctors (n = 1350) participated in the study. Most of them found CE's content valid, useful and relevant for their clinical practice, and said CE can foster communication among clinicians, particularly between GPs and specialists. Hospital doctors (63%) more often than GPs (48%) read the detailed presentation of individual chapters. Twenty-nine percent said CE brought changes in their clinical practice. Doctors appreciated CE's nature as an evidence-based information compendium and would not have preferred a collection of practice guidelines. Overall, the pilot initiative launched by the Italian Ministry of Health seems to have been well received and to support the subsequent decision to make the Italian edition of Clinical Evidence Concise available to all doctors practising in the country. Local implementation initiatives are warranted to encourage doctors' use of CE.
Focused Evidence Review: Psychometric Properties of Patient-Reported Outcome Measures for Chronic Musculoskeletal Pain.

PubMed

Goldsmith, Elizabeth S; Taylor, Brent C; Greer, Nancy; Murdoch, Maureen; MacDonald, Roderick; McKenzie, Lauren; Rosebush, Christina E; Wilt, Timothy J

2018-05-01

Developing successful interventions for chronic musculoskeletal pain requires valid, responsive, and reliable outcome measures. The Minneapolis VA Evidence-based Synthesis Program completed a focused evidence review on key psychometric properties of 17 self-report measures of pain severity and pain-related functional impairment suitable for clinical research on chronic musculoskeletal pain. Pain experts of the VA Pain Measurement Outcomes Workgroup identified 17 pain measures to undergo systematic review. In addition to a MEDLINE search on these 17 measures (1/2000-1/2017), we hand-searched (without publication date limits) the reference lists of all included studies, prior systematic reviews, and, when available, Web sites dedicated to each measure (PROSPERO registration CRD42017056610). Our primary outcome was the measure's minimal important difference (MID). Secondary outcomes included responsiveness, validity, and test-retest reliability. Outcomes were synthesized through evidence mapping and qualitative comparison. Of 1635 abstracts identified, 331 articles underwent full-text review, and 43 met inclusion criteria. Five measures (Oswestry Disability Index (ODI), Roland-Morris Disability Questionnaire (RMDQ), SF-36 Bodily Pain Scale (SF-36 BPS), Numeric Rating Scale (NRS), and Visual Analog Scale (VAS)) had data reported on MID, responsiveness, validity, and test-retest reliability. Seven measures had data reported on three of the four psychometric outcomes. Eight measures had reported MIDs, though estimation methods differed substantially and often were not clinically anchored. In this focused evidence review, the most evidence on key psychometric properties in chronic musculoskeletal pain populations was found for the ODI, RMDQ, SF-36 BPS, NRS, and VAS. Key limitations in the field include substantial variation in methods of estimating psychometric properties, defining chronic musculoskeletal pain, and reporting patient demographics. Registered in the PROSPERO
The cross-cultural validity of posttraumatic stress disorder: implications for DSM-5.

PubMed

Hinton, Devon E; Lewis-Fernández, Roberto

2011-09-01

There is considerable debate about the cross-cultural applicability of the posttraumatic stress disorder (PTSD) category as currently specified. Concerns include the possible status of PTSD as a Western culture-bound disorder and the validity of individual items and criteria thresholds. This review examines various types of cross-cultural validity of the PTSD criteria as defined in DSM-IV-TR, and presents options and preliminary recommendations to be considered for DSM-5. Searches were conducted of the mental health literature, particularly since 1994, regarding cultural-, race-, or ethnicity-related factors that might limit the universal applicability of the diagnostic criteria of PTSD in DSM-IV-TR and the possible criteria for DSM-5. Substantial evidence of the cross-cultural validity of PTSD was found. However, evidence of cross-cultural variability in certain areas suggests the need for further research: the relative salience of avoidance/numbing symptoms, the role of the interpretation of trauma-caused symptoms in shaping symptomatology, and the prevalence of somatic symptoms. This review also indicates the need to modify certain criteria, such as the items on distressing dreams and on foreshortened future, to increase their cross-cultural applicability. Text additions are suggested to increase the applicability of the manual across cultural contexts: specifying that cultural syndromes, such as those indicated in the DSM-IV-TR Glossary, may be a prominent part of the trauma response in certain cultures, and that those syndromes may influence PTSD symptom salience and comorbidity. The DSM-IV-TR PTSD category demonstrates various types of validity. Criteria modification and textual clarifications are suggested to further improve its cross-cultural applicability. © 2010 Wiley-Liss, Inc.
Test Takers and the Validity of Score Interpretations

ERIC Educational Resources Information Center

Kopriva, Rebecca J.; Thurlow, Martha L.; Perie, Marianne; Lazarus, Sheryl S.; Clark, Amy

2016-01-01

This article argues that test takers are as integral to determining validity of test scores as defining target content and conditioning inferences on test use. A principled sustained attention to how students interact with assessment opportunities is essential, as is a principled sustained evaluation of evidence confirming the validity or calling…
Traditional Masculinity and Femininity: Validation of a New Scale Assessing Gender Roles

PubMed Central

Kachel, Sven; Steffens, Melanie C.; Niedlich, Claudia

2016-01-01

Gender stereotype theory suggests that men are generally perceived as more masculine than women, whereas women are generally perceived as more feminine than men. Several scales have been developed to measure fundamental aspects of gender stereotypes (e.g., agency and communion, competence and warmth, or instrumentality and expressivity). Although omitted in later versions, Bem's original Sex Role Inventory included the items “masculine” and “feminine” in addition to more specific gender-stereotypical attributes. We argue that it is useful to be able to measure these two core concepts in a reliable, valid, and parsimonious way. We introduce a new and brief scale, the Traditional Masculinity-Femininity (TMF) scale, designed to assess central facets of self-ascribed masculinity-femininity. Studies 1–2 used known-groups approaches (participants differing in gender and sexual orientation) to validate the scale and provide evidence of its convergent validity. As expected, the TMF reliably measured a one-dimensional masculinity-femininity construct. Moreover, the TMF correlated moderately with other gender-related measures. Demonstrating incremental validity, the TMF predicted gender and sexual orientation better than established adjective-based measures. Furthermore, the TMF was connected to criterion characteristics, such as judgments as straight by laypersons for the whole sample, voice pitch characteristics for the female subsample, and contact with gay men for the male subsample, and outperformed other gender-related scales. Taken together, as long as gender differences continue to exist, we suggest that the TMF provides a valuable methodological addition for research into gender stereotypes. PMID:27458394
A hypothesis-driven physical examination learning and assessment procedure for medical students: initial validity evidence.

PubMed

Yudkowsky, Rachel; Otaki, Junji; Lowenstein, Tali; Riddle, Janet; Nishigori, Hiroshi; Bordage, Georges

2009-08-01

Diagnostic accuracy is maximised by having clinical signs and diagnostic hypotheses in mind during the physical examination (PE). This diagnostic reasoning approach contrasts with the rote, hypothesis-free screening PE learned by many medical students. A hypothesis-driven PE (HDPE) learning and assessment procedure was developed to provide targeted practice and assessment in anticipating, eliciting and interpreting critical aspects of the PE in the context of diagnostic challenges. This study was designed to obtain initial content validity evidence, performance and reliability estimates, and impact data for the HDPE procedure. Nineteen clinical scenarios were developed, covering 160 PE manoeuvres. A total of 66 Year 3 medical students prepared for and encountered three clinical scenarios during required formative assessments. For each case, students listed anticipated positive PE findings for two plausible diagnoses before examining the patient; examined a standardised patient (SP) simulating one of the diagnoses; received immediate feedback from the SP, and documented their findings and working diagnosis. The same students later encountered some of the scenarios during their Year 4 clinical skills examination. On average, Year 3 students anticipated 65% of the positive findings, correctly performed 88% of the PE manoeuvres and documented 61% of the findings. Year 4 students anticipated and elicited fewer findings overall, but achieved proportionally more discriminating findings, thereby more efficiently achieving a diagnostic accuracy equivalent to that of students in Year 3. Year 4 students performed better on cases on which they had received feedback as Year 3 students. Twelve cases would provide a reliability of 0.80, based on discriminating checklist items only. The HDPE provided medical students with a thoughtful, deliberate approach to learning and assessing PE skills in a valid and reliable manner.
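A projection such as "twelve cases would provide a reliability of 0.80" can be illustrated with the Spearman-Brown prophecy formula. This is only a sketch under assumptions: the abstract does not say how the projection was made (a generalizability-theory decision study is also plausible), the function names are hypothetical, and the single-case reliability of 0.25 is back-solved so that 12 cases give 0.80; it is not a value reported in the abstract.

```python
def spearman_brown(rho_single: float, k: float) -> float:
    """Projected reliability of a test built from k parallel cases."""
    return k * rho_single / (1 + (k - 1) * rho_single)


def cases_needed(rho_single: float, rho_target: float) -> float:
    """Number of parallel cases needed to reach rho_target."""
    return rho_target * (1 - rho_single) / (rho_single * (1 - rho_target))


# Hypothetical single-case reliability, chosen so that 12 cases give 0.80;
# this figure does not come from the abstract.
rho_one_case = 0.25
print(round(spearman_brown(rho_one_case, 12), 2))
print(round(cases_needed(rho_one_case, 0.80), 1))
```

The formula assumes the cases are parallel (equal difficulty and equal error variance), which real clinical scenarios rarely satisfy exactly, so the result is best read as a rough planning figure.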
Validation of sterilizing grade filtration.

PubMed

Jornitz, M W; Meltzer, T H

2003-01-01

Validation considerations for sterilizing grade filters, namely 0.2 micron filters, changed when the FDA voiced concerns about the validity of Bacterial Challenge tests performed in the past. Such validation exercises are nowadays considered to be filter qualification. Filter validation requires more thorough analysis, especially Bacterial Challenge testing with the actual drug product under process conditions. To do so, viability testing is necessary to determine the Bacterial Challenge test methodology. In addition to these two compulsory tests, other evaluations such as extractables, adsorption and chemical compatibility tests should be considered. PDA Technical Report # 26, Sterilizing Filtration of Liquids, describes all parameters and aspects required for the comprehensive validation of filters. The report is a most helpful tool for validation of liquid filters used in the biopharmaceutical industry. It sets the cornerstones of validation requirements and other filtration considerations.
EOS Terra Validation Program

NASA Technical Reports Server (NTRS)

Starr, David

1999-01-01

The EOS Terra mission will be launched in July 1999. This mission has great relevance to the atmospheric radiation community and global change issues. Terra instruments include ASTER, CERES, MISR, MODIS and MOPITT. In addition to the fundamental radiance data sets, numerous global science data products will be generated, including various Earth radiation budget, cloud and aerosol parameters, as well as land surface, terrestrial ecology, ocean color, and atmospheric chemistry parameters. Significant investments have been made in on-board calibration to ensure the quality of the radiance observations. A key component of the Terra mission is the validation of the science data products. This is essential for a mission focused on global change issues and the underlying processes. The Terra algorithms have been subject to extensive pre-launch testing with field data whenever possible. Intensive efforts will be made to validate the Terra data products after launch. These include validation of instrument calibration (vicarious calibration) experiments, instrument and cross-platform comparisons, routine collection of high quality correlative data from ground-based networks, such as AERONET, and intensive sites, such as the SGP ARM site, as well as a variety of field experiments, cruises, etc. Airborne simulator instruments have been developed for the field experiment and underflight activities including the MODIS Airborne Simulator (MAS), AirMISR, MASTER (MODIS-ASTER), and MOPITT-A. All are integrated on the NASA ER-2, though low altitude platforms are more typically used for MASTER. MATR is an additional sensor used for MOPITT algorithm development and validation. The intensive validation activities planned for the first year of the Terra mission will be described with emphasis on derived geophysical parameters of most relevance to the atmospheric radiation community. Detailed information about the EOS Terra validation Program can be found on the EOS Validation program
Evaluating the Content Validity of Multistage-Adaptive Tests

ERIC Educational Resources Information Center

Crotts, Katrina; Sireci, Stephen G.; Zenisky, April

2012-01-01

Validity evidence based on test content is important for educational tests to demonstrate the degree to which they fulfill their purposes. Most content validity studies involve subject matter experts (SMEs) who rate items that comprise a test form. In computerized-adaptive testing, examinees take different sets of items and test "forms"…
The predictive validity of selection for entry into postgraduate training in general practice: evidence from three longitudinal studies

PubMed Central

Patterson, Fiona; Lievens, Filip; Kerrin, Máire; Munro, Neil; Irish, Bill

2013-01-01

Background The selection methodology for UK general practice is designed to accommodate several thousand applicants per year and targets six core attributes identified in a multi-method job-analysis study. Aim To evaluate the predictive validity of selection methods for entry into postgraduate training, comprising a clinical problem-solving test, a situational judgement test, and a selection centre. Design and setting A three-part longitudinal predictive validity study of selection into training for UK general practice. Method In sample 1, participants were junior doctors applying for training in general practice (n = 6824). In sample 2, participants were GP registrars 1 year into training (n = 196). In sample 3, participants were GP registrars sitting the licensing examination after 3 years, at the end of training (n = 2292). The outcome measures include: assessor ratings of performance in a selection centre comprising job simulation exercises (sample 1); supervisor ratings of trainee job performance 1 year into training (sample 2); and licensing examination results, including an applied knowledge examination and a 12-station clinical skills objective structured clinical examination (OSCE; sample 3). Results Performance ratings at selection predicted subsequent supervisor ratings of job performance 1 year later. Selection results also significantly predicted performance on both the clinical skills OSCE and applied knowledge examination for licensing at the end of training. Conclusion In combination, these longitudinal findings provide good evidence of the predictive validity of the selection methods, and are the first reported for entry into postgraduate training. Results show that the best predictor of work performance and training outcomes is a combination of a clinical problem-solving test, a situational judgement test, and a selection centre. Implications for selection methods for all postgraduate specialties are considered. PMID:24267856
"La Clave Profesional": Validation of a Vocational Guidance Instrument

ERIC Educational Resources Information Center

Mudarra, Maria J.; Lázaro Martínez, Ángel

2014-01-01

Introduction: The current study demonstrates empirical and cultural validity of "La Clave Profesional" (Spanish adaptation of Career Key, Jones's test based on Holland's RIASEC model). The process of providing validity evidence also includes a reflection on personal and career development and examines the relationships between RIASEC…

Validity: Applying Current Concepts and Standards to Gynecologic Surgery Performance Assessments

ERIC Educational Resources Information Center

LeClaire, Edgar L.; Nihira, Mikio A.; Hardré, Patricia L.

2015-01-01

Validity is critical for meaningful assessment of surgical competency. According to the Standards for Educational and Psychological Testing, validation involves the integration of data from well-defined classifications of evidence. In the authoritative framework, data from all classifications support construct validity claims. The two aims of this…
Strengthen forensic entomology in court--the need for data exploration and the validation of a generalised additive mixed model.

PubMed

Baqué, Michèle; Amendt, Jens

2013-01-01

Developmental data of juvenile blow flies (Diptera: Calliphoridae) are typically used to calculate the age of immature stages found on or around a corpse and thus to estimate a minimum post-mortem interval (PMI(min)). However, many of those data sets don't take into account that immature blow flies grow in a non-linear fashion. Linear models do not provide sufficiently reliable age estimates and may even lead to an erroneous determination of the PMI(min). According to the Daubert standard and the need for improvements in forensic science, new statistical tools like smoothing methods and mixed models allow the modelling of non-linear relationships and expand the field of statistical analyses. The present study introduces the background and application of these statistical techniques by analysing a model which describes the development of the forensically important blow fly Calliphora vicina at different temperatures. The comparison of three statistical methods (linear regression, generalised additive modelling and generalised additive mixed modelling) clearly demonstrates that only the latter provided regression parameters that reflect the data adequately. We focus explicitly on both the exploration of the data--to assure their quality and to show the importance of checking it carefully prior to conducting the statistical tests--and the validation of the resulting models. Hence, we present a common method for evaluating and testing forensic entomological data sets, using generalised additive mixed models for the first time.
Initial Evidence for the Reliability and Validity of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors at the Middle School Level

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Carter, Erik W.; Lambert, Warren E.; Jenkins, Abbie B.

2013-01-01

We reported findings of an exploratory validation study of a revised universal screening instrument: the Student Risk Screening Scale--Internalizing and Externalizing (SRSS-IE) for use with middle school students. Tested initially for use with elementary-age students, the SRSS-IE was adapted to include seven additional items reflecting…
46 CFR 355.5 - Additional material.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 46 Shipping 8 2011-10-01 2011-10-01 false Additional material. 355.5 Section 355.5 Shipping... STATES CITIZENSHIP § 355.5 Additional material. If additional material is determined to be essential to clarify or support the evidence of U.S. citizenship, such material shall be furnished by the...
46 CFR 355.5 - Additional material.

Code of Federal Regulations, 2014 CFR

2014-10-01

... 46 Shipping 8 2014-10-01 2014-10-01 false Additional material. 355.5 Section 355.5 Shipping... STATES CITIZENSHIP § 355.5 Additional material. If additional material is determined to be essential to clarify or support the evidence of U.S. citizenship, such material shall be furnished by the...
46 CFR 355.5 - Additional material.

Code of Federal Regulations, 2013 CFR

2013-10-01

... 46 Shipping 8 2013-10-01 2013-10-01 false Additional material. 355.5 Section 355.5 Shipping... STATES CITIZENSHIP § 355.5 Additional material. If additional material is determined to be essential to clarify or support the evidence of U.S. citizenship, such material shall be furnished by the...
The development and validation of the Incivility from Customers Scale.

PubMed

Wilson, Nicole L; Holmvall, Camilla M

2013-07-01

Scant research has examined customers as sources of workplace incivility, despite evidence suggesting that mistreatment is more common from organizational outsiders, including customers, than from organizational members (Grandey, Kern, & Frone, 2007; Schat & Kelloway, 2005). As an important step in extending the literature on customer incivility, we conducted two studies to develop and validate a measure of this construct. Study 1 used focus groups of retail and restaurant employees (n = 30) to elicit a list of uncivil customer behaviors, based on which we wrote initial scale items. Study 2 used a correlational survey design (n = 439) to pare down the number of scale items to 10 and to garner reliability and validity evidence for the scale. Exploratory and confirmatory factor analyses show that the scale is unidimensional and distinguishable from measures of the related, but distinct, constructs of interpersonal justice and psychological aggression from customers. Reliability analyses show that the scale is internally consistent. Significant correlations between the scale and individuals' job satisfaction, turnover intentions, and general and job-specific psychological strain provide evidence of criterion-related validity. Hierarchical regression analyses show that the scale significantly predicts three of four organizational and personal strain outcomes over and above a workplace incivility measure adapted for customer incivility, providing some evidence of incremental validity. Limitations and future research directions are discussed. PsycINFO Database Record (c) 2013 APA, all rights reserved.
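Internal consistency of a multi-item scale is commonly summarized with Cronbach's alpha. The abstract above does not say which statistic was used, so the following is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the ratings are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 1-5 ratings from five respondents on a three-item scale.
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
print(round(cronbach_alpha(ratings), 3))
```

Values close to 1 indicate that the items vary together; alpha also rises with the number of items, so it is usually reported alongside factor-analytic evidence of dimensionality rather than in place of it.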
Development of the Hand Assessment for Infants: evidence of internal scale validity.

PubMed

Krumlinde-Sundholm, Lena; Ek, Linda; Sicola, Elisa; Sjöstrand, Lena; Guzzetta, Andrea; Sgandurra, Giuseppina; Cioni, Giovanni; Eliasson, Ann-Christin

2017-12-01

The aim of this study was to develop a descriptive and evaluative assessment of upper limb function for infants aged 3 to 12 months and to investigate its internal scale validity for use with infants at risk of unilateral cerebral palsy. The concepts of the test items and scoring criteria were developed. Internal scale validity and aspects of reliability were investigated on the basis of 156 assessments of infants at 3 to 12 months corrected age (mean 7.2mo, SD 2.5) with signs of asymmetric hand use. Rasch measurement model analysis and non-parametric statistics were used. The new test, the Hand Assessment for Infants (HAI), consists of 12 unimanual and five bimanual items, each scored on a 3-point rating scale. It demonstrated a unidimensional construct and good fit to the Rasch model requirements. The excellent person reliability enabled person separation to six significant ability strata. The HAI produced an interval-level measure of bilateral hand use as well as unimanual scores of each hand, allowing a quantification of possible asymmetry expressed as an asymmetry index. The HAI can be considered a valid assessment tool for measuring bilateral hand use and quantifying side difference between hands among infants at risk of developing unilateral cerebral palsy. The Hand Assessment for Infants (HAI) measures the use of both hands and quantifies a possible asymmetry of hand use. HAI is valid for infants at 3 to 12 months corrected age at risk of unilateral cerebral palsy. © 2017 Mac Keith Press.
The Anomalous Sentences Repetition Test: Replication and Validation Study.

ERIC Educational Resources Information Center

Weeks, David J.

1986-01-01

Presents a brief clinical test, derived from earlier neuropsychological instruments, with evidence for its reliability, interscorer agreement, and validity. The latter is based upon correlations with both CAT scan measures of cortical atrophy and ventricular enlargement, as well as correlations with seven other previously validated cognitive…
Applying Kane's Validity Framework to a Simulation Based Assessment of Clinical Competence

ERIC Educational Resources Information Center

Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud

2018-01-01

Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…
Contemporary Test Validity in Theory and Practice: A Primer for Discipline-Based Education Researchers.

PubMed

Reeves, Todd D; Marbach-Ad, Gili

2016-01-01

Most discipline-based education researchers (DBERs) were formally trained in the methods of scientific disciplines such as biology, chemistry, and physics, rather than social science disciplines such as psychology and education. As a result, DBERs may have never taken specific courses in the social science research methodology--either quantitative or qualitative--on which their scholarship often relies so heavily. One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process. The instructional piece explores the concepts of test validity, the validation process, validity evidence, and key threats to validity. The essay also includes an in-depth example of a validity argument and validation approach for a test of student argument analysis. In addition to DBERs, this essay should benefit practitioners (e.g., lab directors, faculty members) in the development, evaluation, and/or selection of instruments for their work assessing students or evaluating pedagogical innovations. © 2016 T. D. Reeves and G. Marbach-Ad. CBE—Life Sciences Education © 2016 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
Validity of the Student Risk Screening Scale: Evidence of Predictive Validity in a Diverse, Suburban Elementary Setting

ERIC Educational Resources Information Center

Menzies, Holly M.; Lane, Kathleen Lynne

2012-01-01

In this study the authors examined the psychometric properties of the "Student Risk Screening Scale" (SRSS), including predictive validity in terms of student outcomes in behavioral and academic domains. The school, a diverse, suburban school in Southern California, administered the SRSS at three time points as part of regular school…
[Validation of a Japanese version of the Experience in Close Relationship-Relationship Structure].

PubMed

Komura, Kentaro; Murakami, Tatsuya; Toda, Koji

2016-08-01

The purpose of this study was to translate the Experience of Close Relationship-Relationship Structure (ECR-RS) and evaluate its validity. In study 1 (N = 982), evidence based on internal structure (factor structure, internal consistency, and correlation among sub-scales) and evidence based on relations to other variables (depression, reassurance seeking and self-esteem) were confirmed. In study 2 (N = 563), evidence based on internal structure was reconfirmed, and evidence based on relations to other variables (IWMS, RQ, and ECR-GO) was confirmed. In study 3 (N = 342), evidence based on internal structure (test-retest reliability) was confirmed. Based on these results, we concluded that the ECR-RS was valid for measuring adult attachment style.
Use of the Environment and Policy Evaluation and Observation as a Self-Report Instrument (EPAO-SR) to measure nutrition and physical activity environments in child care settings: validity and reliability evidence.

PubMed

Ward, Dianne S; Mazzucca, Stephanie; McWilliams, Christina; Hales, Derek

2015-09-26

Early care and education (ECE) centers are important settings influencing young children's diet and physical activity (PA) behaviors. To better understand their impact on diet and PA behaviors as well as to evaluate public health programs aimed at ECE settings, we developed and tested the Environment and Policy Assessment and Observation - Self-Report (EPAO-SR), a self-administered version of the previously validated, researcher-administered EPAO. Development of the EPAO-SR instrument included modification of items from the EPAO, community advisory group and expert review, and cognitive interviews with center directors and classroom teachers. Reliability and validity data were collected across 4 days in 3-5 year old classrooms in 50 ECE centers in North Carolina. Center teachers and directors completed relevant portions of the EPAO-SR on multiple days according to a standardized protocol, and trained data collectors completed the EPAO for 4 days in the centers. Reliability and validity statistics calculated included percent agreement, kappa, correlation coefficients, coefficients of variation, deviations, mean differences, and intraclass correlation coefficients (ICC), depending on the response option of the item. Data demonstrated a range of reliability and validity evidence for the EPAO-SR instrument. Reporting from directors and classroom teachers was consistent and similar to the observational data. Items that produced strongest reliability and validity estimates included beverages served, outside time, and physical activity equipment, while items such as whole grains served and amount of teacher-led PA had lower reliability (observation and self-report) and validity estimates. To overcome lower reliability and validity estimates, some items need administration on multiple days. This study demonstrated appropriate reliability and validity evidence for use of the EPAO-SR in the field. The self-administered EPAO-SR is an advancement of the measurement of ECE
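As an illustration of two of the agreement statistics named in the record above (percent agreement and kappa), the following is a minimal sketch. It assumes two raters, such as a self-report and an observer, scoring the same items on a categorical response option. The function names and the example ratings are hypothetical and are not taken from the EPAO-SR study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters gave the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(
        counts_a[c] * counts_b[c] for c in set(rater_a) | set(rater_b)
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (illustrative only): was water served at the meal?
self_report = ["yes", "yes", "no", "no"]
observer = ["yes", "no", "no", "no"]
print(percent_agreement(self_report, observer))
print(cohen_kappa(self_report, observer))
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance, which is one reason studies of this kind report both.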
Development and Validation of the Biobanking Attitudes and Knowledge Survey (BANKS)

PubMed Central

Wells, Kristen J.; Arevalo, Mariana; Meade, Cathy D.; Gwede, Clement K.; Quinn, Gwendolyn P.; Luque, John S.; Miguel, Gloria San; Watson, Dale; Phillips, Rebecca; Reyes, Carmen; Romo, Margarita; West, Jim; Jacobsen, Paul B.

2014-01-01

Background: No validated multi-scale instruments exist that measure community members’ views on biobanking and biospecimen donation. This study describes the development and psychometric properties of the English-language BANKS (Biobanking Attitudes aNd Knowledge Survey). Methods: The BANKS was created by item generation through review of scientific literature, focus groups with community members, and input from a community advisory board. Items were refined through cognitive interviews. Content validity was assessed through an expert panel review. Psychometric properties of the BANKS were assessed in a sample of 85 community members. Results: The final BANKS includes 3 scales: Attitudes, Knowledge, and Self-Efficacy; as well as 3 single items, which evaluated receptivity and intention to donate a biospecimen for research. Cronbach's alpha coefficients for two scales that use Likert response format indicated high internal consistency (Attitudes: α=.88; Self-Efficacy: α=.95). Content validity indices were moderate, ranging from 0.69 to 0.89. Intention to donate blood and intention to donate urine were positively correlated with attitudes, knowledge, self-efficacy, and receptivity to learning more about biobanking (p's range from .029 to <.001). Conclusions: The final BANKS shows evidence of satisfactory reliability and validity, is easy to administer, and is a promising tool to inform biospecimen research. Additional studies should be conducted with larger samples considering biospecimen donation to further assess the instrument's reliability and validity. Impact: A valid and reliable instrument measuring community members’ views about biobanking may help researchers evaluate relevant communication interventions to enhance understanding, intention, and actual biospecimen donation. A Spanish-language BANKS is under development. PMID:24609846
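The internal-consistency figures in the record above are Cronbach's alpha coefficients. As a rough sketch of how such a coefficient is computed, the example below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the response matrix are hypothetical and do not come from the BANKS data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert responses: 5 respondents x 3 items.
likert = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
print(round(cronbach_alpha(likert), 3))
```

Alpha rises when items covary strongly relative to their individual variances, so a high value indicates that the items move together, not that the scale measures the intended construct.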
Improving efficiency of a small forensic DNA laboratory: validation of robotic assays and evaluation of microcapillary array device.

PubMed

Crouse, Cecelia A; Yeung, Stephanie; Greenspoon, Susan; McGuckian, Amy; Sikorsky, Julie; Ban, Jeff; Mathies, Richard

2005-08-01

This work presents validation studies performed for the implementation of existing and new technologies to increase the efficiency in the forensic DNA Section of the Palm Beach County Sheriff's Office (PBSO) Crime Laboratory. Using federally funded grants, internal support, and an external Process Mapping Team, the PBSO collaborated with forensic vendors, universities, and other forensic laboratories to enhance DNA testing procedures, including validation of the DNA IQ magnetic bead extraction system, robotic DNA extraction using the BioMek2000, and the ABI7000 Sequence Detection System; it is currently evaluating a micro Capillary Array Electrophoresis device. The PBSO successfully validated and implemented both manual and automated Promega DNA IQ magnetic bead extraction systems, which have increased DNA profile results from samples with low DNA template concentrations. The Beckman BioMek2000 DNA robotic workstation has been validated for blood, tissue, bone, hair, epithelial cells (touch evidence), and mixed stains such as semen. There has been a dramatic increase in the number of samples tested per case since implementation of the robotic extraction protocols. The validation of the ABI7000 real-time quantitative polymerase chain reaction (qPCR) technology and the single multiplex short tandem repeat (STR) PowerPlex16 BIO amplification system has provided both a time and a financial benefit. In addition, the qPCR system allows more accurate DNA concentration data and the PowerPlex 16 BIO multiplex generates DNA profile data in half the time when compared to PowerPlex1.1 and PowerPlex2.1 STR systems. The PBSO's future efficiency requirements are being addressed through collaboration with the University of California at Berkeley and the Virginia Division of Forensic Science to validate microcapillary array electrophoresis instrumentation. Initial data demonstrated the electrophoresis of 96 samples in less than twenty minutes. The PBSO demonstrated, through the validation of
Process Modeling and Validation for Metal Big Area Additive Manufacturing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simunovic, Srdjan; Nycz, Andrzej; Noakes, Mark W.

Metal Big Area Additive Manufacturing (mBAAM) is a new additive manufacturing (AM) technology based on metal arc welding. A continuously fed metal wire is melted by an electric arc that forms between the wire and the substrate, and deposited in the form of a bead of molten metal along the predetermined path. Objects are manufactured one layer at a time starting from the base plate. The final properties of the manufactured object are dependent on its geometry and the metal deposition path, in addition to depending on the basic welding process parameters. Computational modeling can be used to accelerate the development of the mBAAM technology as well as a design and optimization tool for the actual manufacturing process. We have developed a finite element method simulation framework for mBAAM using the new features of software ABAQUS. The computational simulation of material deposition with heat transfer is performed first, followed by the structural analysis based on the temperature history for predicting the final deformation and stress state. In this formulation, we assume that two physics phenomena are coupled in only one direction, i.e. the temperatures are driving the deformation and internal stresses, but their feedback on the temperatures is negligible. The experiment instrumentation (measurement types, sensor types, sensor locations, sensor placements, measurement intervals) and the measurements are presented. The temperatures and distortions from the simulations show good correlation with experimental measurements. Ongoing modeling work is also briefly discussed.
Eye-Tracking as a Tool in Process-Oriented Reading Test Validation

ERIC Educational Resources Information Center

Solheim, Oddny Judith; Uppstad, Per Henning

2011-01-01

The present paper addresses the continuous need for methodological reflection on how to validate inferences made on the basis of test scores. Validation is a process that requires many lines of evidence. In this article we discuss the potential of eye tracking methodology in process-oriented reading test validation. Methodological considerations…
External validation of scoring instruments for evaluating pediatric resuscitation.

PubMed

Levy, Arielle; Donoghue, Aaron; Bailey, Benoit; Thompson, Nathan; Jamoulle, Olivier; Gagnon, Robert; Gravel, Jocelyn

2014-12-01

shockable arrest (ICC, 0.98; 95% CI, 0.96-0.99), 4.1% (95% CI, -4.5 to 12.8) for the dysrhythmias (ICC, 0.92; 95% CI, 0.87-0.96), 18.4% (95% CI, 9.7-27.1) for the respiratory scenario (ICC, 0.97; 95% CI, 0.95-0.98), and 5.3% (95% CI, -1.4 to 2.0) for the shock scenarios (ICC, 0.94; 95% CI, 0.90-0.97). There were no differences between PGY1 and PGY3 scores before or after the PALS course. Reliability of the instrument was acceptable as demonstrated by a mean ICC of 0.95 (95% CI, 0.94-0.96). The G-study coefficient was 0.94. Most variance could be attributed to the subject (57%). Interactions between subject and scenario and subject and occasion were 9.9% and 1.4%, respectively, and variance attributable to rater was minimal (0%). Pediatric residents improved scores on CPT after completion of a PALS course. Clinical Performance Tool scores are sensitive to the increase in skills and knowledge resulting from such a course but not to learners' levels. Validity evidence from scores for the CPT confirms implementation in new contexts and partially supports internal structure. More evidence is required to further support internal structure and especially to support relations with other variables and consequence evidence. Additional modifications should be made to the CPT before considering its use for high-stakes certification such as PALS.
Construct Validation of the Fairy Tale Test--Standardization Data.

ERIC Educational Resources Information Center

Coulacoglou, Carina

2002-01-01

Studied the construct validity of the Fairy Tale Test (C. Coulacoglu, 1993), a personality projective test for children, in a sample of 800 Greek children aged 8, 10, and 12. Factor analysis led to identification of eight primary factors, and correlations with other measures provide construct validity evidence. (SLD)

Adapting Social Neuroscience Measures for Schizophrenia Clinical Trials, Part 3: Fathoming External Validity

PubMed Central

Olbert, Charles M.

2013-01-01

It is unknown whether measures adapted from social neuroscience linked to specific neural systems will demonstrate relationships to external variables. Four paradigms adapted from social neuroscience were administered to 173 clinically stable outpatients with schizophrenia to determine their relationships to functionally meaningful variables and to investigate their incremental validity beyond standard measures of social and nonsocial cognition. The 4 paradigms included 2 that assess perception of nonverbal social and action cues (basic biological motion and emotion in biological motion) and 2 that involve higher level inferences about self and others’ mental states (self- referential memory and empathic accuracy). Overall, social neuroscience paradigms showed significant relationships to functional capacity but weak relationships to community functioning; the paradigms also showed weak correlations to clinical symptoms. Evidence for incremental validity beyond standard measures of social and nonsocial cognition was mixed with additional predictive power shown for functional capacity but not community functioning. Of the newly adapted paradigms, the empathic accuracy task had the broadest external validity. These results underscore the difficulty of translating developments from neuroscience into clinically useful tasks with functional significance. PMID:24072806
Adapting social neuroscience measures for schizophrenia clinical trials, part 3: fathoming external validity.

PubMed

Olbert, Charles M; Penn, David L; Kern, Robert S; Lee, Junghee; Horan, William P; Reise, Steven P; Ochsner, Kevin N; Marder, Stephen R; Green, Michael F

2013-11-01

It is unknown whether measures adapted from social neuroscience linked to specific neural systems will demonstrate relationships to external variables. Four paradigms adapted from social neuroscience were administered to 173 clinically stable outpatients with schizophrenia to determine their relationships to functionally meaningful variables and to investigate their incremental validity beyond standard measures of social and nonsocial cognition. The 4 paradigms included 2 that assess perception of nonverbal social and action cues (basic biological motion and emotion in biological motion) and 2 that involve higher level inferences about self and others' mental states (self-referential memory and empathic accuracy). Overall, social neuroscience paradigms showed significant relationships to functional capacity but weak relationships to community functioning; the paradigms also showed weak correlations to clinical symptoms. Evidence for incremental validity beyond standard measures of social and nonsocial cognition was mixed with additional predictive power shown for functional capacity but not community functioning. Of the newly adapted paradigms, the empathic accuracy task had the broadest external validity. These results underscore the difficulty of translating developments from neuroscience into clinically useful tasks with functional significance.
FUNCTIONAL PERFORMANCE TESTING OF THE HIP IN ATHLETES: A SYSTEMATIC REVIEW FOR RELIABILITY AND VALIDITY

PubMed Central

Martin, RobRoy L.

2012-01-01

Purpose/Background: The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. Methods: A search of the PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. Results: The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Conclusions: Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established in patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. Level of Evidence: 2b (Systematic Review of Literature) PMID:22893860
Likelihood ratio data to report the validation of a forensic fingerprint evaluation method.

PubMed

Ramos, Daniel; Haraksim, Rudolf; Meuwly, Didier

2017-02-01

The data to which the authors refer throughout this article are likelihood ratios (LRs) computed from the comparison of 5-12 minutiae fingermarks with fingerprints. These LR data are used for the validation of a likelihood ratio (LR) method in forensic evidence evaluation. These data present a necessary asset for conducting validation experiments when validating LR methods used in forensic evidence evaluation and setting up validation reports. These data can also be used as a baseline for comparing the fingermark evidence in the same minutiae configuration as presented in (D. Meuwly, D. Ramos, R. Haraksim) [1], although the reader should keep in mind that different feature extraction algorithms and different AFIS systems may produce different LR values. Moreover, these data may serve as a reproducibility exercise, in order to train the generation of validation reports of forensic methods, according to [1]. Alongside the data, a justification and motivation for the use of methods is given. These methods calculate LRs from the fingerprint/mark data and are subject to a validation procedure. The choice of using real forensic fingerprints in the validation and simulated data in the development is described and justified. Validation criteria are set for the purpose of validation of the LR methods, which are used to calculate the LR values from the data and the validation report. For privacy and data protection reasons, the original fingerprint/mark images cannot be shared. However, these images do not constitute the core data for the validation, unlike the LRs, which are shared.
Validation of the Adolescent Concerns Measure (ACM): evidence from exploratory and confirmatory factor analysis.

PubMed

Ang, Rebecca P; Chong, Wan Har; Huan, Vivien S; Yeo, Lay See

2007-01-01

This article reports the development and initial validation of scores obtained from the Adolescent Concerns Measure (ACM), a scale which assesses concerns of Asian adolescent students. In Study 1, findings from exploratory factor analysis using 619 adolescents suggested a 24-item scale with four correlated factors--Family Concerns (9 items), Peer Concerns (5 items), Personal Concerns (6 items), and School Concerns (4 items). Initial estimates of convergent validity for ACM scores were also reported. The four-factor structure of ACM scores derived from Study 1 was confirmed via confirmatory factor analysis in Study 2 using a two-fold cross-validation procedure with a separate sample of 811 adolescents. Support was found for both the multidimensional and hierarchical models of adolescent concerns using the ACM. Internal consistency and test-retest reliability estimates were adequate for research purposes. ACM scores show promise as a reliable and potentially valid measure of Asian adolescents' concerns.
LTDNA Evidence on Trial

PubMed Central

Roberts, Paul

2016-01-01

Adopting the interpretative/hermeneutical method typical of much legal scholarship, this article considers two sets of issues pertaining to LTDNA profiles as evidence in criminal proceedings. The section titled Expert Evidence as Forensic Epistemic Warrant addresses some rather large questions about the epistemic status and probative value of expert testimony in general. It sketches a theoretical model of expert evidence, highlighting five essential criteria: (1) expert competence; (2) disciplinary domain; (3) methodological validity; (4) materiality; and (5) legal admissibility. This generic model of expert authority, highlighting law's fundamentally normative character, applies to all modern forms of criminal adjudication, across Europe and farther afield. The section titled LTDNA Evidence in UK Criminal Trials then examines English and Northern Irish courts' attempts to get to grips with LTDNA evidence in recent cases. Better appreciating the ways in which UK courts have addressed the challenges of LTDNA evidence may offer some insights into parallel developments in other legal systems. Appellate court rulings follow a predictable judicial logic, which might usefully be studied and reflected upon by any forensic scientist or statistician seeking to operate effectively in criminal proceedings. Whilst each legal jurisdiction has its own unique blend of jurisprudence, institutions, cultures and historical traditions, there is considerable scope for comparative analysis and cross-jurisdictional borrowing and instruction. In the spirit of promoting more nuanced and sophisticated international interdisciplinary dialogue, this article examines UK judicial approaches to LTDNA evidence and begins to elucidate their underlying institutional logic. Legal argument and broader policy debates are not confined to considerations of scientific validity, contamination risks and evidential integrity, or associated judgments of legal admissibility or exclusion. They also crucially
LTDNA Evidence on Trial.

PubMed

Roberts, Paul

2016-01-01

Adopting the interpretative/hermeneutical method typical of much legal scholarship, this article considers two sets of issues pertaining to LTDNA profiles as evidence in criminal proceedings. The section titled Expert Evidence as Forensic Epistemic Warrant addresses some rather large questions about the epistemic status and probative value of expert testimony in general. It sketches a theoretical model of expert evidence, highlighting five essential criteria: (1) expert competence; (2) disciplinary domain; (3) methodological validity; (4) materiality; and (5) legal admissibility. This generic model of expert authority, highlighting law's fundamentally normative character, applies to all modern forms of criminal adjudication, across Europe and farther afield. The section titled LTDNA Evidence in UK Criminal Trials then examines English and Northern Irish courts' attempts to get to grips with LTDNA evidence in recent cases. Better appreciating the ways in which UK courts have addressed the challenges of LTDNA evidence may offer some insights into parallel developments in other legal systems. Appellate court rulings follow a predictable judicial logic, which might usefully be studied and reflected upon by any forensic scientist or statistician seeking to operate effectively in criminal proceedings. Whilst each legal jurisdiction has its own unique blend of jurisprudence, institutions, cultures and historical traditions, there is considerable scope for comparative analysis and cross-jurisdictional borrowing and instruction. In the spirit of promoting more nuanced and sophisticated international interdisciplinary dialogue, this article examines UK judicial approaches to LTDNA evidence and begins to elucidate their underlying institutional logic. Legal argument and broader policy debates are not confined to considerations of scientific validity, contamination risks and evidential integrity, or associated judgments of legal admissibility or exclusion. They also crucially
Development and validation of a multi-analyte method for the regulatory control of carotenoids used as feed additives in fish and poultry feed.

PubMed

Vincent, Ursula; Serano, Federica; von Holst, Christoph

2017-08-01

Carotenoids are used in animal nutrition mainly as sensory additives that favourably affect the colour of fish, birds and food of animal origin. Various analytical methods exist for their quantification in compound feed, reflecting the different physico-chemical characteristics of the carotenoid and the corresponding feed additives. They may be natural products or specific formulations containing the target carotenoids produced by chemical synthesis. In this study a multi-analyte method was developed that can be applied to the determination of all 10 carotenoids currently authorised within the European Union for compound feedingstuffs. The method functions regardless of whether the carotenoids have been added to the compound feed via natural products or specific formulations. It is comprised of three steps: (1) digestion of the feed sample with an enzyme; (2) pressurised liquid extraction; and (3) quantification of the analytes by reversed-phase HPLC coupled to a photodiode array detector in the visible range. The method was single-laboratory validated for poultry and fish feed covering a mass fraction range of the target analyte from 2.5 to 300 mg kg⁻¹. The following method performance characteristics were obtained: the recovery rate varied from 82% to 129% and precision expressed as the relative standard deviation of intermediate precision varied from 1.6% to 15%. Based on the acceptable performance obtained in the validation study, the multi-analyte method is considered fit for the intended purpose.
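The two performance characteristics quoted in the record above, recovery rate and relative standard deviation (RSD), can be illustrated with a small sketch. This is a simplification: the study reports the RSD of intermediate precision, which is normally estimated from a multi-day design, whereas the code below computes a plain RSD over replicate measurements. The function names and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def recovery_percent(measured, spiked_amount):
    """Mean measured mass fraction as a percentage of the known spiked amount."""
    return 100 * mean(measured) / spiked_amount


def rsd_percent(measured):
    """Relative standard deviation: sample standard deviation over the mean."""
    return 100 * stdev(measured) / mean(measured)


# Hypothetical replicates (mg/kg) for a feed sample spiked at 10 mg/kg.
replicates = [9.0, 10.0, 11.0]
print(recovery_percent(replicates, spiked_amount=10.0))
print(rsd_percent(replicates))
```

A recovery near 100% indicates little systematic loss or gain during digestion and extraction, while a low RSD indicates that repeated measurements agree closely with each other.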
Exploring the validity of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) with established emotions measures.

PubMed

Roberts, Richard D; Schulze, Ralf; O'Brien, Kristin; MacCann, Carolyn; Reid, John; Maul, Andy

2006-11-01

Emotions measures represent an important means of obtaining construct validity evidence for emotional intelligence (EI) tests because they have the same theoretical underpinnings. Additionally, the extent to which both emotions and EI measures relate to intelligence is poorly understood. The current study was designed to address these issues. Participants (N = 138) completed the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), two emotions measures, as well as four intelligence tests. Results provide mixed support for the model hypothesized to underlie the MSCEIT, with emotions research and EI measures failing to load on the same factor. The emotions measures loaded on the same factor as intelligence measures. The validity of certain EI components (in particular, Emotion Perception), as currently assessed, appears equivocal. Copyright 2006 APA, all rights reserved.
Metal Big Area Additive Manufacturing: Process Modeling and Validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simunovic, Srdjan; Nycz, Andrzej; Noakes, Mark W

Metal Big Area Additive Manufacturing (mBAAM) is a new additive manufacturing (AM) technology for printing large-scale 3D objects. mBAAM is based on the gas metal arc welding process and uses a continuous feed of welding wire to manufacture an object. An electric arc forms between the wire and the substrate, which melts the wire and deposits a bead of molten metal along the predetermined path. In general, the welding process parameters and local conditions determine the shape of the deposited bead. The sequence of the bead deposition and the corresponding thermal history of the manufactured object determine the long-range effects, such as thermal-induced distortions and residual stresses. Therefore, the resulting performance or final properties of the manufactured object are dependent on its geometry and the deposition path, in addition to depending on the basic welding process parameters. Physical testing is critical for gaining the necessary knowledge for quality prints, but traversing the process parameter space in order to develop an optimized build strategy for each new design is impractical by pure experimental means. Computational modeling and optimization may accelerate development of a build process strategy and save time and resources. Because computational modeling provides these opportunities, we have developed a physics-based Finite Element Method (FEM) simulation framework and numerical models to support the mBAAM process's development and design. In this paper, we performed a sequentially coupled heat transfer and stress analysis for predicting the final deformation of a small rectangular structure printed using the mild steel welding wire. Using the new simulation technologies, material was progressively added into the FEM simulation as the arc weld traversed the build path. In the sequentially coupled heat transfer and stress analysis, the heat transfer was performed to calculate the temperature evolution, which was used in a stress
Evidence for validity of five secondary data sources for enumerating retail food outlets in seven American Indian communities in North Carolina.

PubMed

Fleischhacker, Sheila E; Rodriguez, Daniel A; Evenson, Kelly R; Henley, Amanda; Gizlice, Ziya; Soto, Dolly; Ramachandran, Gowri

2012-11-22

Most studies on the local food environment have used secondary sources to describe the food environment, such as government food registries or commercial listings (e.g., Reference USA). Most of the studies exploring evidence for validity of secondary retail food data have used on-site verification and have not conducted analysis by data source (e.g., sensitivity of Reference USA) or by food outlet type (e.g., sensitivity of Reference USA for convenience stores). Few studies have explored the food environment in American Indian communities. To advance the science on measuring the food environment, we conducted direct, on-site observations of a wide range of food outlets in multiple American Indian communities, without a list guiding the field observations, and then compared our findings to several types of secondary data. Food outlets located within seven State Designated Tribal Statistical Areas in North Carolina (NC) were gathered from online Yellow Pages, Reference USA, Dun & Bradstreet, local health departments, and the NC Department of Agriculture and Consumer Services. All TIGER/Line 2009 roads (>1,500 miles) were driven in six of the more rural tribal areas and, for the largest tribe, all roads in two of its cities were driven. Sensitivity, positive predictive value, concordance, and kappa statistics were calculated to compare secondary data sources to primary data. 699 food outlets were identified during primary data collection. Match rate for primary data and secondary data differed by type of food outlet observed, with the highest match rates found for grocery stores (97%), general merchandise stores (96%), and restaurants (91%). Reference USA exhibited almost perfect sensitivity (0.89). Local health department data had substantial sensitivity (0.66) and was almost perfect when focusing only on restaurants (0.91). Positive predictive value was substantial for Reference USA (0.67) and moderate for local health department data (0.49). Evidence for validity
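The record above compares secondary listings against on-site observation using sensitivity and positive predictive value. A minimal sketch of those two quantities follows, treating the field observation as the reference standard. The function name and the outlet identifiers are hypothetical and are not drawn from the study's data.

```python
def sensitivity_and_ppv(field_observed, secondary_listed):
    """Compare a secondary listing with field-verified outlets.

    Sensitivity: share of field-verified outlets that the listing contains.
    Positive predictive value: share of listed outlets verified in the field.
    """
    observed = set(field_observed)
    listed = set(secondary_listed)
    if not observed or not listed:
        raise ValueError("both outlet collections must be non-empty")
    matched = len(observed & listed)  # outlets present in both sources
    return matched / len(observed), matched / len(listed)


# Hypothetical outlet identifiers (illustrative only).
field = {"grocery_1", "grocery_2", "cafe_1", "gas_1"}
listing = {"grocery_1", "grocery_2", "closed_diner"}
sensitivity, ppv = sensitivity_and_ppv(field, listing)
print(sensitivity, ppv)
```

A listing can score high on one measure and low on the other: missing real outlets lowers sensitivity, while carrying closed or misplaced outlets lowers positive predictive value.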
Evidence for validity of five secondary data sources for enumerating retail food outlets in seven American Indian Communities in North Carolina

PubMed Central

2012-01-01

Background: Most studies on the local food environment have used secondary sources to describe the food environment, such as government food registries or commercial listings (e.g., Reference USA). Most of the studies exploring evidence for validity of secondary retail food data have used on-site verification and have not conducted analysis by data source (e.g., sensitivity of Reference USA) or by food outlet type (e.g., sensitivity of Reference USA for convenience stores). Few studies have explored the food environment in American Indian communities. To advance the science on measuring the food environment, we conducted direct, on-site observations of a wide range of food outlets in multiple American Indian communities, without a list guiding the field observations, and then compared our findings to several types of secondary data. Methods: Food outlets located within seven State Designated Tribal Statistical Areas in North Carolina (NC) were gathered from online Yellow Pages, Reference USA, Dun & Bradstreet, local health departments, and the NC Department of Agriculture and Consumer Services. All TIGER/Line 2009 roads (>1,500 miles) were driven in six of the more rural tribal areas and, for the largest tribe, all roads in two of its cities were driven. Sensitivity, positive predictive value, concordance, and kappa statistics were calculated to compare secondary data sources to primary data. Results: 699 food outlets were identified during primary data collection. Match rate for primary data and secondary data differed by type of food outlet observed, with the highest match rates found for grocery stores (97%), general merchandise stores (96%), and restaurants (91%). Reference USA exhibited almost perfect sensitivity (0.89). Local health department data had substantial sensitivity (0.66) and was almost perfect when focusing only on restaurants (0.91). Positive predictive value was substantial for Reference USA (0.67) and moderate for local health department data (0
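The agreement statistics named in the abstract above can be illustrated with a minimal sketch. It assumes that field-observed (primary) outlets are treated as ground truth and that concordance is defined as matched / (matched + missed + extra); that definition, the function name `agreement_stats`, and the outlet identifiers are assumptions for illustration and are not taken from the study. Kappa is left out because it also needs a count of true negatives, which the abstract does not describe.

```python
def agreement_stats(primary, secondary):
    """Compare a secondary listing against field-observed (primary) outlets.

    Both arguments are collections of outlet identifiers. The primary
    data are treated as ground truth (an assumption of this sketch).
    """
    primary, secondary = set(primary), set(secondary)
    matched = primary & secondary   # observed in the field and listed
    missed = primary - secondary    # observed in the field, not listed
    extra = secondary - primary     # listed, not observed in the field
    return {
        # share of observed outlets that the listing captured
        "sensitivity": len(matched) / len(primary),
        # share of listed outlets that were actually observed
        "ppv": len(matched) / len(secondary),
        # assumed definition: matches over all outlets seen in either source
        "concordance": len(matched) / (len(matched) + len(missed) + len(extra)),
    }


# Hypothetical outlet identifiers, for illustration only.
observed = {"A", "B", "C", "D"}
listed = {"B", "C", "D", "E", "F"}
stats = agreement_stats(observed, listed)
print(stats)
```

The sketch does not guard against empty inputs; a real analysis would also need a matching rule for outlet names and addresses, which is not attempted here.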
Functional performance testing of the hip in athletes: a systematic review for reliability and validity.

PubMed

Kivlan, Benjamin R; Martin, Robroy L

2012-08-01

The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. A search of the PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement (FAI). Hop/jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established in patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. Level of evidence: 2b (Systematic Review of Literature).
Extending the validity of the Feeding Practices and Structure Questionnaire.

PubMed

Jansen, Elena; Mallan, Kimberley M; Daniels, Lynne A

2015-06-30

Feeding practices are commonly examined as potentially modifiable determinants of children's eating behaviours and weight status. Although a variety of questionnaires exist to assess different feeding aspects, many lack thorough reliability and validity testing. The Feeding Practices and Structure Questionnaire (FPSQ) is a tool designed to measure early feeding practices related to non-responsive feeding and structure of the meal environment. Face validity, factorial validity, internal reliability and cross-sectional correlations with children's eating behaviours have been established in mothers with 2-year-old children. The aim of the present study was to further extend the validity of the FPSQ by examining factorial, construct and predictive validity, and stability. Participants were from the NOURISH randomised controlled trial, which evaluated an intervention with first-time mothers designed to promote protective feeding practices. Maternal feeding practices (FP) and child eating behaviours were assessed when children were aged 2 years and 3.7 years (n = 388). Confirmatory factor analysis, group differences, predictive relationships, and stability were tested. The original 9-factor structure was confirmed when children were aged 3.7 ± 0.3 years. Cronbach's alpha was above the recommended 0.70 cut-off for all factors except Structured Meal Timing, Over Restriction and Distrust in Appetite, which were 0.58, 0.67 and 0.66 respectively. Allocated group differences reflected behaviour consistent with intervention content and all feeding practices were stable across both time points (range of r = 0.45-0.70). There was some evidence for the predictive validity of factors, with 2 FP showing expected relationships, 2 FP showing expected and unexpected relationships and 5 FP showing no relationship. Reliability and validity were demonstrated for most subscales of the FPSQ. Future validation is warranted with culturally diverse samples and with fathers and
Validation of a global scale to assess the quality of interprofessional teamwork in mental health settings.

PubMed

Tomizawa, Ryoko; Yamano, Mayumi; Osako, Mitue; Hirabayashi, Naotugu; Oshima, Nobuo; Sigeta, Masahiro; Reeves, Scott

2017-12-01

Few scales currently exist to assess the quality of interprofessional teamwork through team members' perceptions of working together in mental health settings. The purpose of this study was to revise and validate an interprofessional scale to assess the quality of teamwork in inpatient psychiatric units and to use it multi-nationally. A literature review was undertaken to identify evaluative teamwork tools and develop an additional 12 items to ensure a broad global focus. Focus group discussions considered adaptation to different care systems using subjective judgements from 11 participants in a pre-test of items. Data quality, construct validity, reproducibility, and internal consistency were investigated in the survey using an international comparative design. Exploratory factor analysis yielded five factors with 21 items: 'patient/community centred care', 'collaborative communication', 'interprofessional conflict', 'role clarification', and 'environment'. High overall internal consistency, reproducibility, adequate face validity, and reasonable construct validity were shown in the USA and Japan. The revised Collaborative Practice Assessment Tool (CPAT) is a valid measure to assess the quality of interprofessional teamwork in psychiatry and identifies the best strategies to improve team performance. Furthermore, the revised scale will generate more rigorous evidence for collaborative practice in psychiatry internationally.
Triangulating Evidence to Investigate the Validity of Measures: Evidence from Discussion during Instruction, Cognitive Interviews, and Written Assessments

ERIC Educational Resources Information Center

Burmester, Kristen O'Rourke

2011-01-01

Classrooms are a primary site of evidence about learning. Yet classroom proceedings often occur behind closed doors and hence evidence of student learning is observable only to the classroom teacher. The informal and undocumented nature of this information means that it is rarely included in statistical models or quantifiable analyses. This…
Development and Initial Validation of the PROMIS® Sexual Function and Satisfaction Measures Version 2.0.

PubMed

Weinfurt, Kevin P; Lin, Li; Bruner, Deborah Watkins; Cyranowski, Jill M; Dombeck, Carrie B; Hahn, Elizabeth A; Jeffery, Diana D; Luecht, Richard M; Magasi, Susan; Porter, Laura S; Reese, Jennifer Barsky; Reeve, Bryce B; Shelby, Rebecca A; Smith, Ashley Wilder; Willse, John T; Flynn, Kathryn E

2015-09-01

The Patient-Reported Outcomes Measurement Information System (PROMIS®) Sexual Function and Satisfaction measure (SexFS) version 1.0 was developed with cancer populations. There is a need to expand the SexFS and provide evidence of its validity in diverse populations. The aim of this study was to describe the development of the SexFS v2.0 and present preliminary evidence for its validity. Development built on version 1.0, plus additional review of extant items, discussions with 15 clinical experts, 11 patient focus groups (including individuals who have diabetes, heart disease, anxiety, or depression and/or who are lesbian, gay, bisexual, or aged 65 or older), 48 cognitive interviews, and psychometric evaluation in a random sample of U.S. adults plus an oversample for specific sexual problems (2281 men, 1686 women). We examined differential item functioning (DIF) by gender and sexual activity. We examined convergent and known-groups validity. The final set of domains includes 11 scored scales (interest in sexual activity, lubrication, vaginal discomfort, clitoral discomfort, labial discomfort, erectile function, orgasm ability, orgasm pleasure, oral dryness, oral discomfort, satisfaction), and six nonscored item pools (screeners, sexual activities, anal discomfort, therapeutic aids, factors interfering with sexual satisfaction, bother). Domains from version 1.0 were reevaluated and improved. Domains considered applicable across gender and sexual activity status, namely interest, orgasm, and satisfaction, were found to have significant DIF. We identified subsets of items in each domain that provided consistent measurement across these important respondent groups. Convergent and known-groups validity was supported. The SexFS version 2.0 has several improvements and enhancements over version 1.0 and other extant measures, including expanded evidence for validity, scores centered around norms for sexually active U.S. adults, new domains, and a final set of items applicable for
Validity and reliability of the Diagnostic Adaptive Behaviour Scale.

PubMed

Tassé, M J; Schalock, R L; Balboni, G; Spreat, S; Navas, P

2016-01-01

The Diagnostic Adaptive Behaviour Scale (DABS) is a new standardised adaptive behaviour measure that provides information for evaluating limitations in adaptive behaviour for the purpose of determining a diagnosis of intellectual disability. This article presents validity evidence and reliability data for the DABS. Validity evidence was based on comparing DABS scores with scores obtained on the Vineland Adaptive Behaviour Scale, second edition. The stability of the test scores was measured using a test and retest, and inter-rater reliability was assessed by computing the inter-respondent concordance. The DABS convergent validity coefficients ranged from 0.70 to 0.84, while the test-retest reliability coefficients ranged from 0.78 to 0.95, and the inter-rater concordance as measured by intraclass correlation coefficients ranged from 0.61 to 0.87. All obtained validity and reliability indicators were strong and comparable with the validity and reliability coefficients of the most commonly used adaptive behaviour instruments. These results and the advantages of the DABS for clinician and researcher use are discussed. © 2015 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
The Trojan Lifetime Champions Health Survey: Development, Validity, and Reliability

PubMed Central

Sorenson, Shawn C.; Romano, Russell; Scholefield, Robin M.; Schroeder, E. Todd; Azen, Stanley P.; Salem, George J.

2015-01-01

Context: Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. Objective: To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Design: Descriptive laboratory study. Setting: A large National Collegiate Athletic Association Division I university. Patients or Other Participants: A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Intervention(s): Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Main Outcome Measure(s): Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Results: Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. We found strong evidence of content validity, convergent
Scale of attitudes toward alcohol - Spanish version: evidences of validity and reliability.

PubMed

Ramírez, Erika Gisseth León; Vargas, Divane de

2017-08-03

The aim of this study was to validate the Spanish version of the Scale of attitudes toward alcohol, alcoholism and individuals with alcohol use disorders. This methodological study involved 300 Colombian nurses. Adopting the classical theory, confirmatory factor analysis was applied without prior exploratory analysis, based on the strong historical evidence for the factorial structure of the original scale, to determine the construct validity of this Spanish version. Reliability was assessed with Cronbach's alpha and McDonald's omega coefficients. The confirmatory factor analysis indicated a good fit of the scale model in a four-factor distribution comprising 48 items in the Spanish version. The reliability indices were satisfactory, and a cut-off point of 3.2 yielded a sensitivity of 66.7%. The Spanish version of the scale showed robust psychometric qualities, indicating that the instrument has a solid factorial structure and reliability and can precisely measure nurses' attitudes toward the phenomenon in question.

Validation of biomarkers to predict response to immunotherapy in cancer: Volume I - pre-analytical and analytical validation.

PubMed

Masucci, Giuseppe V; Cesano, Alessandra; Hawtin, Rachael; Janetzki, Sylvia; Zhang, Jenny; Kirsch, Ilan; Dobbin, Kevin K; Alvarez, John; Robbins, Paul B; Selvan, Senthamil R; Streicher, Howard Z; Butterfield, Lisa H; Thurin, Magdalena

2016-01-01

Immunotherapies have emerged as one of the most promising approaches to treat patients with cancer. Recently, there have been many clinical successes using checkpoint receptor blockade, including T cell inhibitory receptors such as cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) and programmed cell death-1 (PD-1). Despite demonstrated successes in a variety of malignancies, responses typically occur in only a minority of patients in any given histology. Additionally, treatment is associated with inflammatory toxicity and high cost. Therefore, determining which patients would derive clinical benefit from immunotherapy is a compelling clinical question. Although numerous candidate biomarkers have been described, there are currently three FDA-approved assays based on PD-1 ligand expression (PD-L1) that have been clinically validated to identify patients who are more likely to benefit from a single-agent anti-PD-1/PD-L1 therapy. Because of the complexity of the immune response and tumor biology, it is unlikely that a single biomarker will be sufficient to predict clinical outcomes in response to immune-targeted therapy. Rather, the integration of multiple tumor and immune response parameters, such as protein expression, genomics, and transcriptomics, may be necessary for accurate prediction of clinical benefit. Before a candidate biomarker and/or new technology can be used in a clinical setting, several steps are necessary to demonstrate its clinical validity. Although regulatory guidelines provide general roadmaps for the validation process, their applicability to biomarkers in the cancer immunotherapy field is somewhat limited. Thus, Working Group 1 (WG1) of the Society for Immunotherapy of Cancer (SITC) Immune Biomarkers Task Force convened to address this need. In this two-volume series, we discuss pre-analytical and analytical (Volume I) as well as clinical and regulatory (Volume II) aspects of the validation process as applied to predictive biomarkers
The Psychopathy Q-Sort. Construct Validity Evidence in a Nonclinical Sample

ERIC Educational Resources Information Center

Fowler, Katherine A.; Lilienfeld, Scott O.

2007-01-01

Scant research has examined the validity of instruments that permit observer ratings of psychopathy. Using a nonclinical (undergraduate) sample, the authors examined the associations between both self- and observer ratings on a psychopathy prototype (Psychopathy Q-Sort, PQS) and widely used measures of psychopathy, antisocial behavior, and…
Validity of the Microcomputer Evaluation Screening and Assessment Aptitude Scores.

ERIC Educational Resources Information Center

Janikowski, Timothy P.; And Others

1991-01-01

Examined validity of Microcomputer Evaluation Screening and Assessment (MESA) aptitude scores relative to General Aptitude Test Battery (GATB) using multitrait-multimethod correlational analyses. Findings from 54 rehabilitation clients and 29 displaced workers revealed no evidence to support the construct validity of the MESA. (Author/NB)
Nutrition screening tools: an analysis of the evidence.

PubMed

Skipper, Annalynn; Ferguson, Maree; Thompson, Kyle; Castellanos, Victoria H; Porcari, Judy

2012-05-01

In response to questions about tools for nutrition screening, an evidence analysis project was developed to identify the most valid and reliable nutrition screening tools for use in acute care and hospital-based ambulatory care settings. An oversight group defined nutrition screening and literature search criteria. A trained analyst conducted structured searches of the literature for studies of nutrition screening tools according to predetermined criteria. Eleven nutrition screening tools designed to detect undernutrition in patients in acute care and hospital-based ambulatory care were identified. Trained analysts evaluated articles for quality using criteria specified by the American Dietetic Association's Evidence Analysis Library. Members of the oversight group assigned quality grades to the tools based on the quality of the supporting evidence, including reliability and validity data. One tool, the NRS-2002, received a grade I, and 4 tools-the Simple Two-Part Tool, the Mini-Nutritional Assessment-Short Form (MNA-SF), the Malnutrition Screening Tool (MST), and Malnutrition Universal Screening Tool (MUST)-received a grade II. The MST was the only tool shown to be both valid and reliable for identifying undernutrition in the settings studied. Thus, validated nutrition screening tools that are simple and easy to use are available for application in acute care and hospital-based ambulatory care settings.
Are loss of control while eating and overeating valid constructs? A critical review of the literature

PubMed Central

Goldschmidt, Andrea B.

2017-01-01

Background: Binge eating is a marker of weight gain and obesity, and a hallmark feature of eating disorders. Yet, its component constructs, overeating and loss of control (LOC) while eating, are poorly understood and difficult to measure. Objective: To critically review the human literature concerning the validity of LOC and overeating across the age and weight spectrum. Data sources: English-language articles addressing the face, convergent, discriminant, and predictive validity of LOC and overeating were included. Results: LOC and overeating appear to have adequate face validity. Emerging evidence supports the convergent and predictive validity of the LOC construct, given its unique cross-sectional and prospective associations with numerous anthropometric, psychosocial, and eating behavior-related factors. Overeating may be best conceptualized as a marker of excess weight status. Limitations: Binge eating constructs, particularly in the context of subjectively large episodes, are challenging to measure reliably. Few studies addressed overeating in the absence of LOC, thereby limiting conclusions about the validity of the overeating construct independent of LOC. Additional studies addressing the discriminant validity of both constructs are warranted. Discussion: Suggestions for future weight-related research and for appropriately defining binge eating in the eating disorders diagnostic scheme are presented. PMID: 28165655
Adjusted linguistic validation and psychometric properties of the Colombian version of KIDSCREEN-52.

PubMed

Jaimes-Valencia, Mary Luz; Perpiñá-Galvañ, Juana; Cabañero-Martínez, Maria José; Cabrero-García, Julio; Richart-Martínez, Miguel

2018-01-01

In health and clinical studies, health-related quality of life is often assessed using the well-established KIDSCREEN-52 questionnaires as well as the Vécu et Santé Perçue de l'Adolescent (VSP-A). The purpose of this study was twofold: to perform an adjusted linguistic validation of the Colombian version of the KIDSCREEN-52 and to assess its psychometric properties in children and adolescents. A total of 146 children and adolescents completed the KIDSCREEN-52; adolescents (n = 48) additionally completed the VSP-A. Psychometric analyses focused on the internal consistency as well as the convergent and discriminant validity of the KIDSCREEN-52 Colombian version. Syntactic and semantic modifications were made to 19 items in the adapted version of the KIDSCREEN-52. Cronbach's α ranged from .74 to .89 for eight dimensions, while α < .70 was obtained for self-perception and social acceptance. We found evidence of good convergent validity with the VSP-A dimensions. Regarding known-groups validity, children aged between 8 and 10 years, male, with a high socioeconomic level and no chronic health condition obtained higher scores compared to the other categories. The developed Colombian version of the KIDSCREEN-52 showed acceptable reliability and validity. This study provides a cultural adaptation of the Spanish version of the KIDSCREEN-52 for Colombian children and adolescents.
Validation of Helicopter Gear Condition Indicators Using Seeded Fault Tests

NASA Technical Reports Server (NTRS)

Dempsey, Paula; Brandon, E. Bruce

2013-01-01

A "seeded fault test" in support of a rotorcraft condition-based maintenance (CBM) program is an experiment in which a component is tested with a known fault while health monitoring data is collected. These tests are performed at operating conditions comparable to those the component would be exposed to while installed on the aircraft. Performance of seeded fault tests is one method used to provide evidence that a Health Usage Monitoring System (HUMS) can replace current maintenance practices required for aircraft airworthiness. Actual in-service experience of the HUMS detecting a component fault is another validation method. This paper will discuss a hybrid validation approach that combines in-service data with seeded fault tests. For this approach, existing in-service HUMS flight data from a naturally occurring component fault will be used to define a component seeded fault test. An example, using spiral bevel gears as the targeted component, will be presented. Since the U.S. Army has begun to develop standards for using seeded fault tests for HUMS validation, the hybrid approach will be mapped to the steps defined within their Aeronautical Design Standard Handbook for CBM. This paper will step through their defined processes and identify additional steps that may be required when using component test rig fault tests to demonstrate helicopter CI performance. The discussion within this paper will provide the reader with a better appreciation for the challenges faced when defining a seeded fault test for HUMS validation.
Exploring examinee behaviours as validity evidence for multiple-choice question examinations.

PubMed

Surry, Luke T; Torre, Dario; Durning, Steven J

2017-10-01

Clinical-vignette multiple choice question (MCQ) examinations are used widely in medical education. Standardised MCQ examinations are used by licensure and certification bodies to award credentials that are meant to assure stakeholders as to the quality of physicians. Such uses are based on the interpretation of MCQ examination performance as giving meaningful information about the quality of clinical reasoning. There are several assumptions foundational to these interpretations and uses of standardised MCQ examinations. This study explores the implicit assumption that cognitive processes elicited by clinical-vignette MCQ items are like the processes thought to occur with 'real-world' clinical reasoning as theorised by dual-process theory. Fourteen participants (three medical students, five residents and six staff physicians) completed three sets of five timed MCQ items (total 15) from the Medical Knowledge Self-Assessment Program (MKSAP). Upon answering a set of MCQs, each participant completed a retrospective think aloud (TA) protocol. Using constant comparative analysis (CCA) methods sensitised by dual-process theory, we performed a qualitative thematic analysis. Examinee behaviours fell into three categories: clinical reasoning behaviours, test-taking behaviours and reactions to the MCQ. Consistent with dual-process theory, statements about clinical reasoning behaviours were divided into two sub-categories: analytical reasoning and non-analytical reasoning. Each of these categories included several themes. Our study provides some validity evidence that test-takers' descriptions of their cognitive processes during completion of high-quality clinical-vignette MCQs align with processes expected in real-world clinical reasoning. This supports one of the assumptions important for interpretations of MCQ examination scores as meaningful measures of clinical reasoning. Our observations also suggest that MCQs elicit other cognitive processes, including certain test
Validity Evidence of the Spanish Version of the Automatic Thoughts Questionnaire-8 in Colombia.

PubMed

Ruiz, Francisco J; Suárez-Falcón, Juan C; Riaño-Hernández, Diana

2017-02-13

The Automatic Thoughts Questionnaire (ATQ) is a widely used, 30-item, 5-point Likert-type scale that measures the frequency of negative automatic thoughts as experienced by individuals suffering from depression. However, there is some controversy about the factor structure of the ATQ, and its application can be too time-consuming for survey research. Accordingly, an abbreviated, 8-item version of the ATQ has been proposed. The aim of this study was to analyze the validity evidence of the Spanish version of the ATQ-8 in Colombia. The ATQ-8 was administered to a total of 1587 participants, including a sample of undergraduates, one of general population, and a clinical sample. The internal consistency across the different samples was good (α = .89). The one-factor model found in the original scale showed a good fit to the data (RMSEA = .083, 90% CI [.074, .092]; CFI = .96; NNFI = .95). The clinical sample's mean score on the ATQ-8 was significantly higher than the scores of the nonclinical samples. The ATQ-8 was sensitive to the effects of a 1-session acceptance and commitment therapy focused on disrupting negative repetitive thinking. ATQ-8 scores were significantly related to dysfunctional schemas, emotional symptoms, mindfulness, experiential avoidance, satisfaction with life, and dysfunctional attitudes. In conclusion, the Spanish version of the ATQ-8 showed good psychometric properties in Colombia.
CASL Verification and Validation Plan

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mousseau, Vincent Andrew; Dinh, Nam

2016-06-30

This report documents the Consortium for Advanced Simulation of LWRs (CASL) verification and validation plan. The document builds upon input from CASL subject matter experts, most notably the CASL Challenge Problem Product Integrators, CASL Focus Area leaders, and CASL code development and assessment teams. This document will be a living document that will track progress on CASL to do verification and validation for both the CASL codes (including MPACT, CTF, BISON, MAMBA) and for the CASL challenge problems (CIPS, PCI, DNB). The CASL codes and the CASL challenge problems are at differing levels of maturity with respect to validation and verification. The gap analysis will summarize additional work that needs to be done. Additional VVUQ work will be done as resources permit. This report is prepared for the Department of Energy’s (DOE’s) CASL program in support of milestone CASL.P13.02.
Validity threats: overcoming interference with proposed interpretations of assessment data.

PubMed

Downing, Steven M; Haladyna, Thomas M

2004-03-01

Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity. The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided. The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged. There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.
Hierarchy of evidence: a simple system for orthopaedic research?

PubMed

Pemberton, Julia; Kraeva, Juliana; Bhandari, Mohit

2007-01-01

To be able to make a sound recommendation for a treatment based on the best available evidence, it is necessary to follow specific steps in acquiring literature, appraising the study design and quality, and assessing the results. Evidence-based medicine is founded on the concepts of using best evidence, levels of evidence, and grades of recommendation, and aims to provide clinicians with standardized rules to help them appraise the validity of published research. A number of systems have been developed to categorize research studies into consistent levels of evidence. These systems are based primarily on consensus expert opinion, and have not been validated to any extent. The use of different systems does not allow for effective communication between users; there is a lack of accord even between users of the same system. The GRADE working group has devised a new rating system that attempts to address deficiencies seen within other systems.
Designing and Validating a Measure of Teacher Knowledge of Universal Design for Assessment (UDA)

ERIC Educational Resources Information Center

Jamgochian, Elisa Megan

2010-01-01

The primary purpose of this study was to design and validate a measure of teacher knowledge of Universal Design for Assessment (TK-UDA). Guided by a validity framework, a number of inferences, assumptions, and evidences supported this investigation. By addressing a series of research questions, evidence was garnered for the use of the measure to…
Food additives: an ethical evaluation.

PubMed

Mepham, Ben

2011-01-01

Food additives are an integral part of the modern food system, but opinion polls showing most Europeans have worries about them imply an urgent need for ethical analysis of their use. The evaluation draws on the existing literature on food ethics, safety assessment and animal testing. Food additives provide certain advantages in terms of many people's lifestyles. There are disagreements about the appropriate application of the precautionary principle and about the value and ethical validity of animal tests in assessing human safety. Most consumers have a poor understanding of the relative benefits and risks of additives, but concerns over food safety and animal testing remain high. Examining the impacts of food additives on consumer sovereignty, consumer health and on animals used in safety testing should allow a more informed debate about their appropriate uses.
Sicily statement on evidence-based practice

PubMed Central

Dawes, Martin; Summerskill, William; Glasziou, Paul; Cartabellotta, Antonino; Martin, Janet; Hopayian, Kevork; Porzsolt, Franz; Burls, Amanda; Osborne, James

2005-01-01

Background A variety of definitions of evidence-based practice (EBP) exist. However, definitions are in themselves insufficient to explain the underlying processes of EBP and to differentiate between an evidence-based process and evidence-based outcome. There is a need for a clear statement of what Evidence-Based Practice (EBP) means, a description of the skills required to practise in an evidence-based manner and a curriculum that outlines the minimum requirements for training health professionals in EBP. This consensus statement is based on current literature and incorporating the experience of delegates attending the 2003 Conference of Evidence-Based Health Care Teachers and Developers ("Signposting the future of EBHC"). Discussion Evidence-Based Practice has evolved in both scope and definition. Evidence-Based Practice (EBP) requires that decisions about health care are based on the best available, current, valid and relevant evidence. These decisions should be made by those receiving care, informed by the tacit and explicit knowledge of those providing care, within the context of available resources. Health care professionals must be able to gain, assess, apply and integrate new knowledge and have the ability to adapt to changing circumstances throughout their professional life. Curricula to deliver these aptitudes need to be grounded in the five-step model of EBP, and informed by ongoing research. Core assessment tools for each of the steps should continue to be developed, validated, and made freely available. Summary All health care professionals need to understand the principles of EBP, recognise EBP in action, implement evidence-based policies, and have a critical attitude to their own practice and to evidence. Without these skills, professionals and organisations will find it difficult to provide 'best practice'. PMID:15634359
Measuring striving for understanding and learning value of geometry: a validity study

NASA Astrophysics Data System (ADS)

Ubuz, Behiye; Aydınyer, Yurdagül

2017-11-01

The current study aimed to construct a questionnaire that measures students' personality traits related to striving for understanding and learning value of geometry and then examine its psychometric properties. Through the use of multiple methods on two independent samples of 402 and 521 middle school students, two studies were performed to address this issue to provide support for its validity. In Study 1, exploratory factor analysis indicated the two-factor model. In Study 2, confirmatory factor analysis indicated the better fit of two-factor model compared to one or three-factor model. Convergent and discriminant validity evidence provided insight into the distinctiveness of the two factors. Subgroup validity evidence revealed gender differences for striving for understanding geometry trait favouring girls and grade level differences for learning value of geometry trait favouring the sixth- and seventh-grade students. Predictive validity evidence demonstrated that the striving for understanding geometry trait but not learning value of geometry trait was significantly correlated with prior mathematics achievement. In both studies, each factor and the entire questionnaire showed satisfactory reliability. In conclusion, the questionnaire was psychometrically sound.
Additional specimen of Microraptor provides unique evidence of dinosaurs preying on birds

PubMed Central

O'Connor, Jingmai; Zhou, Zhonghe; Xu, Xing

2011-01-01

Preserved indicators of diet are extremely rare in the fossil record; even more so is unequivocal direct evidence for predator–prey relationships. Here, we report on a unique specimen of the small nonavian theropod Microraptor gui from the Early Cretaceous Jehol biota, China, which has the remains of an adult enantiornithine bird preserved in its abdomen, most likely not scavenged, but captured and consumed by the dinosaur. We provide direct evidence for the dietary preferences of Microraptor and a nonavian dinosaur feeding on a bird. Further, because Jehol enantiornithines were distinctly arboreal, in contrast to their cursorial ornithurine counterparts, this fossil suggests that Microraptor hunted in trees thereby supporting inferences that this taxon was also an arborealist, and provides further support for the arboreality of basal dromaeosaurids. PMID:22106278
The Utrecht questionnaire (U-CEP) measuring knowledge on clinical epidemiology proved to be valid.

PubMed

Kortekaas, Marlous F; Bartelink, Marie-Louise E L; de Groot, Esther; Korving, Helen; de Wit, Niek J; Grobbee, Diederick E; Hoes, Arno W

2017-02-01

Knowledge on clinical epidemiology is crucial to practice evidence-based medicine. We describe the development and validation of the Utrecht questionnaire on knowledge on Clinical epidemiology for Evidence-based Practice (U-CEP); an assessment tool to be used in the training of clinicians. The U-CEP was developed in two formats: two sets of 25 questions and a combined set of 50. The validation was performed among postgraduate general practice (GP) trainees, hospital trainees, GP supervisors, and experts. Internal consistency, internal reliability (item-total correlation), item discrimination index, item difficulty, content validity, construct validity, responsiveness, test-retest reliability, and feasibility were assessed. The questionnaire was externally validated. Internal consistency was good with a Cronbach alpha of 0.8. The median item-total correlation and mean item discrimination index were satisfactory. Both sets were perceived as relevant to clinical practice. Construct validity was good. Both sets were responsive but failed on test-retest reliability. One set took 24 minutes and the other 33 minutes to complete, on average. External GP trainees had comparable results. The U-CEP is a valid questionnaire to assess knowledge on clinical epidemiology, which is a prerequisite for practicing evidence-based medicine in daily clinical practice. Copyright © 2016 Elsevier Inc. All rights reserved.
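The internal-consistency statistic reported in the record above (Cronbach alpha) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name and the small response matrix are hypothetical, and the sketch assumes complete data (every respondent answers every item) and a non-zero variance of total scores.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent; each row holds that
    respondent's score on every item (no missing values assumed).
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical answers of five trainees to four dichotomous items.
example = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(example)
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when items are unrelated, which is why it is read as evidence of internal consistency rather than of validity as such.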
The Validity and Responsiveness of Isometric Lower Body Multi-Joint Tests of Muscular Strength: a Systematic Review.

PubMed

Drake, David; Kennedy, Rodney; Wallace, Eric

2017-12-01

Researchers and practitioners working in sports medicine and science require valid tests to determine the effectiveness of interventions and enhance understanding of mechanisms underpinning adaptation. Such decision making is influenced by the supportive evidence describing the validity of tests within current research. The objective of this study is to review the validity of lower body isometric multi-joint tests' ability to assess muscular strength and determine the current level of supporting evidence. Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines were followed in a systematic fashion to search, assess and synthesize existing literature on this topic. Electronic databases such as Web of Science, CINAHL and PubMed were searched up to 18 March 2015. Potential inclusions were screened against eligibility criteria relating to types of test, measurement instrument, properties of validity assessed and population group and were required to be published in English. The Consensus-based Standards for the Selection of health Measurement Instruments (COSMIN) checklist was used to assess methodological quality and measurement property rating of included studies. Studies rated as fair or better in methodological quality were included in the best evidence synthesis. Fifty-nine studies met the eligibility criteria for quality appraisal. The ten studies that rated fair or better in methodological quality were included in the best evidence synthesis. The most frequently investigated lower body isometric multi-joint tests for validity were the isometric mid-thigh pull and isometric squat. The validity of each of these tests was strong in terms of reliability and construct validity. The evidence for responsiveness of tests was found to be moderate for the isometric squat test and unknown for the isometric mid-thigh pull. No tests using the isometric leg press met the criteria for inclusion in the best evidence synthesis. Researchers and
Examining construct and predictive validity of the Health-IT Usability Evaluation Scale: confirmatory factor analysis and structural equation modeling results

PubMed Central

Yen, Po-Yin; Sousa, Karen H; Bakken, Suzanne

2014-01-01

Background In a previous study, we developed the Health Information Technology Usability Evaluation Scale (Health-ITUES), which is designed to support customization at the item level. Such customization matches the specific tasks/expectations of a health IT system while retaining comparability at the construct level, and provides evidence of its factorial validity and internal consistency reliability through exploratory factor analysis. Objective In this study, we advanced the development of Health-ITUES to examine its construct validity and predictive validity. Methods The health IT system studied was a web-based communication system that supported nurse staffing and scheduling. Using Health-ITUES, we conducted a cross-sectional study to evaluate users’ perception toward the web-based communication system after system implementation. We examined Health-ITUES's construct validity through first and second order confirmatory factor analysis (CFA), and its predictive validity via structural equation modeling (SEM). Results The sample comprised 541 staff nurses in two healthcare organizations. The CFA (n=165) showed that a general usability factor accounted for 78.1%, 93.4%, 51.0%, and 39.9% of the explained variance in ‘Quality of Work Life’, ‘Perceived Usefulness’, ‘Perceived Ease of Use’, and ‘User Control’, respectively. The SEM (n=541) supported the predictive validity of Health-ITUES, explaining 64% of the variance in intention for system use. Conclusions The results of CFA and SEM provide additional evidence for the construct and predictive validity of Health-ITUES. The customizability of Health-ITUES has the potential to support comparisons at the construct level, while allowing variation at the item level. We also illustrate application of Health-ITUES across stages of system development. PMID:24567081

Are the claims made in orthopaedic print advertisements valid?

PubMed

Davidson, Donald J; Rankin, Kenneth S; Jensen, Cyrus D; Moverley, Robert; Reed, Mike R; Sprowson, Andrew P

2014-05-01

Advertisements are commonplace in orthopaedic journals and may influence the readership with claims of clinical and scientific fact. Since the last assessment of the claims made in orthopaedic print advertisements ten years ago, there have been legislative changes and media scrutiny which have shaped this practice. The purpose of this study is to re-evaluate these claims. Fifty claims from 50 advertisements were chosen randomly from six highly respected peer-reviewed orthopaedic journals (published July-December 2011). The evidence supporting each claim was assessed and validated by three orthopaedic surgeons. The assessors, blinded to product and company, rated the evidence and answered the following questions: Does the evidence as presented support the claim made in the advertisement and what is the quality of that evidence? Is the claim supported by enough evidence to influence your own clinical practice? Twenty-eight claims cited evidence from published literature, four from public presentations, 11 from manufacturer "data held on file" and seven had no supporting evidence. Only 12 claims were considered to have high-quality evidence and only 11 were considered well supported. A strong correlation was seen between the quality of evidence and strength of support (Spearman r = 0.945, p < 0.0001). The average ICC between the assessors' ratings was strong (r = 0.85) giving validity to the results. Orthopaedic surgeons must remain sceptical about the claims made in print advertisements. High-quality evidence is required by orthopaedic surgeons to influence clinical practice and this evidence should be sought by manufacturers wishing to market a successful product.
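The record above reports a Spearman correlation between quality of evidence and strength of support. As a rough illustration of how such a rank correlation is computed, the sketch below ranks both variables (ties receive the average rank) and then takes the Pearson correlation of the ranks. The function names and the ratings are hypothetical and are not the study's data; the sketch assumes neither variable is constant.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1               # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical assessor ratings for six claims (1 = poor, 5 = excellent).
quality = [1, 2, 2, 4, 5, 3]
support = [1, 1, 2, 4, 5, 4]
rho = spearman(quality, support)
```

Because only ranks enter the calculation, the coefficient is suited to ordinal ratings such as these, where the distance between adjacent categories is not assumed to be equal.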
Evidence-based medicine for every day, everyone, and every therapeutic study.

PubMed

Govindarajan, Raghav; Narayanaswami, Pushpa

2018-04-17

The rapid growth in published medical literature makes it difficult for clinicians to keep up with advances in their fields. This may result in a cursory scan of the abstract and conclusion of a study without critically evaluating study quality. The application of evidence-based medicine (EBM) is the process of converting the abstract task of reading the literature into a practical method of using the literature to inform care in a specific clinical context while simultaneously expanding one's knowledge. EBM involves 4 steps: (1) stating the clinical problem in a defined question; (2) searching the literature for the evidence; (3) critically appraising the evidence for its validity; and (4) applying the evidence in the context of the patient's situation, preferences, and values. In this review, we use the recently published trial of thymectomy in myasthenia gravis as an example and systematically go through the steps of assessing internal validity, precision, and external validity. Muscle Nerve, 2018. © 2018 Wiley Periodicals, Inc.
Beware of external validation! - A Comparative Study of Several Validation Techniques used in QSAR Modelling.

PubMed

Majumdar, Subhabrata; Basak, Subhash C

2018-04-26

Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has high value of p but small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out, K-fold, external and multi-split validation, using statistical models built using the LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright© Bentham Science Publishers.
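The contrast drawn in the record above, between a single external (hold-out) split and leave-one-out (LOO) validation, can be sketched in a few lines. This is only an illustration under stated assumptions: it swaps the LASSO model of the study for a one-predictor least-squares line to stay dependency-free, all function names are hypothetical, and the simulated data are not the study's datasets.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (stand-in for LASSO)."""
    if len(xs) < 2:
        raise ValueError("need at least two training points")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loo_mse(xs, ys):
    """Leave-one-out: refit without each point, then predict that point."""
    errors = []
    for i in range(len(xs)):
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(mse(model, [xs[i]], [ys[i]]))
    return sum(errors) / len(errors)


def external_mse(xs, ys, test_fraction, rng):
    """External validation: one random train/test split."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = max(1, int(len(xs) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return mse(model, [xs[i] for i in test], [ys[i] for i in test])


# Simulated small sample: y = 2x + 1 plus Gaussian noise (illustrative only).
rng = random.Random(0)
xs = [float(i) for i in range(12)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

loo_estimate = loo_mse(xs, ys)                       # one deterministic value
external_estimates = [external_mse(xs, ys, 0.25, rng) for _ in range(20)]
spread = max(external_estimates) - min(external_estimates)
```

LOO yields a single deterministic estimate for a given dataset, whereas the external estimate changes with every random split; inspecting `spread` on a small sample gives a feel for the split-to-split variation the abstract describes, without reproducing the paper's high-dimensional setting.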
The Evidence-based Practice Attitude Scale-36 (EBPAS-36): a brief and pragmatic measure of attitudes to evidence-based practice validated in US and Norwegian samples.

PubMed

Rye, Marte; Torres, Elisa M; Friborg, Oddgeir; Skre, Ingunn; Aarons, Gregory A

2017-04-04

Short and valid instruments for measuring factors facilitating or hindering implementation efforts are called for. This article describes (1) the adaptation of a shorter version of the Evidence-based Practice Attitude Scale (EBPAS-50 items), and (2) the psychometric properties of the shortened version in both US and Norwegian data. The US participants were mental health service providers (N = 418) recruited from clinics providing mental health services in San Diego County, California. The Norwegian participants were psychologists, psychiatric nurses, and psychology students (N = 838) recruited from the Norwegian Psychological Association and the Norwegian Nurses Organization. A confirmatory factor analysis (CFA) approach was used. The reduction resulted in 36 items named EBPAS-36, and the original 12 factor model was maintained. The EBPAS-36 had acceptable model fit, as indicated by a low degree of misspecification errors in both the US (RMSEA = .045 (CI 90% .040-.049); SRMR = .05) and the Norwegian data (RMSEA = .052 (CI 90% .047-.056); SRMR = .07). Incremental model fit was fair in the US (CFI = .93, TLI = .91) and in the Norwegian samples (CFI = .91, TLI = .89). The internal consistency (Cronbach's α) in the US and the Norwegian samples was good for the total EBPAS-36 score (.79 and .86, respectively) and ranged from adequate to excellent for the subscales (US .60-.91 and Norway .61-.92). The EBPAS-36 has adequate psychometric properties both in US and Norwegian samples, hence indicating cross-cultural validity. It is a brief, pragmatic, and more user-friendly instrument than the EBPAS-50, yet maintains a broad scope by retaining the original 12 measurement domains.
Depression Anxiety Stress Scale: is it valid for children and adolescents?

PubMed

Patrick, Jeff; Dyck, Murray; Bramston, Paul

2010-09-01

The Depression Anxiety Stress Scale (Lovibond & Lovibond, 1995) is used to assess the severity of symptoms in child and adolescent samples although its validity in these populations has not been demonstrated. The authors assessed the latent structure of the 21-item version of the scale in samples of 425 and 285 children and adolescents on two occasions, one year apart. On each occasion, parallel analyses suggested that only one component should be extracted, indicating that the test does not differentiate depression, anxiety, and stress in children and adolescents. The results provide additional evidence that adult models of depression do not describe the experience of depression in children and adolescents. (c) 2010 Wiley Periodicals, Inc.
The DSM-5 social anxiety disorder severity scale: Evidence of validity and reliability in a clinical sample.

PubMed

LeBeau, Richard T; Mesri, Bita; Craske, Michelle G

2016-10-30

With DSM-5, the APA began providing guidelines for anxiety disorder severity assessment that incorporates newly developed self-report scales. The scales share a common template, are brief, and are free of copyright restrictions. Initial validation studies have been promising, but the English-language versions of the scales have not been formally validated in clinical samples. Forty-seven individuals with a principal diagnosis of Social Anxiety Disorder (SAD) completed a diagnostic assessment, as well as the DSM-5 SAD severity scale and several previously validated measures. The scale demonstrated internal consistency, convergent validity, and discriminant validity. The next steps in the validation process are outlined. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
The Construct of the Learning Organization: Dimensions, Measurement, and Validation

ERIC Educational Resources Information Center

Yang, Baiyin; Watkins, Karen E.; Marsick, Victoria J.

2004-01-01

This research describes efforts to develop and validate a multidimensional measure of the learning organization. An instrument was developed based on a critical review of both the conceptualization and practice of this construct. Supporting validity evidence for the instrument was obtained from several sources, including best model-data fit among…
Cross-Cultural Adaptation and Validation of the SWAL-QoL Questionnaire in Greek.

PubMed

Georgopoulos, Voula C; Perdikogianni, Myrto; Mouskenteri, Myrto; Psychogiou, Loukia; Oikonomou, Maria; Malandraki, Georgia A

2018-02-01

The purpose of this study was to translate and adapt the 44-item SWAL-QoL into Greek and examine its internal consistency, test-retest reliability, external construct validity, and discriminant validity in order to provide a validated dysphagia-specific QoL instrument in the Greek language. The instrument was translated into Greek using back translation to ensure linguistic validity and was culturally adapted, resulting in the SWAL-QoL-GR. Two groups of participants were included: a patient group of 86 adults (48 males; age range: 18-87 years) diagnosed with oropharyngeal dysphagia, and an age-matched healthy control group (39 adults; 19 males; age range: 18-84 years). The Greek 30-item version of the WHOQOL-BREF was used for assessment of construct validity. Overall, the questionnaire achieved good to excellent psychometric values. Internal consistency of all 10 subscales and the physical symptoms scale of the SWAL-QoL-GR assessed by Cronbach's α was good to excellent (0.811 < α < 0.940). Test-retest validity was found to be good to excellent as well. In addition, moderate to strong correlations were found between seven of the ten subscales of the SWAL-QoL-GR with limited items of the WHOQOL-BREF (0.401 < ρ < 0.65), supporting good construct validity of the SWAL-QoL-GR. The SWAL-QoL-GR also correctly differentiated between patients with dysphagia and age-matched healthy controls (p < 0.001) on all 11 scales, further indicating excellent discriminant validity. Finally, no significant differences were found between the two sexes. This cultural adaptation and validation allows the use of this tool in Greece, further enhancing our clinical and scientific efforts to increase the evidence-based practice resources for dysphagia rehabilitation in Greece.
Critical validation studies of neurofeedback.

PubMed

Gruzelier, John; Egner, Tobias

2005-01-01

The field of neurofeedback training has proceeded largely without validation. In this article the authors review studies directed at validating sensory motor rhythm, beta and alpha-theta protocols for improving attention, memory, and music performance in healthy participants. Importantly, benefits were demonstrable with cognitive and neurophysiologic measures that were predicted on the basis of regression models of learning to enhance sensory motor rhythm and beta activity. The first evidence of operant control over the alpha-theta ratio is provided, together with remarkable improvements in artistic aspects of music performance equivalent to two class grades in conservatory students. These are initial steps in providing a much needed scientific basis to neurofeedback.
Validating presupposed versus focused text information.

PubMed

Singer, Murray; Solar, Kevin G; Spear, Jackie

2017-04-01

There is extensive evidence that readers continually validate discourse accuracy and congruence, but that they may also overlook conspicuous text contradictions. Validation may be thwarted when the inaccurate ideas are embedded sentence presuppositions. In four experiments, we examined readers' validation of presupposed ("given") versus new text information. Throughout, a critical concept, such as a truck versus a bus, was introduced early in a narrative. Later, a character stated or thought something about the truck, which therefore matched or mismatched its antecedent. Furthermore, truck was presented as either given or new information. Mismatch target reading times uniformly exceeded the matching ones by similar magnitudes for given and new concepts. We obtained this outcome using different grammatical constructions and with different antecedent-target distances. In Experiment 4, we examined only given critical ideas, but varied both their matching and the main verb's factivity (e.g., factive know vs. nonfactive think). The Match × Factivity interaction closely resembled that previously observed for new target information (Singer, 2006). Thus, readers can successfully validate given target information. Although contemporary theories tend to emphasize either deficient or successful validation, both types of theory can accommodate the discourse and reader variables that may regulate validation.
Distinguishing between debris flows and floods from field evidence in small watersheds

USGS Publications Warehouse

Pierson, Thomas C.

2005-01-01

Post-flood indirect measurement techniques to back-calculate flood magnitude are not valid for debris flows, which commonly occur in small steep watersheds during intense rainstorms. This is because debris flows can move much faster than floods in steep channel reaches and much slower than floods in low-gradient reaches. In addition, debris-flow deposition may drastically alter channel geometry in reaches where slope-area surveys are applied. Because high-discharge flows are seldom witnessed and automated samplers are commonly plugged or destroyed, determination of flow type often must be made on the basis of field evidence preserved at the site.
Validation of the Intelligibility in Context Scale for Jamaican Creole-Speaking Preschoolers.

PubMed

Washington, Karla N; McDonald, Megan M; McLeod, Sharynne; Crowe, Kathryn; Devonish, Hubert

2017-08-15

To describe validation of the Intelligibility in Context Scale (ICS; McLeod, Harrison, & McCormack, 2012a) and ICS-Jamaican Creole (ICS-JC; McLeod, Harrison, & McCormack, 2012b) in a sample of typically developing 3- to 6-year-old Jamaicans. One-hundred and forty-five preschooler-parent dyads participated in the study. Parents completed the 7-item ICS (n = 145) and ICS-JC (n = 98) to rate children's speech intelligibility (5-point scale) across communication partners (parents, immediate family, extended family, friends, acquaintances, strangers). Preschoolers completed the Diagnostic Evaluation of Articulation and Phonology (DEAP; Dodd, Hua, Crosbie, Holm, & Ozanne, 2006) in English and Jamaican Creole to establish speech-sound competency. For this sample, we examined validity and reliability (interrater, test-retest, internal consistency) evidence using measures of speech-sound production: (a) percentage of consonants correct, (b) percentage of vowels correct, and (c) percentage of phonemes correct. ICS and ICS-JC ratings showed preschoolers were always (5) to usually (4) understood across communication partners (ICS, M = 4.43; ICS-JC, M = 4.50). Both tools demonstrated excellent internal consistency (α = .91), high interrater, and test-retest reliability. Significant correlations between the two tools and between each measure and language-specific percentage of consonants correct, percentage of vowels correct, and percentage of phonemes correct provided criterion-validity evidence. A positive correlation between the ICS and age further strengthened validity evidence for that measure. Both tools show promising evidence of reliability and validity in describing functional speech intelligibility for this group of typically developing Jamaican preschoolers.
Expanding the domains of attitudes towards evidence-based practice: the evidence based practice attitude scale-50.

PubMed

Aarons, Gregory A; Cafri, Guy; Lugo, Lindsay; Sawitzky, Angelina

2012-09-01

Mental health and social service provider attitudes toward evidence-based practice have been measured through the development and validation of the Evidence-Based Practice Attitude Scale (EBPAS; Aarons, Ment Health Serv Res 6(2):61-74, 2004). Scores on the EBPAS scales are related to provider demographic characteristics, organizational characteristics, and leadership. However, the EBPAS assesses only four domains of attitudes toward EBP. The current study expands and further identifies additional domains of attitudes towards evidence-based practice. A qualitative and quantitative mixed-methods approach was used to: (1) generate items from multiple sources (researcher, mental health program manager, clinician/therapist), (2) identify potential content domains, and (3) examine the preliminary domains and factor structure through exploratory factor analysis. Participants for item generation included the investigative team, a group of mental health program managers (n = 6), and a group of clinicians/therapists (n = 8). For quantitative analyses a sample of 422 mental health service providers from 65 outpatient programs in San Diego County completed a survey that included the new items. Eight new EBPAS factors comprising 35 items were identified. Factor loadings were moderate to large and internal consistency reliabilities were fair to excellent. We found that the convergence of these factors with the four previously identified evidence-based practice attitude factors (15 items) was small to moderate, suggesting that the newly identified factors represent distinct dimensions of mental health and social service provider attitudes toward adopting EBP. Combining the original 15 items with the 35 new items comprises the EBPAS 50-item version (EBPAS-50) that adds to our understanding of provider attitudes toward adopting EBPs. Directions for future research are discussed.
Evidence based vaccinology.

PubMed

Nalin, David R

2002-02-22

Evidence based vaccinology (EBV) is the identification and use of the best evidence in making and implementing decisions during all of the stages of the life of a vaccine, including pre-licensure vaccine development and post-licensure manufacture and research, and utilization of the vaccine for disease control. Vaccines, unlike most pharmaceuticals, are in a continuous process of development both before and after licensure. Changes in biologics manufacturing technology and changes that vaccines induce in population and disease biology lead to periodic review of regimens (and sometimes dosage) based on changing immunologic data or public perceptions relevant to vaccine safety and effectiveness. EBV includes the use of evidence based medicine (EBM) both in clinical trials and in national disease containment programs. The rationale for EBV is that the highest evidentiary standards are required to maintain a rigorous scientific basis of vaccine quality control in manufacture and to ensure valid determination of vaccine efficacy, field effectiveness and safety profiles (including post-licensure safety monitoring), cost-benefit analyses, and risk:benefit ratios. EBV is increasingly based on statistically validated, clearly defined laboratory, manufacturing, clinical and epidemiological research methods and procedures, codified as good laboratory practices (GLP), good manufacturing practices (GMP), good clinical research practices (GCRP) and in clinical and public health practice (good vaccination practices, GVP). Implementation demands many data-driven decisions made by a spectrum of specialists pre- and post-licensure, and is essential to maintaining public confidence in vaccines.
Validation methodology in publications describing epidemiological registration methods of dental caries: a systematic review.

PubMed

Sjögren, P; Ordell, S; Halling, A

2003-12-01

The aim was to describe and systematically review the methodology and reporting of validation in publications describing epidemiological registration methods for dental caries. BASIC RESEARCH METHODOLOGY: Literature searches were conducted in six scientific databases. All publications fulfilling the predetermined inclusion criteria were assessed for methodology and reporting of validation using a checklist including items described previously as well as new items. The frequency of endorsement of the assessed items was analysed. Moreover, the type and strength of evidence were evaluated. Reporting of predetermined items relating to methodology of validation and the frequency of endorsement of the assessed items were of primary interest. Initially 588 publications were located. 74 eligible publications were obtained, 23 of which fulfilled the inclusion criteria and remained throughout the analyses. A majority of the studies reported the methodology of validation. The reported methodology of validation was generally inadequate, according to the recommendations of evidence-based medicine. The frequencies of reporting the assessed items (frequencies of endorsement) ranged from four to 84 per cent. A majority of the publications contributed to a low strength of evidence. There seems to be a need to improve the methodology and the reporting of validation in publications describing professionally registered caries epidemiology. Four of the items assessed in this study are potentially discriminative for quality assessments of reported validation.
Implementation and Initial Validation of the APS English Test [and] The APS English-Writing Test at Golden West College: Evidence for Predictive Validity.

ERIC Educational Resources Information Center

Isonio, Steven

In May 1991, Golden West College (California) conducted a validation study of the English portion of the Assessment and Placement Services for Community Colleges (APS), followed by a predictive validity study in July 1991. The initial study was designed to aid in the implementation of the new test at GWC by comparing data on APS use at other…
Factorial validity of the Movement Assessment Battery for Children-2 (age band 2).

PubMed

Wagner, Matthias Oliver; Kastner, Julia; Petermann, Franz; Bös, Klaus

2011-01-01

The Movement Assessment Battery for Children-2 (M-ABC-2) is one of the most commonly used tests for the diagnosis of specific developmental disorders of motor function (F82). The M-ABC-2 comprises eight subtests per age band (AB) that are assigned to three dimensions: manual dexterity, aiming and catching, and balance. However, while previous exploratory findings suggested the correctness of the assumption of factorial validity, there is no empirical evidence that the M-ABC-2 subtests allow for a valid reproduction of the postulated factorial structure. The purpose of this study was to empirically confirm the factorial validity of the M-ABC-2. The German normative sample of AB2 (7-10 years; N=323) was used as the study sample for the empirical analyses. Confirmatory factor analysis was used to verify the factorial validity of the M-ABC-2 (AB2). The incremental fit indices (χ2=28.675; df=17; Bollen-Stine p value=0.318; RMSEA=0.046 [0.011-0.075]; SRMR=0.038; CFI=0.960) provided evidence for the factorial validity of the M-ABC-2 (AB2). However, because of a lack of empirical verification for convergent and discriminant validity, there is still no evidence that F82 can be diagnosed using M-ABC-2 (AB2). Copyright © 2010 Elsevier Ltd. All rights reserved.
Validity and reliability of the Spanish version of the Organizational Readiness for Knowledge Translation (OR4KT) questionnaire.

PubMed

Grandes, Gonzalo; Bully, Paola; Martinez, Catalina; Gagnon, Marie-Pierre

2017-11-10

Organizational readiness to change healthcare practice is a major determinant of successful implementation of evidence-based interventions. However, we lack comprehensive, valid, and reliable instruments to measure it. We assessed the validity and reliability of the Spanish version of the Organizational Readiness for Knowledge Translation (OR4KT) questionnaire in the context of the implementation of the Prescribe Vida Saludable III project, which seeks to strengthen health promotion and chronic disease prevention in primary healthcare organizations of the Osakidetza (Basque Health Service, Spain). A cross-sectional study was conducted including 127 professionals from 20 primary care centers within Osakidetza. They filled in the OR4KT questionnaire twice in a 15- to 30-day period to test repeatability. In addition, we used the Survey of Organizational Attributes for Primary Care (SOAPC) and we documented the number of healthcare professionals who formally engaged in the Prescribe Vida Saludable III project within each participating center to assess concurrent validity. Cronbach's alpha for the overall OR4KT was .95, and the overall repeatability coefficient was 6.95%, both excellent results. Confirmatory factor analysis supported the underlying theoretical structure of 6 dimensions and 23 sub-dimensions. There were positive moderate-to-high internal correlations between these six dimensions, and there was evidence of good concurrent validity (correlation coefficient of .76 with SOAPC, and .80 with the proportion of professionals engaged by center). A score higher than 64 (out of 100) would be indicative of an organization with a high level of readiness to implement the intervention (sensitivity = .75, specificity = 1). The Spanish version of the OR4KT exhibits very strong reliability and good validity, although it needs to be validated in a larger sample and in different implementation contexts.
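Several records in this listing report Cronbach's alpha as their internal consistency statistic. The following is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), and is not taken from any of the cited studies. The function name `cronbach_alpha` and the toy response matrices are illustrative assumptions.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    `scores` is a list of rows, one row per respondent, one column per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three respondents answering three perfectly consistent items.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(consistent))
```

Values near 1 indicate that the items vary together across respondents; the formula can go negative when items disagree, which usually signals a scoring or keying problem rather than a meaningful reliability.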
Demonstrating the validity of three general scores of PET in predicting higher education achievement in Israel.

PubMed

Oren, Carmel; Kennet-Cohen, Tamar; Turvall, Elliot; Allalouf, Avi

2014-01-01

The Psychometric Entrance Test (PET), used for admission to higher education in Israel together with the Matriculation (Bagrut), had in the past one general (total) score in which the weights for its domains (Verbal, Quantitative, and English) were 2:2:1, respectively. In 2011, two additional total scores were introduced, with different weights for the Verbal and the Quantitative domains. This study compares the predictive validity of the three general scores of PET, and demonstrates validity in terms of utility. The sample comprised 100,863 freshman students from all Israeli universities in the classes of 2005-2009. Regression weights and correlations of the predictors with FYGPA were computed. Simulations based on these results supplied the utility estimates. On average, PET is slightly more predictive than the Bagrut; using them both yields a better tool than either of them alone. Assigning differential weights to the components in the respective schools further improves the validity. The introduction of the new general scores of PET is validated by gathering and analyzing evidence based on relations of test scores to other variables. The utility of using the test can be demonstrated in ways different from correlations.
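The 2:2:1 weighting described in this abstract can be illustrated with a small sketch. The abstract does not say how PET combines or rescales domain scores, so the weighted mean below is an assumption, and the function name `weighted_total` and the example domain scores are hypothetical.

```python
def weighted_total(domain_scores, weights):
    """Weighted mean of domain scores.

    `domain_scores` and `weights` are dicts keyed by domain name.
    This is an illustrative combination rule, not the official PET scoring.
    """
    total_weight = sum(weights.values())
    return sum(domain_scores[d] * weights[d] for d in weights) / total_weight


# Weights stated in the abstract for the original general score.
original_weights = {"verbal": 2, "quantitative": 2, "english": 1}

# Hypothetical domain scores for one examinee.
examinee = {"verbal": 100, "quantitative": 120, "english": 110}

print(weighted_total(examinee, original_weights))
```

Passing a different `weights` dict is enough to model a score that emphasizes one domain over another; the weights of the two scores introduced in 2011 are not given in the abstract and are therefore not reproduced here.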
Social Skills Questionnaire for Argentinean College Students (SSQ-U) Development and Validation.

PubMed

Morán, Valeria E; Olaz, Fabián O; Del Prette, Zilda A P

2015-11-27

In this paper we present a new instrument called Social Skills Questionnaire for Argentinean College Students (SSQ-U). Based on the adapted version of the Social Skills Inventory - Del Prette (SSI-Del Prette) (Olaz, Medrano, Greco, & Del Prette, 2009), we wrote new items for the scale, and carried out psychometric analysis to assess the validity and reliability of the instrument. In the first study, we collected evidence based on test content through expert judges who evaluated the quality and the relevance of the items. In the second and third studies, we provided validity evidence based on the internal structure of the instrument using exploratory (n = 1067) and confirmatory (n = 661) factor analysis. Results suggested a five-factor structure consistent with the dimensions of social skills, as proposed by Kelly (2002). The fit indexes corresponding to the obtained model were adequate, and composite reliability coefficients of each factor were excellent (above .75). Finally, in the fourth study, we provided evidence of convergent and discriminant validity. The obtained results allow us to conclude that the SSQ-U is the first valid and reliable instrument for measuring social skills in Argentinean college students.

PLCO Ovarian Phase III Validation Study — EDRN Public Portal

Cancer.gov

Our preliminary data indicate that the performance of CA 125 as a screening test for ovarian cancer can be improved upon by additional biomarkers. With completion of one additional validation step, we will be ready to test the performance of a consensus marker panel in a phase III validation study. Given the original aims of the PLCO trial, we believe that the PLCO represents an ideal longitudinal cohort offering specimens for phase III validation of ovarian cancer biomarkers.
Validation and adaptation of the hospital consumer assessment of healthcare providers and systems in Arabic context: Evidence from Saudi Arabia.

PubMed

Alanazi, Mohammed R; Alamry, Ahmed; Al-Surimi, Khaled

One of the main purposes of healthcare organizations is to serve patients by providing safe and high-quality patient-centered care. Patients are considered the most appropriate source to assess the quality level of healthcare services. The objectives of this paper were to describe the translation and adaptation process of the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey for Arabic speaking populations, examine the degree of equivalence between the original English version and the Arabic translated version, and estimate and report the validity and reliability of the translated Arabic HCAHPS version. The translation process had four main steps: (1) qualified bilingual translators translated the HCAHPS from English to Arabic; (2) the Arabic version was translated back to English and reviewed by experts to ensure content accuracy (content equivalence); (3) both Arabic and English versions were verified for accuracy and validity of the translation, checking for the similarities and differences (semantic equivalence); (4) finally, two independent bilinguals reviewed and made the final revision of both the Arabic and English versions separately and agreed on one final version that is similar and equivalent to the original English version in terms of content and meaning. The study findings showed that the overall Cronbach's α for the Arabic HCAHPS version was 0.90, showing good internal consistency across the 9 separate domains, which ranged from 0.70 to 0.97 Cronbach's α. The correlation coefficient between each statement for each separate domain revealed a highly positive significant correlation ranging from 0.72 to 0.89. The results of the study show empirical evidence of validity and reliability of HCAHPS in its Arabic version. Moreover, the Arabic version of HCAHPS in our study presented good internal consistency and it is highly recommended to be replicated and applied in the context of other Arab countries. Copyright © 2017
The predictive validity of a situational judgement test, a clinical problem solving test and the core medical training selection methods for performance in specialty training.

PubMed

Patterson, Fiona; Lopes, Safiatu; Harding, Stephen; Vaux, Emma; Berkin, Liz; Black, David

2017-02-01

The aim of this study was to follow up a sample of physicians who began core medical training (CMT) in 2009. This paper examines the long-term validity of CMT and GP selection methods in predicting performance in the Membership of Royal College of Physicians (MRCP(UK)) examinations. We performed a longitudinal study, examining the extent to which the GP and CMT selection methods (T1) predict performance in the MRCP(UK) examinations (T2). A total of 2,569 applicants from 2008-09 who completed CMT and GP selection methods were included in the study. Looking at MRCP(UK) part 1, part 2 written and PACES scores, both CMT and GP selection methods show evidence of predictive validity for the outcome variables, and hierarchical regressions show the GP methods add significant value to the CMT selection process. CMT selection methods predict performance in important outcomes and have good evidence of validity; the GP methods may have an additional role alongside the CMT selection methods. © Royal College of Physicians 2017. All rights reserved.
Development and Validation of the Narrative Quality Assessment Tool.

PubMed

Kim, Wonsun Sunny; Shin, Cha-Nam; Kathryn Larkey, Linda; Roe, Denise J

2017-04-01

The use of storytelling in health promotion has grown over the past 2 decades, showing promise for moving people to initiate healthy behavior change. Given the increasingly prevalent role of storytelling in health promotion research and the need to more clearly identify what storytelling elements and mediators may better predict behavior change, there is a need to develop measures to specifically assess these factors in a cultural community context. The purpose of this study is to develop and preliminarily validate a narrative quality assessment tool for measuring elements of storytelling that are predicted to affect attitude and behavior change (i.e., narrative characteristics, identification, and transportation) within a cultural community setting using a culture-centric model. Reliability and validity of these scales were assessed with repeated administrations among 74 Latino men and women with a mean age of 39.6 years (SD = 11.47 years). The confirmatory factor analysis in addition to internal consistency tests revealed preliminary evidence for reliability and validity of the narrative characteristics, identification, and transportation scales. Cronbach's alpha ranged from .92 to .94. Items revealed adequate factor loadings (.85-.98) and good model fit. The new scales provide the first step in moving the assessment of narrative quality into a culturally relevant context for evaluation of story use in health promotion. The results present valuable information for nurse researchers to guide the development and testing of culturally grounded storytelling interventions' potential to predict attitude and behavior change for patients.
A systematic review of the reliability and validity of discrete choice experiments in valuing non-market environmental goods.

PubMed

Rakotonarivo, O Sarobidy; Schaafsma, Marije; Hockley, Neal

2016-12-01

While discrete choice experiments (DCEs) are increasingly used in the field of environmental valuation, they remain controversial because of their hypothetical nature and the contested reliability and validity of their results. We systematically reviewed evidence on the validity and reliability of environmental DCEs from the past thirteen years (Jan 2003-February 2016). 107 articles met our inclusion criteria. These studies provide limited and mixed evidence of the reliability and validity of DCE. Valuation results were susceptible to small changes in survey design in 45% of outcomes reporting reliability measures. DCE results were generally consistent with those of other stated preference techniques (convergent validity), but hypothetical bias was common. Evidence supporting theoretical validity (consistency with assumptions of rational choice theory) was limited. In content validity tests, 2-90% of respondents protested against a feature of the survey, and a considerable proportion found DCEs to be incomprehensible or inconsequential (17-40% and 10-62% respectively). DCE remains useful for non-market valuation, but its results should be used with caution. Given the sparse and inconclusive evidence base, we recommend that tests of reliability and validity are more routinely integrated into DCE studies and suggest how this might be achieved. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Validity Evidence for the Security Scale as a Measure of Perceived Attachment Security in Adolescence

ERIC Educational Resources Information Center

Van Ryzin, Mark J.; Leve, Leslie D.

2012-01-01

In this study, the validity of a self-report measure of children's perceived attachment security (the Kerns Security Scale) was tested using adolescents. With regards to predictive validity, the Security Scale was significantly associated with (1) observed mother-adolescent interactions during conflict and (2) parent- and teacher-rated social…
Stakeholder validation of a model of readiness for transition to adult care.

PubMed

Schwartz, Lisa A; Brumley, Lauren D; Tuchman, Lisa K; Barakat, Lamia P; Hobbie, Wendy L; Ginsberg, Jill P; Daniel, Lauren C; Kazak, Anne E; Bevans, Katherine; Deatrick, Janet A

2013-10-01

That too few youth with special health care needs make the transition to adult-oriented health care successfully may be due, in part, to lack of readiness to transfer care. There is a lack of theoretical models to guide development and implementation of evidence-based guidelines, assessments, and interventions to improve transition readiness. To further validate the Social-ecological Model of Adolescent and Young Adult Readiness to Transition (SMART) via feedback from stakeholders (patients, parents, and providers) from a medically diverse population in need of life-long follow-up care, survivors of childhood cancer. Mixed-methods participatory research design. A large Mid-Atlantic children's hospital. Adolescent and young adult survivors of childhood cancer (n = 14), parents (n = 18), and pediatric providers (n = 10). Patients and parents participated in focus groups; providers participated in individual semi-structured interviews. Validity of SMART was assessed 3 ways: (1) ratings on importance of SMART components for transition readiness using a 5-point scale (0-4; ratings >2 support validity), (2) nominations of 3 "most important" components, and (3) directed content analysis of focus group/interview transcripts. Qualitative data supported the validity of SMART, with minor modifications to definitions of components. Quantitative ratings met criteria for validity; stakeholders endorsed all components of SMART as important for transition. No additional SMART variables were suggested by stakeholders and the "most important" components varied by stakeholders, thus supporting the comprehensiveness of SMART and need to involve multiple perspectives. SMART represents a comprehensive and empirically validated framework for transition research and program planning, supported by survivors of childhood cancer, parents, and pediatric providers. Future research should validate SMART among other populations with special health care needs.
Reliability and validity of the parent efficacy for child healthy weight behaviour (PECHWB) scale.

PubMed

Palmer, F; Davis, M C

2014-05-01

Interventions for childhood overweight and obesity that target parents as the agents of change by increasing parent self-efficacy for facilitating their child's healthy weight behaviours require a reliable and valid tool to measure parent self-efficacy before and after interventions. Nelson and Davis developed the Parent Efficacy for Child Healthy Weight Behaviour (PECHWB) scale with good preliminary evidence of reliability and validity. The aim of this research was to provide further psychometric evidence from an independent Australian sample. Data were provided by a convenience sample of 261 primary caregivers of children aged 4-17 years via an online survey. PECHWB scores were correlated with scores on other self-report measures of parenting efficacy and 2- to 4-week test-retest reliability of the PECHWB was assessed. The results of the study confirmed the four-factor structure of the PECHWB (Fat and Sugar, Sedentary Behaviours, Physical Activity, and Fruit and Vegetables) and provided strong evidence of internal consistency and test-retest reliability, as well as good evidence of convergent validity. Future research should investigate the properties of the PECHWB in a sample of parents of overweight or obese children, including measures of child weight and actual child healthy weight behaviours to provide evidence of the concurrent and predictive validity of PECHWB scores. © 2013 John Wiley & Sons Ltd.
20 CFR 220.14 - Weighing of evidence.

Code of Federal Regulations, 2013 CFR

2013-04-01

... capacity evaluation is based upon functional objective tests with high validity and reliability; (2) The... exam findings which is indicative of exaggerated or potential malingering response; (6) The evidence...
20 CFR 220.14 - Weighing of evidence.

Code of Federal Regulations, 2011 CFR

2011-04-01

... capacity evaluation is based upon functional objective tests with high validity and reliability; (2) The... exam findings which is indicative of exaggerated or potential malingering response; (6) The evidence...
20 CFR 220.14 - Weighing of evidence.

Code of Federal Regulations, 2014 CFR

2014-04-01

... capacity evaluation is based upon functional objective tests with high validity and reliability; (2) The... exam findings which is indicative of exaggerated or potential malingering response; (6) The evidence...
20 CFR 220.14 - Weighing of evidence.

Code of Federal Regulations, 2012 CFR

2012-04-01

... capacity evaluation is based upon functional objective tests with high validity and reliability; (2) The... exam findings which is indicative of exaggerated or potential malingering response; (6) The evidence...
Understanding the State of the Art for Measurement in Chemistry Education Research: Examining the Psychometric Evidence

ERIC Educational Resources Information Center

Arjoon, Janelle A.; Xu, Xiaoying; Lewis, Jennifer E.

2013-01-01

education community are relatively new. Because psychometric evidence dictates the validity of interpretations made from test scores, gathering and reporting validity and reliability evidence is of utmost importance. Therefore, the purpose of this study was to investigate what…
Policy and Validity Prospects for Performance-Based Assessment.

ERIC Educational Resources Information Center

Baker, Eva L.; And Others

1994-01-01

This article describes performance-based assessment as expounded by its proponents, comments on these conceptions, reviews evidence regarding the technical quality of performance-based assessment, and considers its validity under various policy options. (JDD)
Content validation of terms and definitions in a wound glossary.

PubMed

Milne, Catherine T; Paine, Tim; Sullivan, Valerie; Sawyer, Allen

2011-12-01

A common language and lexicon provide the easiest means of mutual understanding. Inconsistency in terminology makes effective information exchange difficult. Previous studies identified the need to determine standard, accepted definitions for the vocabulary frequently used in wound care. The objective of this study was to establish content validation for these terms and develop an evidence-based glossary for this specialty. Members of the Association for the Advancement of Wound Care Quality of Care Task Force reviewed literature to determine glossary content generation and the associated literature-based definitions. Thirty-nine wound care professionals from wound care stakeholder professional organizations in the United States and Canada participated in the content validation process. Participants were asked to quantify the degree of validity using a 367-item, 4-point Likert-type scale. On a scale of 1 to 4, the mean score of the entire instrument was 3.84. The instrument's overall scale content validity index was 0.96. Terms with an item content validity index of less than 0.70 were removed from the glossary, leaving 365 items with established content validity. Qualitative data analysis revealed themes suggesting that enhanced communication between providers improves patient outcomes. The need for ongoing updates of the glossary was also identified. The wound care glossary in its finalized form proved valid. An evidence-based glossary bridges the chasm of miscommunication and nonstandardization so that wound care, as an emerging specialized medical science field, can move forward to optimize both process and clinical outcomes.
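The content validity indices reported in this abstract can be sketched as follows. The abstract does not state how the indices were computed, so this assumes the common convention in which an item content validity index (I-CVI) is the proportion of raters scoring the item 3 or 4 on the 4-point scale, and the scale index is the mean of the I-CVIs. The function names and the example terms and ratings are hypothetical.

```python
def item_cvi(ratings):
    """Proportion of raters who scored the item 3 or 4 on a 1-4 scale."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def scale_cvi(items):
    """Mean of the item-level indices (the averaging form of the scale CVI)."""
    indices = [item_cvi(ratings) for ratings in items.values()]
    return sum(indices) / len(indices)


def retained_terms(items, threshold=0.70):
    """Terms whose I-CVI meets the threshold; the rest would be dropped."""
    return [term for term, ratings in items.items() if item_cvi(ratings) >= threshold]


# Hypothetical ratings from four raters for three glossary terms.
ratings_by_term = {
    "eschar": [4, 4, 3, 4],
    "slough": [4, 3, 2, 1],
    "maceration": [2, 2, 1, 3],
}

print(retained_terms(ratings_by_term))
print(scale_cvi(ratings_by_term))
```

The default threshold of 0.70 mirrors the cutoff quoted in the abstract; everything else about the procedure is an assumption made for illustration.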
Validation of the Social Appearance Anxiety Scale: factor, convergent, and divergent validity.

PubMed

Levinson, Cheri A; Rodebaugh, Thomas L

2011-09-01

The Social Appearance Anxiety Scale (SAAS) was created to assess fear of overall appearance evaluation. Initial psychometric work indicated that the measure had a single-factor structure and exhibited excellent internal consistency, test-retest reliability, and convergent validity. In the current study, the authors further examined the factor, convergent, and divergent validity of the SAAS in two samples of undergraduates. In Study 1 (N = 323), the authors tested the factor structure, convergent, and divergent validity of the SAAS with measures of the Big Five personality traits, negative affect, fear of negative evaluation, and social interaction anxiety. In Study 2 (N = 118), participants completed a body evaluation that included measurements of height, weight, and body fat content. The SAAS exhibited excellent convergent and divergent validity with self-report measures (i.e., self-esteem, trait anxiety, ethnic identity, and sympathy), predicted state anxiety experienced during the body evaluation, and predicted body fat content. In both studies, results confirmed a single-factor structure as the best fit to the data. These results lend additional support for the use of the SAAS as a valid measure of social appearance anxiety.
QSAR prediction of additive and non-additive mixture toxicities of antibiotics and pesticide.

PubMed

Qin, Li-Tang; Chen, Yu-Han; Zhang, Xin; Mo, Ling-Yun; Zeng, Hong-Hu; Liang, Yan-Peng

2018-05-01

Antibiotics and pesticides may exist as a mixture in the real environment. The combined effect of a mixture can be either additive or non-additive (synergism and antagonism). However, no effective approach exists for predicting the synergistic and antagonistic toxicities of mixtures. In this study, we developed a quantitative structure-activity relationship (QSAR) model for the toxicities (half effect concentration, EC50) of 45 binary and multi-component mixtures composed of two antibiotics and four pesticides. The acute toxicities of single compounds and mixtures toward Aliivibrio fischeri were tested. A genetic algorithm was used to obtain the optimized model with three theoretical descriptors. Various internal and external validation techniques indicated that the coefficient of determination of 0.9366 and root mean square error of 0.1345 for the QSAR model predicted that 45 mixture toxicities presented additive, synergistic, and antagonistic effects. Compared with the traditional concentration additive and independent action models, the QSAR model exhibited an advantage in predicting mixture toxicity. Thus, the presented approach may be able to fill the gaps in predicting non-additive toxicities of binary and multi-component mixtures. Copyright © 2018 Elsevier Ltd. All rights reserved.
Translations of Developmental Screening Instruments: An Evidence Map of Available Research.

PubMed

El-Behadli, Ana F; Neger, Emily N; Perrin, Ellen C; Sheldrick, R Christopher

2015-01-01

Children whose parents do not speak English experience significant disparities in the identification of developmental delays and disorders; however, little is known about the availability and validity of translations of developmental screeners. The goal was to create a map of the scientific evidence regarding translations of the 9 Academy of Pediatrics-recommended screening instruments into languages other than English. The authors conducted a systematic search of Medline and PsycINFO, references of identified articles, publishers' Web sites, and official manuals. Through evidence mapping, a new methodology supported by AHRQ and the Cochrane Collaboration, the authors documented the extent and distribution of published evidence supporting translations of developmental screeners. Data extraction focused on 3 steps of the translation and validation process: (1) translation methods used, (2) collection of normative data in the target language, and (3) evidence for reliability and validity. The authors identified 63 distinct translations among the 9 screeners, of which 44 had supporting evidence published in peer-reviewed sources. Of the 63 translations, 35 had at least some published evidence regarding translation methods used, 28 involving normative data, and 32 regarding reliability and/or construct validity. One-third of the translations found were of the Denver Developmental Screening Test. Specific methods used varied greatly across screeners, as did the level of detail with which results were reported. Few developmental screeners have been translated into many languages. The authors' evidence map demonstrates considerable variation in both the amount and the comprehensiveness of information available about translated instruments. Informal guidelines exist for conducting translation of psychometric instruments but not for documentation of this process. The authors propose that uniform guidelines be established for reporting translation research in peer
Examining construct and predictive validity of the Health-IT Usability Evaluation Scale: confirmatory factor analysis and structural equation modeling results.

PubMed

Yen, Po-Yin; Sousa, Karen H; Bakken, Suzanne

2014-10-01

In a previous study, we developed the Health Information Technology Usability Evaluation Scale (Health-ITUES), which is designed to support customization at the item level. Such customization matches the specific tasks/expectations of a health IT system while retaining comparability at the construct level, and provides evidence of its factorial validity and internal consistency reliability through exploratory factor analysis. In this study, we advanced the development of Health-ITUES to examine its construct validity and predictive validity. The health IT system studied was a web-based communication system that supported nurse staffing and scheduling. Using Health-ITUES, we conducted a cross-sectional study to evaluate users' perception toward the web-based communication system after system implementation. We examined Health-ITUES's construct validity through first and second order confirmatory factor analysis (CFA), and its predictive validity via structural equation modeling (SEM). The sample comprised 541 staff nurses in two healthcare organizations. The CFA (n=165) showed that a general usability factor accounted for 78.1%, 93.4%, 51.0%, and 39.9% of the explained variance in 'Quality of Work Life', 'Perceived Usefulness', 'Perceived Ease of Use', and 'User Control', respectively. The SEM (n=541) supported the predictive validity of Health-ITUES, explaining 64% of the variance in intention for system use. The results of CFA and SEM provide additional evidence for the construct and predictive validity of Health-ITUES. The customizability of Health-ITUES has the potential to support comparisons at the construct level, while allowing variation at the item level. We also illustrate application of Health-ITUES across stages of system development. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Measuring intra-operative decision-making during laparoscopic cholecystectomy: validity evidence for a novel interactive Web-based assessment tool.

PubMed

Madani, Amin; Watanabe, Yusuke; Bilgic, Elif; Pucher, Philip H; Vassiliou, Melina C; Aggarwal, Rajesh; Fried, Gerald M; Mitmaker, Elliot J; Feldman, Liane S

2017-03-01

Errors in judgment during laparoscopic cholecystectomy can lead to bile duct injuries and other complications. Despite correlations between outcomes, expertise and advanced cognitive skills, current methods to evaluate these skills remain subjective, rater- and situation-dependent and non-systematic. The purpose of this study was to develop objective metrics using a Web-based platform and to obtain validity evidence for their assessment of decision-making during laparoscopic cholecystectomy. An interactive online learning platform was developed (www.thinklikeasurgeon.com). Trainees and surgeons from six institutions completed a 12-item assessment, developed based on a cognitive task analysis. Five items required subjects to draw their answer on the surgical field, and accuracy scores were calculated based on an algorithm derived from experts' responses ("visual concordance test", VCT). Test-retest reliability, internal consistency, and correlation with self-reported experience, Global Operative Assessment of Laparoscopic Skills (GOALS) score and Objective Performance Rating Scale (OPRS) score were calculated. Questionnaires were administered to evaluate the platform's usability, feasibility and educational value. Thirty-nine subjects (17 surgeons, 22 trainees) participated. There was high test-retest reliability (intraclass correlation coefficient = 0.95; n = 10) and internal consistency (Cronbach's α = 0.87). The assessment demonstrated significant differences between novices, intermediates and experts in total score (p < 0.01) and VCT score (p < 0.01). There was high correlation between total case number and total score (ρ = 0.83, p < 0.01) and between total case number and VCT (ρ = 0.82, p < 0.01), and moderate to high correlations between total score and GOALS (ρ = 0.66, p = 0.05), VCT and GOALS (ρ = 0.83, p < 0.01), total score and OPRS (ρ = 0.67, p = 0.04), and VCT and OPRS (ρ = 0.78, p = 0.01). Most subjects agreed

Implicit attitudes towards homosexuality: reliability, validity, and controllability of the IAT.

PubMed

Banse, R; Seise, J; Zerbes, N

2001-01-01

Two experiments were conducted to investigate the psychometric properties of an Implicit Association Test (IAT; Greenwald, McGhee, & Schwartz, 1998) that was adapted to measure implicit attitudes towards homosexuality. In a first experiment, the validity of the Homosexuality-IAT was tested using a known group approach. Implicit and explicit attitudes were assessed in heterosexual and homosexual men and women (N = 101). The results provided compelling evidence for the convergent and discriminant validity of the Homosexuality-IAT as a measure of implicit attitudes. No evidence was found for two alternative explanations of IAT effects (familiarity with stimulus material and stereotype knowledge). The internal consistency of IAT scores was satisfactory (alphas > .80), but retest correlations were lower. In a second experiment (N = 79) it was shown that uninformed participants were able to fake positive explicit but not implicit attitudes. Discrepancies between implicit and explicit attitudes towards homosexuality could be partially accounted for by individual differences in the motivation to control prejudiced behavior, thus providing independent evidence for the validity of the implicit attitude measure. Neither explicit nor implicit attitudes could be changed by persuasive messages. The results of both experiments are interpreted as evidence for a single construct account of implicit and explicit attitudes towards homosexuality.
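Internal consistency of the kind reported in the record above is usually summarized with Cronbach's alpha. The following is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the respondents-by-items data layout are assumptions made for illustration and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    `scores` is a list of rows, one per respondent, each holding the same
    number of item scores (a hypothetical layout chosen for this sketch).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

Items that rise and fall together push alpha toward 1, while unrelated items push it toward 0. A threshold such as .80 is a convention, not a property of the formula.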
Assessment of family functioning in Caucasian and Hispanic Americans: reliability, validity, and factor structure of the Family Assessment Device.

PubMed

Aarons, Gregory A; McDonald, Elizabeth J; Connelly, Cynthia D; Newton, Rae R

2007-12-01

The purpose of this study was to examine the factor structure, reliability, and validity of the Family Assessment Device (FAD) among a national sample of Caucasian and Hispanic American families receiving public sector mental health services. A confirmatory factor analysis conducted to test model fit yielded equivocal findings. With few exceptions, indices of model fit, reliability, and validity were poorer for Hispanic Americans compared with Caucasian Americans. Contrary to our expectation, an exploratory factor analysis did not result in a better fitting model of family functioning. Without stronger evidence supporting a reformulation of the FAD, we recommend against such a course of action. Findings highlight the need for additional research on the role of culture in measurement of family functioning.
The Cerebral Palsy Quality of Life for Children (CP QOL-Child): Evidence of Construct Validity

ERIC Educational Resources Information Center

Chen, Kuan-Lin; Wang, Hui-Yi; Tseng, Mei-Hui; Shieh, Jeng-Yi; Lu, Lu; Yao, Kai-Ping Grace; Huang, Chien-Yu

2013-01-01

The Cerebral Palsy Quality of Life for Children (CP QOL-Child) is the first health condition-specific questionnaire designed for measuring QOL in children with cerebral palsy (CP). However, its construct validity has not yet been confirmed by confirmatory factor analysis (CFA). Hence, this study assessed the construct validity of the caregiver…
Developing an evidence-based practice protocol: implications for midwifery practice.

PubMed

Carr, K C

2000-01-01

Evidence-based practice is defined and its importance to midwifery practice is presented. Guidelines are provided for the development of an evidence-based practice protocol. These include: identifying the clinical question, obtaining the evidence, evaluating the validity and importance of the evidence, synthesizing the evidence and applying it to the development of a protocol or clinical algorithm, and, finally, developing an evaluation plan or measurement strategy to see if the new protocol is effective.
Portuguese version of a stress and well-being evaluation tool (ASSET) at the workplace: validation of the psychometric properties

PubMed Central

Moreira, Sérgio; Carreiras, Joana; Cooper, Cary; Smeed, Matthew; Reis, Maria de Fátima; Pereira Miguel, José

2018-01-01

Objective: The main objective of this work was to translate the English version of ASSET (A Shortened Stress Evaluation Tool) into Portuguese and to validate its psychometric properties. Additionally, this work tested the convergent validity of the instrument. Methods: The translation and retroversion were conducted by experts and submitted to the authors for approval. Within an observational, cross-sectional study of mental health at the workplace, ASSET together with other scales was applied to a sample of 405 participants. The psychometric validity of the subscales was studied using confirmatory factor analysis. Results: The factorial structure of ASSET is globally supported by the results, with the Perceptions of Your Job and Attitudes Towards your Organisation subscales requiring slight adjustments in the item structure and the Your Health subscales replicating the original structure. The convergent validity also supports the ASSET, showing that all subscales are significantly correlated with variables used to test convergence. Conclusions: Globally, the results constitute an important contribution to ASSET and open the possibility of its usage among Portuguese-speaking countries. The results provide evidence for the validity of the instrument and, in particular, of the mental and physical health subscales. PMID:29440211
The Riso-Hudson Enneagram Type Indicator: Estimates of Reliability and Validity

ERIC Educational Resources Information Center

Newgent, Rebecca A.; Parr, Patricia E.; Newman, Isadore; Higgins, Kristin K.

2004-01-01

This investigation was conducted to estimate the reliability and validity of scores on the Riso-Hudson Enneagram Type Indicator (D. R. Riso & R. Hudson, 1999a). Results of 287 participants were analyzed. Alpha suggests an adequate degree of internal consistency. Evidence provides mixed support for construct validity using correlational and…
Reliability and validity in a nutshell.

PubMed

Bannigan, Katrina; Watson, Roger

2009-12-01

To explore and explain the different concepts of reliability and validity as they are related to measurement instruments in social science and health care. There are different concepts contained in the terms reliability and validity and these are often explained poorly and there is often confusion between them. To develop some clarity about reliability and validity a conceptual framework was built based on the existing literature. The concepts of reliability, validity and utility are explored and explained. Reliability contains the concepts of internal consistency and stability and equivalence. Validity contains the concepts of content, face, criterion, concurrent, predictive, construct, convergent (and divergent), factorial and discriminant. In addition, for clinical practice and research, it is essential to establish the utility of a measurement instrument. To use measurement instruments appropriately in clinical practice, the extent to which they are reliable, valid and usable must be established.
Constructing a Validity Argument for the Objective Structured Assessment of Technical Skills (OSATS): A Systematic Review of Validity Evidence

ERIC Educational Resources Information Center

Hatala, Rose; Cook, David A.; Brydges, Ryan; Hawkins, Richard

2015-01-01

In order to construct and evaluate the validity argument for the Objective Structured Assessment of Technical Skills (OSATS), based on Kane's framework, we conducted a systematic review. We searched MEDLINE, EMBASE, CINAHL, PsycINFO, ERIC, Web of Science, Scopus, and selected reference lists through February 2013. Working in duplicate, we selected…
The preliminary analysis of the reliability and validity of the Chinese Edition of the CSBS DP.

PubMed

Lin, Chu-Sui; Chang, Shu-Hui; Cheng, Shu-Fen; Chao, Pen-Chiang; Chiu, Chun-Hao

2015-03-01

This study marked a preliminary attempt to standardize the Chinese Edition of the Communication and Symbolic Behavior Scales Developmental Profile (Wetherby & Prizant, 2002; CSBS DP) to assist in the early identification of young children with special needs in Taiwan. The study was conducted among 171 infants and toddlers aged 1-2. It also included a follow-up study one year after the initial test. Three domestically developed standardized child development inventories were used to measure the concurrent validity and predictive validity. The Chinese Edition of the CSBS DP demonstrated overall good test-retest and inter-rater reliability. It also showed good concurrent and predictive validity. The current study yields preliminary evidence that the Chinese Edition of the CSBS DP could be a valuable assessment tool worthy of wider distribution. Future research should employ random sampling to establish a true national norm. Additionally, the follow-up study needs to include atypical groups and to expand to children aged 6-12 months to strengthen the applicability of the instrument in Taiwan. Copyright © 2014 Elsevier Ltd. All rights reserved.
Reliability and validity of non-radiographic methods of thoracic kyphosis measurement: a systematic review.

PubMed

Barrett, Eva; McCreesh, Karen; Lewis, Jeremy

2014-02-01

A wide array of instruments are available for non-invasive thoracic kyphosis measurement. Guidelines for selecting outcome measures for use in clinical and research practice recommend that properties such as validity and reliability are considered. This systematic review reports on the reliability and validity of non-invasive methods for measuring thoracic kyphosis. A systematic search of 11 electronic databases located studies assessing reliability and/or validity of non-invasive thoracic kyphosis measurement techniques. Two independent reviewers used a critical appraisal tool to assess the quality of retrieved studies. Data was extracted by the primary reviewer. The results were synthesized qualitatively using a level of evidence approach. 27 studies satisfied the eligibility criteria and were included in the review. Sixteen studies investigated reliability only, two investigated validity only, and nine investigated both. 17/27 studies were deemed to be of high quality. In total, 15 methods of thoracic kyphosis measurement were evaluated in retrieved studies. All investigated methods showed high (ICC ≥ .7) to very high (ICC ≥ .9) levels of reliability. The validity of the methods ranged from low to very high. The strongest levels of evidence for reliability exist in support of the Debrunner kyphometer, Spinal Mouse and Flexicurve index, and the strongest levels of evidence for validity support the arcometer and Flexicurve index. Further reliability and validity studies are required to strengthen the level of evidence for the remaining methods of measurement. This should be addressed by future research. Copyright © 2013 Elsevier Ltd. All rights reserved.
EOS Terra Validation Program

NASA Technical Reports Server (NTRS)

Starr, David

2000-01-01

The EOS Terra mission will be launched in July 1999. This mission has great relevance to the atmospheric radiation community and global change issues. Terra instruments include Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), Clouds and Earth's Radiant Energy System (CERES), Multi-Angle Imaging Spectroradiometer (MISR), Moderate Resolution Imaging Spectroradiometer (MODIS) and Measurements of Pollution in the Troposphere (MOPITT). In addition to the fundamental radiance data sets, numerous global science data products will be generated, including various Earth radiation budget, cloud and aerosol parameters, as well as land surface, terrestrial ecology, ocean color, and atmospheric chemistry parameters. Significant investments have been made in on-board calibration to ensure the quality of the radiance observations. A key component of the Terra mission is the validation of the science data products. This is essential for a mission focused on global change issues and the underlying processes. The Terra algorithms have been subject to extensive pre-launch testing with field data whenever possible. Intensive efforts will be made to validate the Terra data products after launch. These include validation of instrument calibration (vicarious calibration) experiments, instrument and cross-platform comparisons, routine collection of high quality correlative data from ground-based networks, such as AERONET, and intensive sites, such as the SGP ARM site, as well as a variety of field experiments, cruises, etc. Airborne simulator instruments have been developed for the field experiment and underflight activities, including the MODIS Airborne Simulator (MAS), AirMISR, MASTER (MODIS-ASTER), and MOPITT-A. All are integrated on the NASA ER-2, though low altitude platforms are more typically used for MASTER. MATR is an additional sensor used for MOPITT algorithm development and validation. The intensive validation activities planned for the first year of the Terra
Reliability and validity of the Outcome Expectations for Exercise Scale-2.

PubMed

Resnick, Barbara

2005-10-01

Development of a reliable and valid measure of outcome expectations for exercise for older adults will help establish the relationship between outcome expectations and exercise and facilitate the development of interventions to increase physical activity in older adults. The purpose of this study was to test the reliability and validity of the Outcome Expectations for Exercise-2 Scale (OEE-2), a 13-item measure with two subscales: positive OEE (POEE) and negative OEE (NOEE). The OEE-2 scale was given to 161 residents in a continuing-care retirement community. There was some evidence of validity based on confirmatory factor analysis, Rasch-analysis INFIT and OUTFIT statistics, and convergent validity and test criterion relationships. There was some evidence for reliability of the OEE-2 based on alpha coefficients, person- and item-separation reliability indexes, and R² values. Based on analyses, suggested revisions are provided for future use of the OEE-2. Although ongoing reliability and validity testing are needed, the OEE-2 scale can be used to identify older adults with low outcome expectations for exercise, and interventions can then be implemented to strengthen these expectations and improve exercise behavior.
Initial evidence for the validity of the California Bullying Victimization Scale (CBVS-R) as a retrospective measure for adults.

PubMed

Green, Jennifer Greif; Oblath, Rachel; Felix, Erika D; Furlong, Michael J; Holt, Melissa K; Sharkey, Jill D

2018-06-07

Childhood bullying is an important predictor of psychological and health outcomes in adulthood; however, validated retrospective measures of childhood bullying are lacking. This study investigates the psychometric properties of an adult retrospective version of the California Bullying Victimization Scale (CBVS). The CBVS self-report measure was developed for use with children and adolescents to assess the three definitional characteristics of bullying (aggression that is chronic, intentional, and involves an imbalance of power), without using the term "bullying." In the current study, we evaluate patterns of retrospective reports of bullying victimization, and compare results to a common definition-first measure of bullying. Concurrent validity and 4-year stability are addressed. In the fall of 2012, entering first-year students at 4 universities in the United States (N = 1,209; 65.2% female) completed the California Bullying Victimization Scale-Retrospective (CBVS-R) as part of an online survey. In spring of 2016, participants at 2 universities who provided contact information (N = 175) completed a 4-year follow-up survey. Results support the validity of the CBVS-R as a retrospective self-report measure of bullying victimization experienced in childhood. In particular, the percent of respondents classified as being bullied (27.9%) and age- and gender-related patterns of victimization were consistent with known patterns of childhood bullying. In addition, respondents reporting childhood victimization indicated increased psychological distress in adulthood. However, stability of reports across a 4-year follow-up period was lower than expected (κ = .38). Implications for the use of retrospective reports of childhood bullying victimization are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
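The stability statistic in the record above (κ) is Cohen's kappa, which corrects raw agreement between two classifications for the agreement expected by chance. A minimal sketch follows; the function name and the idea of passing baseline and follow-up classifications as two parallel lists are assumptions made for illustration, not the study's procedure.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two parallel lists of category labels.

    `first` and `second` hold one label per respondent, for example a
    baseline and a follow-up classification (hypothetical inputs).
    """
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(first)

    # Proportion of respondents classified identically on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n

    # Agreement expected by chance from the marginal category frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(counts_first) | set(counts_second)
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (observed - expected) / (1 - expected)
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so a value near .38 can coexist with a fairly high raw agreement rate when one category dominates.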
Standards Performance Continuum: Development and Validation of a Measure of Effective Pedagogy.

ERIC Educational Resources Information Center

Doherty, R. William; Hilberg, R. Soleste; Epaloose, Georgia; Tharp, Roland G.

2002-01-01

Describes the development and validation of the Standards Performance Continuum (SPC) for assessing teacher performance of the Standards for Effective Pedagogy. Three studies involving Florida, California, and New Mexico public school teachers provided evidence of inter-rater reliability, concurrent validity, and criterion-related validity…
Validity Evidence in Accommodations for English Language Learners and Students with Disabilities

ERIC Educational Resources Information Center

Camara, Wayne

2009-01-01

The five papers in this special issue of the "Journal of Applied Testing Technology" address fundamental issues of validity when tests are modified or accommodations are provided to English Language Learners (ELL) or students with disabilities. Three papers employed differential item functioning (DIF) and factor analysis and found the…
A Validation and Reliability Study of the Physical Activity and Healthy Food Efficacy Scale for Children (PAHFE)

ERIC Educational Resources Information Center

Perry, Christina M.; De Ayala, R. J.; Lebow, Ryan; Hayden, Emily

2008-01-01

The purpose of this study was to obtain validity evidence for the Physical Activity and Healthy Food Efficacy Scale for Children (PAHFE). Construct validity evidence identifies four subscales: Goal-Setting for Physical Activity, Goal-Setting for Healthy Food Choices, Decision-Making for Physical Activity, and Decision-Making for Healthy Food…
Validity of the SAT® for Predicting First-Year Grades: 2011 SAT Validity Sample. Statistical Report 2013-3

ERIC Educational Resources Information Center

Patterson, Brian F.; Mattern, Krista D.

2013-01-01

The continued accumulation of validity evidence for the intended uses of educational assessments is critical to ensure that proper inferences will be made for those purposes. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides…
Validity of the SAT® for Predicting First-Year Grades: 2012 SAT Validity Sample. Statistical Report 2015-2

ERIC Educational Resources Information Center

Beard, Jonathan; Marini, Jessica P.

2015-01-01

The continued accumulation of validity evidence for the intended uses of educational assessment scores is critical to ensure that inferences made using the scores are sound. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides updated…
Hopes and Cautions for Instrument-Based Evaluation of Consent Capacity: Results of a Construct Validity Study of Three Instruments

PubMed Central

Moye, Jennifer; Azar, Annin R.; Karel, Michele J.; Gurrera, Ronald J.

2016-01-01

Does instrument based evaluation of consent capacity increase the precision and validity of competency assessment or does ostensible precision provide a false sense of confidence without in fact improving validity? In this paper we critically examine the evidence for construct validity of three instruments for measuring four functional abilities important in consent capacity: understanding, appreciation, reasoning, and expressing a choice. Instrument based assessment of these abilities is compared through investigation of a multi-trait multi-method matrix in 88 older adults with mild to moderate dementia. Results find variable support for validity. There appears to be strong evidence for good hetero-method validity for the measurement of understanding, mixed evidence for validity in the measurement of reasoning, and strong evidence for poor hetero-method validity for the concepts of appreciation and expressing a choice, although the latter is likely due to extreme range restrictions. The development of empirically based tools for use in capacity evaluation should ultimately enhance the reliability and validity of assessment, yet clearly more research is needed to define and measure the constructs of decisional capacity. We would also emphasize that instrument based assessment of capacity is only one part of a comprehensive evaluation of competency which includes consideration of diagnosis, psychiatric and/or cognitive symptomatology, risk involved in the situation, and individual and cultural differences. PMID:27330455
Calculating the weight of evidence in low-template forensic DNA casework.

PubMed

Lohmueller, Kirk E; Rudin, Norah

2013-01-01

Interpreting and assessing the weight of low-template DNA evidence presents a formidable challenge in forensic casework. This report describes a case in which a similar mixed DNA profile was obtained from four different bloodstains. The defense proposed that the low-level minor profile came from an alternate suspect, the defendant's mistress. The strength of the evidence was assessed using a probabilistic approach that employed likelihood ratios incorporating the probability of allelic drop-out. Logistic regression was used to model the probability of drop-out using empirical validation data from the government laboratory. The DNA profile obtained from the bloodstain described in this report is at least 47 billion times more likely if, in addition to the victim, the alternate suspect was the minor contributor, than if another unrelated individual was the minor contributor. This case illustrates the utility of the probabilistic approach for interpreting complex low-template DNA profiles. © 2012 American Academy of Forensic Sciences.
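The two ingredients named in the record above, a logistic model for the probability of allelic drop-out and a likelihood ratio that incorporates it, can be illustrated with a deliberately simplified single-locus sketch. This is not the authors' casework model: it ignores drop-in and mixtures, and every name and parameter here (`dropout_probability`, `single_locus_lr`, the intercept and slope, the allele frequency, and the separate homozygote drop-out probability) is hypothetical.

```python
import math


def dropout_probability(peak_height, intercept, slope):
    """Logistic model of drop-out probability against log peak height.

    The coefficients are placeholders; in practice they would be fitted
    to a laboratory's own validation data.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * math.log(peak_height))))


def single_locus_lr(p_a, d_het, d_hom):
    """Toy likelihood ratio when only allele A is detected at one locus.

    Hp: the person of interest, genotype A,B, is the contributor.
    Hd: an unrelated unknown person is the contributor.
    p_a   -- population frequency of allele A (assumed known)
    d_het -- drop-out probability for one allele of a heterozygote
    d_hom -- drop-out probability for a homozygote (assumed input)
    """
    # Under Hp, A is detected and B has dropped out.
    p_evidence_hp = (1 - d_het) * d_het
    # Under Hd, the unknown is either A,A with no drop-out, or A plus
    # some other allele that has dropped out.
    p_evidence_hd = (
        p_a**2 * (1 - d_hom)
        + 2 * p_a * (1 - p_a) * (1 - d_het) * d_het
    )
    return p_evidence_hp / p_evidence_hd
```

In this toy setting a likelihood ratio above 1 favors Hp and below 1 favors Hd. A casework calculation would combine many loci and account for mixtures, which the sketch deliberately leaves out.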

Toward Evidence-Informed Policy and Practice in Child Welfare

ERIC Educational Resources Information Center

Littell, Julia H.; Shlonsky, Aron

2010-01-01

Drawing on the authors' experience in the international Campbell Collaboration, this essay presents a principled and pragmatic approach to evidence-informed decisions about child welfare. This approach takes into account the growing body of empirical evidence on the reliability and validity of various methods of research synthesis. It also…
Independent validation of the MMPI-2-RF Somatic/Cognitive and Validity scales in TBI Litigants tested for effort.

PubMed

Youngjohn, James R; Wershba, Rebecca; Stevenson, Matthew; Sturgeon, John; Thomas, Michael L

2011-04-01

The MMPI-2 Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008) is replacing the MMPI-2 as the most widely used personality test in neuropsychological assessment, but additional validation studies are needed. Our study examines MMPI-2-RF Validity scales and the newly created Somatic/Cognitive scales in a recently reported sample of 82 traumatic brain injury (TBI) litigants who either passed or failed effort tests (Thomas & Youngjohn, 2009). The restructured Validity scales FBS-r (restructured symptom validity), F-r (restructured infrequent responses), and the newly created Fs (infrequent somatic responses) were not significant predictors of TBI severity. FBS-r was significantly related to passing or failing effort tests, and Fs and F-r showed non-significant trends in the same direction. Elevations on the Somatic/Cognitive scales profile (MLS-malaise, GIC-gastrointestinal complaints, HPC-head pain complaints, NUC-neurological complaints, and COG-cognitive complaints) were significant predictors of effort test failure. Additionally, HPC had the anticipated paradoxical inverse relationship with head injury severity. The Somatic/Cognitive scales as a group were better predictors of effort test failure than the RF Validity scales, which was an unexpected finding. MLS arose as the single best predictor of effort test failure of all RF Validity and Somatic/Cognitive scales. Item overlap analysis revealed that all MLS items are included in the original MMPI-2 Hy scale, making MLS essentially a subscale of Hy. This study validates the MMPI-2-RF as an effective tool for use in neuropsychological assessment of TBI litigants.
Reliability and Validity of Ambulatory Cognitive Assessments

PubMed Central

Sliwinski, Martin J.; Mogle, Jacqueline A.; Hyun, Jinshil; Munoz, Elizabeth; Smyth, Joshua M.; Lipton, Richard B.

2017-01-01

Mobile technologies are increasingly used to measure cognitive function outside of traditional clinic and laboratory settings. Although ambulatory assessments of cognitive function conducted in people’s natural environments offer potential advantages over traditional assessment approaches, the psychometrics of cognitive assessment procedures have been understudied. We evaluated the reliability and construct validity of ambulatory assessments of working memory and perceptual speed administered via smartphones as part of an ecological momentary assessment (EMA) protocol in a diverse adult sample (N=219). Results indicated excellent between-person reliability (≥.97) for average scores, and evidence of reliable within-person variability across measurement occasions (.41–.53). The ambulatory tasks also exhibited construct validity, as evidenced by their loadings on working memory and perceptual speed factors defined by the in-lab assessments. Our findings demonstrate that averaging across brief cognitive assessments made in uncontrolled naturalistic settings provides measurements that are comparable in reliability to assessments made in controlled laboratory environments. PMID:27084835
Clonazepam responsive opsoclonus myoclonus syndrome: additional evidence in favour of fastigial nucleus disinhibition hypothesis?

PubMed

Paliwal, Vimal Kumar; Chandra, Satish; Verma, Ritu; Kalita, Jayantee; Misra, Usha K

2010-05-01

Opsoclonus myoclonus syndrome is a rare paraneoplastic syndrome seen in 50% of children with neuroblastoma. The neural generator of opsoclonus and myoclonus is not known, but evidence suggests a role for fastigial nucleus disinhibition resulting from the loss of function of inhibitory (GABAergic) Purkinje cells in the cerebellum. We present a child with paraneoplastic opsoclonus myoclonus syndrome who responded well to clonazepam. The response to clonazepam is evidence for the involvement of GABAergic neural circuits in the genesis of opsoclonus myoclonus syndrome and is in agreement with the fastigial nucleus disinhibition hypothesis.
Validity Evidence for the Organizational Commitment Questionnaire in the Japanese Corporate Culture.

ERIC Educational Resources Information Center

White, Marion M.; And Others

1995-01-01

The validity of the Organizational Commitment Questionnaire as a measure of organizational commitment in the Japanese culture was studied with 1,481 Japanese employees. The three-factor model was a better fit to the data than the one- or two-factor models. Results support the cross-cultural utility of the measure. (SLD)
Psychometric and cognitive validation of a social capital measurement tool in Peru and Vietnam.

PubMed

De Silva, Mary J; Harpham, Trudy; Tuan, Tran; Bartolini, Rosario; Penny, Mary E; Huttly, Sharon R

2006-02-01

Social capital is a relatively new concept which has attracted significant attention in recent years. No consensus has yet been reached on how to measure social capital, resulting in a large number of different tools available. While psychometric validation methods such as factor analysis have been used by a few studies to assess the internal validity of some tools, these techniques rely on data already collected by the tool and are therefore not capable of eliciting what the questions are actually measuring. The Young Lives (YL) study includes quantitative measures of caregiver's social capital in four countries (Vietnam, Peru, Ethiopia, and India) using a short version of the Adapted Social Capital Assessment Tool (SASCAT). A range of different psychometric methods including factor analysis were used to evaluate the construct validity of SASCAT in Peru and Vietnam. In addition, qualitative cognitive interviews with 20 respondents from Peru and 24 respondents from Vietnam were conducted to explore what each question is actually measuring. We argue that psychometric validation techniques alone are not sufficient to adequately validate multi-faceted social capital tools for use in different cultural settings. Psychometric techniques show SASCAT to be a valid tool reflecting known constructs and displaying postulated links with other variables. However, results from the cognitive interviews present a more mixed picture with some questions being appropriately interpreted by respondents, and others displaying significant differences between what the researchers intended them to measure and what they actually do. Using evidence from a range of methods of assessing validity has enabled the modification of an existing instrument into a valid and low cost tool designed to measure social capital within larger surveys in Peru and Vietnam, with the potential for use in other developing countries following local piloting and cultural adaptation of the tool.
Development and Validation of Chemistry Self-Efficacy Scale for College Students

ERIC Educational Resources Information Center

Uzuntiryaki, Esen; Aydin, Yesim Capa

2009-01-01

This study described the process of developing and validating the College Chemistry Self-Efficacy Scale (CCSS) that can be used to assess college students' beliefs in their ability to perform essential tasks in chemistry. In the first phase, data collected from 363 college students provided evidence for the validity and reliability of the new…
Contributions of Middle Grade Students to the Validation Process of a National Science Assessment Study

ERIC Educational Resources Information Center

Morell, Linda

2008-01-01

This study used a national validity project to investigate specific research questions regarding the intersections among aspects of validity, educational measurement, and cognitive theory. Validity evidence was collected through traditional paper and pencil tests, surveys, think-alouds, and exit interviews of fifth and sixth grade students, as…
Validity of the Thai EQ-5D in an occupational population in Thailand.

PubMed

Kimman, Merel; Vathesatogkit, Prin; Woodward, Mark; Tai, E-Shyong; Thumboo, Julian; Yamwong, Sukit; Ratanachaiwong, Wipa; Wee, Hwee-Lin; Sritara, Piyamitr

2013-08-01

To assess the construct validity of the Thai EuroQoL (EQ-5D) among an occupational population in Thailand. Data were derived from a large cohort study among employees of the Electricity Generating Authority of Thailand. In 2008 and 2009, 4,850 participants completed the Thai EQ-5D and Short-Form 36 version 2 (SF-36v2). Thai preference weights were used to convert EQ-5D health states into EQ-5D index scores. Construct validity of the Thai EQ-5D was examined by specifying and testing hypotheses about the relationships between the EQ-5D, SF-36v2, and participants' demographic and medical characteristics. Construct validity of the Thai EQ-5D was supported by expected relationships with SF-36v2 scale and summary scores. For example, SF-36v2 scores on the mental health scale were much lower for participants who reported having problems on the EQ-5D anxiety/depression dimension compared to those reporting no problems (mean norm-based SF-36v2 scores: 52.9 vs. 41.8, p < 0.001). Additionally, reporting a problem in a given EQ-5D dimension was generally associated with lower SF-36v2 summary scores. The EQ-5D index score distinguished between groups of participants in the expected manner, on the basis of sex, age, education and self-reported health, thus providing evidence of known-groups validity. The study demonstrated good construct validity of the Thai EQ-5D in a large occupational population in Thailand.
Consequences of non-random species loss for decomposition dynamics: Experimental evidence for additive and non-additive effects

Treesearch

Becky A. Ball; Mark D. Hunter; John S. Kominoski; Christopher M. Swan; Mark A. Bradford

2008-01-01

Although litter decomposition is a fundamental ecological process, most of our understanding comes from studies of single-species decay. Recently, litter-mixing studies have tested whether monoculture data can be applied to mixed-litter systems. These studies have mainly attempted to detect non-additive effects of litter mixing, which address potential consequences of...
Rational selection of training and test sets for the development of validated QSAR models

NASA Astrophysics Data System (ADS)

Golbraikh, Alexander; Shen, Min; Xiao, Zhiyan; Xiao, Yun-De; Lee, Kuo-Hsiung; Tropsha, Alexander

2003-02-01

Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using the k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of predictive power of QSAR models.
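The distinction drawn in this abstract between the LOO cross-validated q2 (computed inside the training set) and R2 on an external test set can be illustrated with a minimal sketch. This is not the authors' kNN QSAR method: a one-descriptor least-squares line stands in for the model, the function names are hypothetical, the data are invented, and R2 is taken here as 1 - SSE/SS_total on the test set, which is only one of several definitions in use.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def loo_q2(xs, ys):
    """Leave-one-out cross-validated R2: q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict the held-out compound.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    my = mean(ys)
    return 1.0 - press / sum((y - my) ** 2 for y in ys)

def external_r2(train_x, train_y, test_x, test_y):
    """Fit on the training set only, then score on the external test set."""
    a, b = fit_line(train_x, train_y)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(test_x, test_y))
    my = mean(test_y)
    return 1.0 - sse / sum((y - my) ** 2 for y in test_y)

if __name__ == "__main__":
    # Invented descriptor/activity values, for illustration only.
    train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    train_y = [1.1, 2.3, 2.8, 4.2, 4.9, 6.1]
    test_x = [7.0, 8.0, 9.0]
    test_y = [6.0, 8.9, 8.1]
    print("q2 (LOO, training set):", loo_q2(train_x, train_y))
    print("R2 (external test set):", external_r2(train_x, train_y, test_x, test_y))
```

Because the two statistics are computed on different compounds, a high q2 does not by itself guarantee a high external R2, which is the point the abstract argues.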
Validation of high throughput sequencing and microbial forensics applications

PubMed Central

2014-01-01

High throughput sequencing (HTS) generates large amounts of high quality sequence data for microbial genomics. The value of HTS for microbial forensics is the speed at which evidence can be collected and the power to characterize microbial-related evidence to solve biocrimes and bioterrorist events. As HTS technologies continue to improve, they provide increasingly powerful sets of tools to support the entire field of microbial forensics. Accurate, credible results allow analysis and interpretation, significantly influencing the course and/or focus of an investigation, and can impact the response of the government to an attack having individual, political, economic or military consequences. Interpretation of the results of microbial forensic analyses relies on understanding the performance and limitations of HTS methods, including analytical processes, assays and data interpretation. The utility of HTS must be defined carefully within established operating conditions and tolerances. Validation is essential in the development and implementation of microbial forensics methods used for formulating investigative leads attribution. HTS strategies vary, requiring guiding principles for HTS system validation. Three initial aspects of HTS, irrespective of chemistry, instrumentation or software are: 1) sample preparation, 2) sequencing, and 3) data analysis. Criteria that should be considered for HTS validation for microbial forensics are presented here. Validation should be defined in terms of specific application and the criteria described here comprise a foundation for investigators to establish, validate and implement HTS as a tool in microbial forensics, enhancing public safety and national security. PMID:25101166
Validation of high throughput sequencing and microbial forensics applications.

PubMed

Budowle, Bruce; Connell, Nancy D; Bielecka-Oder, Anna; Colwell, Rita R; Corbett, Cindi R; Fletcher, Jacqueline; Forsman, Mats; Kadavy, Dana R; Markotic, Alemka; Morse, Stephen A; Murch, Randall S; Sajantila, Antti; Schmedes, Sarah E; Ternus, Krista L; Turner, Stephen D; Minot, Samuel

2014-01-01

High throughput sequencing (HTS) generates large amounts of high quality sequence data for microbial genomics. The value of HTS for microbial forensics is the speed at which evidence can be collected and the power to characterize microbial-related evidence to solve biocrimes and bioterrorist events. As HTS technologies continue to improve, they provide increasingly powerful sets of tools to support the entire field of microbial forensics. Accurate, credible results allow analysis and interpretation, significantly influencing the course and/or focus of an investigation, and can impact the response of the government to an attack having individual, political, economic or military consequences. Interpretation of the results of microbial forensic analyses relies on understanding the performance and limitations of HTS methods, including analytical processes, assays and data interpretation. The utility of HTS must be defined carefully within established operating conditions and tolerances. Validation is essential in the development and implementation of microbial forensics methods used for formulating investigative leads attribution. HTS strategies vary, requiring guiding principles for HTS system validation. Three initial aspects of HTS, irrespective of chemistry, instrumentation or software are: 1) sample preparation, 2) sequencing, and 3) data analysis. Criteria that should be considered for HTS validation for microbial forensics are presented here. Validation should be defined in terms of specific application and the criteria described here comprise a foundation for investigators to establish, validate and implement HTS as a tool in microbial forensics, enhancing public safety and national security.
An Argument Approach to Observation Protocol Validity

ERIC Educational Resources Information Center

Bell, Courtney A.; Gitomer, Drew H.; McCaffrey, Daniel F.; Hamre, Bridget K.; Pianta, Robert C.; Qi, Yi

2012-01-01

This article develops a validity argument approach for use on observation protocols currently used to assess teacher quality for high-stakes personnel and professional development decisions. After defining the teaching quality domain, we articulate an interpretive argument for observation protocols. To illustrate the types of evidence that might…
Additives to local anesthetics for peripheral nerve blocks: Evidence, limitations, and recommendations.

PubMed

Bailard, Neil S; Ortiz, Jaime; Flores, Roland A

2014-03-01

The therapeutic rationale, clinical effectiveness, and potential adverse effects of medications used in combination with local anesthetics for peripheral nerve block therapy are reviewed. A wide range of agents have been tested as adjuncts to peripheral nerve blocks, which are commonly performed for regional anesthesia during or after hand or arm surgery, neck or spine surgery, and other procedures. Studies to determine the comparative merits of nerve block adjuncts are complicated by the wide variety of coadministered local anesthetics and sites of administration and by the heterogeneity of primary endpoints. Sodium bicarbonate has been shown to speed the onset of mepivacaine nerve blocks but delay the onset of others. Epinephrine has been shown to prolong sensory nerve blockade and delay systemic uptake of local anesthetics, thus reducing the risk of anesthetic toxicity. Tramadol, buprenorphine, dexamethasone, and clonidine appear to be effective additives in some situations. Midazolam, magnesium, dexmedetomidine, and ketamine cannot be routinely recommended as nerve block additives due to a dearth of supportive data, modest efficacy, and (in the case of ketamine) significant adverse effects. Recent studies suggest that administering additives intravenously or intramuscularly can provide many of the benefits of perineural administration while reducing the potential for neurotoxicity, contamination, and other hazards. Some additives to local anesthetics can hasten the onset of nerve block, prolong block duration, or reduce toxicity. On the other hand, poorly selected or unnecessary additives may not have the desired effect and may even expose patients to unnecessary risks.
Absorption in Sport: A Cross-Validation Study

PubMed Central

Koehn, Stefan; Stavrou, Nektarios A. M.; Cogley, Jeremy; Morris, Tony; Mosek, Erez; Watt, Anthony P.

2017-01-01

Absorption has been identified as readiness for experiences of deep involvement in the task. Conceptually, absorption is a key psychological construct, incorporating experiential, cognitive, and motivational components. Although no operationalization of the construct has been provided to facilitate research in this area, the purpose of this research was the development and examination of the psychometric properties of a sport-specific measure of absorption that evolved from the use of the modified Tellegen Absorption Scale (MODTAS; Jamieson, 2005) in mainstream psychology. The study aimed to provide evidence of the psychometric properties, reliability, and validity of the Measure of Absorption in Sport Contexts (MASC). The psychometric examination included a calibration sample from Scotland and a cross-validation sample from Australia using a cross-sectional design. The item pool was developed based on existing items from the modified Tellegen Absorption Scale (Jamieson, 2005). The MODTAS items were reworded and translated into a sport context. The Scottish sample consisted of 292 participants and the Australian sample of 314 participants. Congeneric model testing and confirmatory factor analysis for both samples and multi-group invariance testing across samples were used. In the cross-validation sample the MASC subscales showed acceptable internal consistency and construct reliability (≥0.70). Excellent fit indices were found for the final 18-item, six-factor measure in the cross-validation sample, χ²(120) = 197.486, p < 0.001; CFI = 0.957; TLI = 0.945; RMSEA = 0.045; SRMR = 0.044. Multi-group invariance testing revealed no differences in item meaning, except for two items. The MASC and the Dispositional Flow Scale-2 showed moderate-to-strong positive correlations in both samples, r = 0.38, p < 0.001 and r = 0.42, p < 0.001, supporting the external validity of the MASC. This article provides initial evidence in support of the psychometric properties
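As a rough check on how the fit indices in this abstract relate to one another, the RMSEA can be recomputed from the reported chi-square, its degrees of freedom, and the cross-validation sample size. The sketch below assumes the common point-estimate formula sqrt(max(chi2 - df, 0) / (df * (N - 1))); some software divides by N instead of N - 1, and the function name is hypothetical.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate; uses the N - 1 convention (an assumption)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

if __name__ == "__main__":
    # Values reported for the Australian cross-validation sample (N = 314).
    print(round(rmsea(197.486, 120, 314), 3))  # prints 0.045
```

Under this convention the recomputed value agrees with the RMSEA of 0.045 reported in the abstract.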
Adaptation and Validation of the Brazilian Version of the Hope Index

ERIC Educational Resources Information Center

Pacico, Juliana Cerentini; Zanon, Cristian; Bastianello, Micheline Roat; Reppold, Caroline Tozzi; Hutz, Claudio Simon

2013-01-01

The objective of this study was to adapt and gather validity evidence for a Brazilian sample version of the Hope Index and to verify if cultural differences would produce different results than those found in the United States. In this study, we present a set of analyses that together comprise a comprehensive validity argument for the use of a…
Assessing impact of physical activity-based youth development programs: validation of the Life Skills Transfer Survey (LSTS).

PubMed

Weiss, Maureen R; Bolter, Nicole D; Kipp, Lindsay E

2014-09-01

A signature characteristic of positive youth development (PYD) programs is the opportunity to develop life skills, such as social, behavioral, and moral competencies, that can be generalized to domains beyond the immediate activity. Although context-specific instruments are available to assess developmental outcomes, a measure of life skills transfer would enable evaluation of PYD programs in successfully teaching skills that youth report using in other domains. The purpose of our studies was to develop and validate a measure of perceived life skills transfer, based on data collected with The First Tee, a physical activity-based PYD program. In 3 studies, we conducted a series of steps to provide content and construct validity and internal consistency reliability for the Life Skills Transfer Survey (LSTS), a measure of perceived life skills transfer. Study 1 provided content validity for the LSTS that included 8 life skills and 50 items. Study 2 revealed construct validity (structural validity) through a confirmatory factor analysis and convergent validity by correlating scores on the LSTS with scores on an assessment tool that measures a related construct. Study 3 offered additional construct validity by reassessing youth 1 year later and showing that scores during both time periods were invariant in factor pattern, loadings, and variances and covariances. Studies 2 and 3 demonstrated internal consistency reliability of the LSTS. Results from the 3 studies provide evidence of content and construct validity and internal consistency reliability for the LSTS, which can be used in evaluation research with youth development programs.
Validity evidence, sensibility and specificity of the severity dimension of the SDSS alcohol dependence scale.

PubMed

Vélez-Moreno, Antonio; Lozano, Óscar M; Fernández-Calderón, Fermín; Rojas, Antonio J; Sayans-Jiménez, Pablo; González-Saiz, Francisco; Ramírez López, Juan

2015-01-01

Therapeutic success in the treatment of alcohol use disorders depends heavily on an appropriate diagnosis. The Substance Dependence Severity Scale (SDSS) is a scale that assesses substance dependence in dimensional terms and that follows the diagnostic criteria established by the international classification systems. The aim of this study is to provide validity evidence for the severity dimension of the alcohol dependence scale of the SDSS by comparing it with the Mini International Neuropsychiatric Interview (MINI) and other variables related to substance use included in the EuropASI. A total of 109 patients admitted for treatment in the Drug Abuse Center Services of Huelva who had used alcohol in the month previous to the interview participated. The SDSS, MINI and EuropASI were administered. The diagnostic capacity of the SDSS was assessed by Receiver Operating Characteristic (ROC) curve analysis, taking the MINI dependence diagnosis as standard. The area under the ROC curve (AUC) was 0.917 (CI=0.867-0.968). The best trade-off between sensitivity and specificity was found at a score of 9 (83.58% and 83.72%, respectively). The results support the use of the SDSS for the diagnosis of alcohol dependence and for assessing the severity of dependence. Administration of this scale makes it possible to obtain information, with a single score, on how severe the disorder is and whether the dependence criteria have been met.
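The quantities reported in this abstract (sensitivity and specificity at a cutoff, and the area under the ROC curve) can be sketched as follows. The scores and diagnoses below are invented, not study data; the function names are hypothetical; and the sketch assumes that scores at or above the cutoff count as test-positive, which the abstract does not state. The AUC is computed as the probability that a randomly chosen positive case outscores a randomly chosen negative case, with ties counted as one half.

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    # Invented severity scores; label 1 = reference-standard diagnosis present.
    scores = [12, 10, 9, 7, 11, 8, 5, 3, 9, 2]
    labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    print(sens_spec(scores, labels, cutoff=9))
    print(auc(scores, labels))
```

Scanning `sens_spec` over every candidate cutoff and choosing the point where the two values are closest is one simple way to locate a trade-off score of the kind the abstract describes.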
Factor Structure, Factorial Invariance, and Validity of the Multidimensional Shame-Related Response Inventory-21 (MSRI-21)

PubMed Central

Garcia, Antonio F.; Acosta, Melina; Pirani, Saifa; Edwards, Daniel; Osman, Augustine

2017-01-01

We describe 2 studies designed to evaluate scores on the Multidimensional Shame-related Response Inventory-21 (MSRI-21), a recently developed instrument that measures affective and behavioral responses to shame. The inventory assesses shame-related responses in 3 categories: negative self-evaluation, fear of social consequences, and maladaptive behavior tendency. In Study 1, undergraduates (N = 743) completed the MSRI-21. Confirmatory factor analysis supported the validity of the MSRI-21 3-factor structure. Latent variable modeling of coefficient-α provided strong evidence for the internal consistency of scores on each scale. In Study 2, undergraduates (N = 540) completed the instrument along with 5 concurrent measures chosen for clinical significance. Achievement of factorial invariance supported the use of MSRI-21 scale scores to make valid mean comparisons across gender. In addition, MSRI-21 scale scores were associated as expected with scores on measures of self-harm, suicide, and other risk factors. Taken together, results of 2 studies support the internal consistency reliability, factorial validity, factorial invariance, and convergent validity of scores on the MSRI-21. Further work is needed to assess the temporal stability of the MSRI-21 scale scores, invariance across clinical status and other groupings, item-level measurement properties, and viability in highly symptomatic samples. PMID:28182490

The 2002 NIMH Provisional Diagnostic Criteria for Depression of Alzheimer's Disease (PDC-dAD): Gauging their Validity over a Decade Later.

PubMed

Sepehry, Amir A; Lee, Philip E; Hsiung, Ging-Yuek R; Beattie, B Lynn; Feldman, Howard H; Jacova, Claudia

2017-01-01

Presented herein is evidence for criterion, content, and convergent/discriminant validity of the NIMH-Provisional Diagnostic Criteria for depression of Alzheimer's Disease (PDC-dAD) that were formulated to address depression in Alzheimer's disease (AD). Using meta-analytic and systematic review methods, we examined criterion validity evidence in epidemiological and clinical studies comparing the PDC-dAD to Diagnostic and Statistical Manual of Mental Disorders fourth edition (DSM-IV), and International Classification of Disease (ICD 9) depression diagnostic criteria. We estimated prevalence of depression by PDC, DSM, and ICD with an omnibus event rate effect-size. We also examined diagnostic agreement between PDC and DSM. To gauge content validity, we reviewed rates of symptom endorsement for each diagnostic approach. Finally, we examined the PDC's relationship with assessment scales (global cognition, neuropsychiatric, and depression definition) for convergent validity evidence. The aggregate evidence supports the validity of the PDC-dAD. Our findings suggest that depression in AD differs from other depressive disorders including Major Depressive Disorder (MDD) in that dAD is more prevalent, with generally a milder presentation and with unique features not captured by the DSM. Although the PDC are the current standard for diagnosis of depression in AD, we identified the need for their further optimization based on predictive validity evidence.
An initial validation of the Virtual Reality Paced Auditory Serial Addition Test in a college sample.

PubMed

Parsons, Thomas D; Courtney, Christopher G

2014-01-30

Numerous studies have demonstrated that the Paced Auditory Serial Addition Test (PASAT) has utility for the detection of cognitive processing deficits. While the PASAT has demonstrated high levels of internal consistency and test-retest reliability, administration of the PASAT has been known to create undue anxiety and frustration in participants. As a result, degradation of performance may be found on the PASAT. The difficult nature of the PASAT may subsequently decrease the probability of their return for follow up testing. This study is a preliminary attempt at assessing the potential of a PASAT embedded in a virtual reality environment. The Virtual Reality PASAT (VR-PASAT) was compared with a paper-and-pencil version of the PASAT as well as other standardized neuropsychological measures. The two modalities of the PASAT were conducted with a sample of 50 healthy university students, between the ages of 19 and 34 years. Equivalent distributions were found for age, gender, education, and computer familiarity. Moderate relationships were found between VR-PASAT and other putative attentional processing measures. The VR-PASAT was unrelated to indices of learning, memory, or visuospatial processing. Comparison of the VR-PASAT with the traditional paper-and-pencil PASAT indicated that both versions require the examinee to sustain attention at an increasingly demanding, externally determined rate. Results offer preliminary support for the construct validity (in a college sample) of the VR-PASAT as an attentional processing measure and suggest that this task may provide some unique information not tapped by traditional attentional processing tasks.
Investigating Postgraduate College Admission Interviews: Generalizability Theory Reliability and Incremental Predictive Validity

ERIC Educational Resources Information Center

Arce-Ferrer, Alvaro J.; Castillo, Irene Borges

2007-01-01

The use of face-to-face interviews is controversial for college admissions decisions in light of the lack of availability of validity and reliability evidence for most college admission processes. This study investigated reliability and incremental predictive validity of a face-to-face postgraduate college admission interview with a sample of…
Problem-solving style and multicultural personality dispositions: a study of construct validity.

PubMed

Houtz, John C; Ponterotto, Joseph G; Burger, Claudia; Marino, Cherylynn

2010-06-01

This exploratory study examined the relationship between problem-solving styles and multicultural personality dispositions among 91 graduate students enrolled in an urban university located in the northeast United States. Problem-solving style was assessed with the three dimensions of the VIEW: an Assessment of Problem Solving Style. Multicultural personality was assessed with the five-factor Multicultural Personality Questionnaire (MPQ); its factors of Cultural Empathy, Open-mindedness, Social Initiative, and Flexibility correlated significantly with Explorer and External problem-solving styles, as predicted. The Emotional Stability subscale also correlated significantly with scores on Explorer style, suggesting that individuals who prefer "thinking in new directions" in problem solving are more likely to report remaining calm under stressful situations. Collectively, study results provided additional evidence of construct validity for the VIEW.
Generalizability and Validity of a Mathematics Performance Assessment.

ERIC Educational Resources Information Center

Lane, Suzanne; And Others

1996-01-01

Evidence from test results of 3,604 sixth and seventh graders is provided for the generalizability and validity of the Quantitative Understanding: Amplifying Student Achievement and Reasoning (QUASAR) Cognitive Assessment Instrument, which is designed to measure program outcomes and growth in mathematics. (SLD)
Clinical audit project in undergraduate medical education curriculum: an assessment validation study

PubMed Central

Steketee, Carole; Mak, Donna

2016-01-01

Objectives: To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. Methods: A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011-2014). Results: The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students’ and examiners’ response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built into the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach’s alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation > 0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. Conclusions: This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole. PMID:27716612
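The internal consistency statistic cited in this abstract, Cronbach's alpha, can be computed from item-level scores with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is illustrative only: the function name is hypothetical and the scores are invented, not CAP data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of scores per item, aligned so that position j in every
    list belongs to the same respondent.
    """
    k = len(items)
    totals = [sum(per_person) for per_person in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1.0 - item_var_sum / variance(totals))

if __name__ == "__main__":
    # Three invented items scored for five respondents.
    items = [
        [1, 2, 3, 4, 5],
        [2, 3, 4, 5, 6],
        [1, 3, 3, 5, 5],
    ]
    print(cronbach_alpha(items))
```

A value above 0.8, as reported for the CAP scores, is conventionally read as high internal consistency, although the threshold itself is a rule of thumb.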
Clinical audit project in undergraduate medical education curriculum: an assessment validation study.

PubMed

Tor, Elina; Steketee, Carole; Mak, Donna

2016-09-24

To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011-2014). The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students' and examiners' response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built into the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach's alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation > 0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole.
Replacing and Additive Horizontal Gene Transfer in Streptococcus

PubMed Central

Choi, Sang Chul; Rasmussen, Matthew D.; Hubisz, Melissa J.; Gronau, Ilan; Stanhope, Michael J.; Siepel, Adam

2012-01-01

The prominent role of Horizontal Gene Transfer (HGT) in the evolution of bacteria is now well documented, but few studies have differentiated between evolutionary events that predominantly cause genes in one lineage to be replaced by homologs from another lineage (“replacing HGT”) and events that result in the addition of substantial new genomic material (“additive HGT”). Herein, we make use of the distinct phylogenetic signatures of replacing and additive HGTs in a genome-wide study of the important human pathogen Streptococcus pyogenes (SPY) and its close relatives S. dysgalactiae subspecies equisimilis (SDE) and S. dysgalactiae subspecies dysgalactiae (SDD). Using recently developed statistical models and computational methods, we find evidence for abundant gene flow of both kinds within each of the SPY and SDE clades and of reduced levels of exchange between SPY and SDD. In addition, our analysis strongly supports a pronounced asymmetry in SPY–SDE gene flow, favoring the SPY-to-SDE direction. This finding is of particular interest in light of the recent increase in virulence of pathogenic SDE. We find much stronger evidence for SPY–SDE gene flow among replacing than among additive transfers, suggesting a primary influence from homologous recombination between co-occurring SPY and SDE cells in human hosts. Putative virulence genes are correlated with transfer events, but this correlation is found to be driven by additive, not replacing, HGTs. The genes affected by additive HGTs are enriched for functions having to do with transposition, recombination, and DNA integration, consistent with previous findings, whereas replacing HGTs seem to influence a more diverse set of genes. Additive transfers are also found to be associated with evidence of positive selection. These findings shed new light on the manner in which HGT has shaped pathogenic bacterial genomes. PMID:22617954
Evaluating Convergent and Discriminant Validity of Temperament Questionnaires for Preschoolers, Toddlers, and Infants.

ERIC Educational Resources Information Center

Goldsmith, H. H.; And Others

1991-01-01

Examined convergent and discriminant validity of eight widely used preschooler, toddler, and infant temperament questionnaires. There was surprisingly strong evidence for convergence among scales intended to measure similar concepts, with most convergent validity coefficients falling in the .50s, .60s, and .70s. (SH)
GJ 581 update: Additional evidence for a Super-Earth in the habitable zone

NASA Astrophysics Data System (ADS)

Vogt, S. S.; Butler, R. P.; Haghighipour, N.

2012-08-01

We present an analysis of the significantly expanded HARPS 2011 radial velocity data set for GJ 581 that was presented by Forveille et al. (2011). Our analysis reaches substantially different conclusions regarding the evidence for a Super-Earth-mass planet in the star's Habitable Zone. We were able to reproduce their reported reduced χ² and RMS values only after removing some outliers from their models and refitting the trimmed down RV set. A suite of 4000 N-body simulations of their Keplerian model all resulted in unstable systems and revealed that their reported 3.6σ detection of e=0.32 for the eccentricity of GJ 581e is manifestly incompatible with the system's dynamical stability. Furthermore, their Keplerian model, when integrated only over the time baseline of the observations, significantly increases the reduced χ² and demonstrates the need for including non-Keplerian orbital precession when modeling this system. We find that a four-planet model with all of the planets on circular or nearly circular orbits provides both an excellent self-consistent fit to their RV data and also results in a very stable configuration. The periodogram of the residuals to a 4-planet all-circular-orbit model reveals significant peaks that suggest one or more additional planets in this system. We conclude that the present 240-point HARPS data set, when analyzed in its entirety, and modeled with fully self-consistent stable orbits, by and of itself does offer significant support for a fifth signal in the data with a period near 32 days. This signal has a false alarm probability of <4% and is consistent with a planet of minimum mass 2.2 M_⊕, orbiting squarely in the star's habitable zone at 0.13 AU, where liquid water on planetary surfaces is a distinct possibility.
Assessing young children's intention-reading in authentic communicative contexts: preliminary evidence and clinical utility.

PubMed

Greenslade, Kathryn J; Coggins, Truman E

2014-01-01

Identifying what a communication partner is looking at (referential intention) and why (social intention) is essential to successful social communication, and may be challenging for children with social communication deficits. This study explores a clinical task that assesses these intention-reading abilities within an authentic context. To gather evidence of the task's reliability and validity, and to discuss its clinical utility. The intention-reading task was administered to twenty 4-7-year-olds with typical development (TD) and ten with autism spectrum disorder (ASD). Task items were embedded in an authentic activity, and they targeted the child's ability to identify the examiner's referential and social intentions, which were communicated through joint attention behaviours. Reliability and construct validity evidence were addressed using established psychometric methods. Reliability and validity evidence supported the use of task scores for identifying children whose intention-reading warranted concern. Evidence supported the reliability of task administration and coding, and item-level codes were highly consistent with overall task performance. Supporting task validity, group differences aligned with predictions, with children with ASD exhibiting poorer and more variable task scores than children with TD. Also, as predicted, task scores correlated significantly with verbal mental age and ratings of parental concerns regarding social communication abilities. The evidence provides preliminary support for the reliability and validity of the clinical task's scores in assessing young children's real-time intention-reading abilities, which are essential for successful interactions in school and beyond. © 2014 Royal College of Speech and Language Therapists.
Official Position of the American Academy of Clinical Neuropsychology Social Security Administration Policy on Validity Testing: Guidance and Recommendations for Change.

PubMed

Chafetz, M D; Williams, M A; Ben-Porath, Y S; Bianchini, K J; Boone, K B; Kirkwood, M W; Larrabee, G J; Ord, J S

2015-01-01

The milestone publication by Slick, Sherman, and Iverson (1999) of criteria for determining malingered neurocognitive dysfunction led to extensive research on validity testing. Position statements by the National Academy of Neuropsychology and the American Academy of Clinical Neuropsychology (AACN) recommended routine validity testing in neuropsychological evaluations. Despite this widespread scientific and professional support, the Social Security Administration (SSA) continued to discourage validity testing, a stance that led to a congressional initiative for SSA to reevaluate their position. In response, SSA commissioned the Institute of Medicine (IOM) to evaluate the science concerning the validation of psychological testing. The IOM concluded that validity assessment was necessary in psychological and neuropsychological examinations (IOM, 2015). The AACN sought to provide independent expert guidance and recommendations concerning the use of validity testing in disability determinations. A panel of contributors to the science of validity testing and its application to the disability process was charged with describing why the disability process for SSA needs improvement, and indicating the necessity for validity testing in disability exams. This work showed how the determination of malingering is a probability proposition, described how different types of validity tests are appropriate, provided evidence concerning non-credible findings in children and low-functioning individuals, and discussed the appropriate evaluation of pain disorders typically seen outside of mental consultations. A scientific plan for validity assessment that additionally protects test security is needed in disability determinations and in research on classification accuracy of disability decisions.
Effort, symptom validity testing, performance validity testing and traumatic brain injury.

PubMed

Bigler, Erin D

2014-01-01

To understand the neurocognitive effects of brain injury, valid neuropsychological test findings are paramount. This review examines the research on what has been referred to as symptom validity testing (SVT). Above a designated cut-score signifies a 'passing' SVT performance which is likely the best indicator of valid neuropsychological test findings. Likewise, substantially below cut-point performance that nears chance or is at chance signifies invalid test performance. Significantly below chance is the sine qua non neuropsychological indicator for malingering. However, the interpretative problems with SVT performance below the cut-point yet far above chance are substantial, as pointed out in this review. This intermediate, border-zone performance on SVT measures is where substantial interpretative challenges exist. Case studies are used to highlight the many areas where additional research is needed. Historical perspectives are reviewed along with the neurobiology of effort. Reasons why performance validity testing (PVT) may be better than the SVT term are reviewed. Advances in neuroimaging techniques may be key in better understanding the meaning of border zone SVT failure. The review demonstrates the problems with rigidity in interpretation with established cut-scores. A better understanding of how certain types of neurological, neuropsychiatric and/or even test conditions may affect SVT performance is needed.
Supervisor Health and Safety Support: Scale Development and Validation

PubMed Central

Butts, Marcus M.; Hurst, Carrie S.; Eby, Lillian T.

2013-01-01

Executive Summary Two studies were conducted to develop a psychometrically sound measure of supervisor health and safety support (SHSS). We identified three dimensions of supervisor support (physical health, psychological health, safety) and used Study 1 to develop items and establish content validity. Study 2 was used to establish the dimensionality of the new measure and provide criterion-related and discriminant validity evidence of the measure using supervisor and subordinate data. The measure had incremental validity in predicting employee performance and psychological strain outcomes above and beyond general work support variables. Implications of these findings and for workplace support theory and practice are discussed. PMID:24771991
75 FR 62462 - Additions to the List of Validated End-Users in the People's Republic of China: Hynix...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-10-12

...In this final rule, the Bureau of Industry and Security amends the Export Administration Regulations (EAR) to add three end-users, Hynix Semiconductor (China) Ltd., Hynix Semiconductor (Wuxi) Ltd. and Lam Research Corporation to the list of validated end-users in the People's Republic of China (PRC). With this rule, exports, reexports and transfers (in-country) of certain items to one facility of Hynix Semiconductor (China) Ltd., one facility of Hynix Semiconductor (Wuxi) Ltd. and nine facilities of Lam Research Corporation in the PRC are now authorized under Authorization Validated End-User (VEU).
Reaction time as an indicator of insufficient effort: Development and validation of an embedded performance validity parameter.

PubMed

Stevens, Andreas; Bahlo, Simone; Licha, Christina; Liske, Benjamin; Vossler-Thies, Elisabeth

2016-11-30

Subnormal performance in attention tasks may result from various sources including lack of effort. In this report, the derivation and validation of a performance validity parameter for reaction time is described, using a set of malingering-indices ("Slick-criteria"), and 3 independent samples of participants (total n =893). The Slick-criteria yield an estimate of the probability of malingering based on the presence of an external incentive, evidence from neuropsychological testing, from self-report and clinical data. In study (1) a validity parameter is derived using reaction time data of a sample, composed of inpatients with recent severe brain lesions not involved in litigation and of litigants with and without brain lesion. In study (2) the validity parameter is tested in an independent sample of litigants. In study (3) the parameter is applied to an independent sample comprising cooperative and non-cooperative testees. Logistic regression analysis led to a derived validity parameter based on median reaction time and standard deviation. It performed satisfactorily in studies (2) and (3) (study 2 sensitivity=0.94, specificity=1.00; study 3 sensitivity=0.79, specificity=0.87). The findings suggest that median reaction time and standard deviation may be used as indicators of negative response bias. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
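The sensitivity and specificity figures quoted in the abstract above follow from a standard 2x2 classification table. The sketch below is only an illustration of that arithmetic: the function name and the counts passed to it are hypothetical and are not taken from the study, which does not report its raw cell counts here.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 classification counts.

    tp: non-credible performers flagged by the parameter (true positives)
    fn: non-credible performers missed (false negatives)
    tn: credible performers passed (true negatives)
    fp: credible performers wrongly flagged (false positives)
    """
    sensitivity = tp / (tp + fn)  # share of true cases that are detected
    specificity = tn / (tn + fp)  # share of non-cases that are cleared
    return sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the calculation.
sens, spec = sensitivity_specificity(tp=47, fn=3, tn=40, fp=0)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A specificity of 1.00 means no credible examinee was flagged in that sample; it says nothing about how the parameter behaves in populations with a different base rate of insufficient effort.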
A New Method for Analyzing Content Validity Data Using Multidimensional Scaling

ERIC Educational Resources Information Center

Li, Xueming; Sireci, Stephen G.

2013-01-01

Validity evidence based on test content is of essential importance in educational testing. One source for such evidence is an alignment study, which helps evaluate the congruence between tested objectives and those specified in the curriculum. However, the results of an alignment study do not always sufficiently capture the degree to which a test…
Validating Measurement of Knowledge Integration in Science Using Multiple-Choice and Explanation Items

ERIC Educational Resources Information Center

Lee, Hee-Sun; Liu, Ou Lydia; Linn, Marcia C.

2011-01-01

This study explores measurement of a construct called knowledge integration in science using multiple-choice and explanation items. We use construct and instructional validity evidence to examine the role multiple-choice and explanation items play in measuring students' knowledge integration ability. For construct validity, we analyze item…
Associations among Classroom Emotional Processes, Student Interest, and Engagement: A Convergent Validity Test

ERIC Educational Resources Information Center

Mazer, Joseph P.

2017-01-01

The results of this study compile convergent validity evidence for the Student Interest Scale and Student Engagement Scale through associations among emotional support, emotion work, student interest, and engagement. Confirmatory factor analysis indicates that the factor structures of the measures are stable, reliable, and valid. The results…
Validation of Aura Data: Needs and Implementation

NASA Astrophysics Data System (ADS)

Froidevaux, L.; Douglass, A. R.; Schoeberl, M. R.; Hilsenrath, E.; Kinnison, D. E.; Kroon, M.; Sander, S. P.

2003-12-01

We describe the needs for validation of the Aura scientific data products expected in 2004 and for several years thereafter, as well as the implementation plan to fulfill these needs. Many profiles of stratospheric and tropospheric composition are expected from the combination of four instruments aboard Aura, along with column abundances, aerosol and cloud information. The Aura validation working group and the Aura Project have been developing programs and collaborations that are expected to lead to a significant number of validation activities after the Aura launch (in early 2004). Spatial and temporal variability in the lower stratosphere and troposphere present challenges to validation of Aura measurements even where cloud contamination effects can be minimized. Data from ground-based networks, balloons, and other satellites will contribute in a major way to Aura data validation. In addition, plans are in place to obtain correlative data for special conditions, such as profiles of O3 and NO2 in polluted areas. Several aircraft campaigns planned for the 2004-2007 time period will provide additional tropospheric and lower stratospheric validation opportunities for Aura; some atmospheric science goals will be addressed by the eventual combination of these data sets. A team of "Aura liaisons" will assist in the dissemination of information about various correlative measurements to be expected in the above timeframe, along with any needed protocols and agreements on data exchange and file formats. A data center is being established at the Goddard Space Flight Center to collect and distribute the various data files to be used in the validation of the Aura data.

Observational evidence and strength of evidence domains: case examples

PubMed Central

2014-01-01

Background Systematic reviews of healthcare interventions most often focus on randomized controlled trials (RCTs). However, certain circumstances warrant consideration of observational evidence, and such studies are increasingly being included as evidence in systematic reviews. Methods To illustrate the use of observational evidence, we present case examples of systematic reviews in which observational evidence was considered as well as case examples of individual observational studies, and how they demonstrate various strength of evidence domains in accordance with current Agency for Healthcare Research and Quality (AHRQ) Evidence-based Practice Center (EPC) methods guidance. Results In the presented examples, observational evidence is used when RCTs are infeasible or raise ethical concerns, lack generalizability, or provide insufficient data. Individual study case examples highlight how observational evidence may fulfill required strength of evidence domains, such as study limitations (reduced risk of selection, detection, performance, and attrition); directness; consistency; precision; and reporting bias (publication, selective outcome reporting, and selective analysis reporting), as well as additional domains of dose-response association, plausible confounding that would decrease the observed effect, and strength of association (magnitude of effect). Conclusions The cases highlighted in this paper demonstrate how observational studies may provide moderate to (rarely) high strength evidence in systematic reviews. PMID:24758494
Convergent and Discriminant Validity of the Microcomputer Evaluation Screening and Assessment (MESA) Interest Survey.

ERIC Educational Resources Information Center

Janikowski, Timothy P.; And Others

1990-01-01

Examined construct validity of Microcomputer Evaluation Screening and Assessment (MESA) Interest Survey. Administered MESA and United States Employment Service (USES) Interest Inventory to 74 volunteer rehabilitation clients. Evidence supported convergent and discriminant validity of MESA. Found fewer significant intercorrelations among MESA…
The brief negative symptom scale: validation of the German translation and convergent validity with self-rated anhedonia and observer-rated apathy.

PubMed

Bischof, Martin; Obermann, Caitriona; Hartmann, Matthias N; Hager, Oliver M; Kirschner, Matthias; Kluge, Agne; Strauss, Gregory P; Kaiser, Stefan

2016-11-22

Negative symptoms are considered core symptoms of schizophrenia. The Brief Negative Symptom Scale (BNSS) was developed to measure this symptomatic dimension according to a current consensus definition. The present study examined the psychometric properties of the German version of the BNSS. To expand former findings on convergent validity, we employed the Temporal Experience Pleasure Scale (TEPS), a hedonic self-report that distinguishes between consummatory and anticipatory pleasure. Additionally, we addressed convergent validity with observer-rated assessment of apathy with the Apathy Evaluation Scale (AES), which was completed by the patient's primary nurse. Data were collected from 75 in- and outpatients from the Psychiatric Hospital, University Zurich diagnosed with either schizophrenia or schizoaffective disorder. We assessed convergent and discriminant validity, internal consistency and inter-rater reliability. We largely replicated the findings of the original version showing good psychometric properties of the BNSS. In addition, the primary nurse's evaluation correlated moderately with interview-based clinician rating. BNSS anhedonia items showed good convergent validity with the TEPS. Overall, the German BNSS shows good psychometric properties comparable to the original English version. Convergent validity extends beyond interview-based assessments of negative symptoms to self-rated anhedonia and observer-rated apathy.
Reliability and validity of the Brief Pain Inventory in individuals with chronic obstructive pulmonary disease.

PubMed

Chen, Y-W; HajGhanbari, B; Road, J D; Coxson, H O; Camp, P G; Reid, W D

2018-06-08

Pain is prevalent in chronic obstructive pulmonary disease (COPD) and the Brief Pain Inventory (BPI) appears to be a feasible questionnaire to assess this symptom. However, the reliability and validity of the BPI have not been determined in individuals with COPD. This study aimed to determine the internal consistency, test-retest reliability and validity (construct, convergent, divergent and discriminant) of the BPI in individuals with COPD. In order to examine the test-retest reliability, individuals with COPD were recruited from pulmonary rehabilitation programmes to complete the BPI twice 1 week apart. In order to investigate validity, de-identified data was retrieved from two previous studies, including forced expiratory volume in 1-s, age, sex and data from four questionnaires: the BPI, short-form McGill Pain Questionnaire (SF-MPQ), 36-Item Short Form Survey (SF-36) and Community Health Activities Model Program for Seniors (CHAMPS) questionnaire. In total, 123 participants were included in the analyses (eligible data were retrieved from 86 participants and additional 37 participants were recruited). The BPI demonstrated excellent internal consistency and test-retest reliability. It also showed convergent validity with the SF-MPQ and divergent validity with the SF-36. The factor analysis yielded two factors of the BPI, which demonstrated that the two domains of the BPI measure the intended constructs. The BPI can also discriminate pain levels among COPD patients with varied levels of quality of life (SF-36) and physical activity (CHAMPS). The BPI is a reliable and valid pain questionnaire that can be used to evaluate pain in COPD. This study formally established the reliability and validity of the BPI in individuals with COPD, which have not been determined in this patient group. The results of this study provide strong evidence that assessment results from this pain questionnaire are reliable and valid. © 2018 European Pain Federation - EFIC®.
Evidence-based medicine and contemporary certification: Analysis of the American Board of Vascular Medicine endovascular board examination.

PubMed

Slovut, David Paul; Gray, Bruce H; Saiar, Amin; Bates, Mark C

2017-08-01

Since 2005, the American Board of Vascular Medicine (ABVM) endovascular examination has been used to certify vascular practitioners. Annual rigorous review has confirmed it is psychometrically valid and reliable. However, the evidence basis underlying the examination items has not been studied systematically. The aim of this study was to adjudicate class of recommendation (COR) and level of evidence (LOE) for the 2015 ABVM endovascular examination and establish an additional feedback mechanism for examination improvement based on contemporary evidence-based guidelines. We performed a pooled consensus process to classify each of the 110 items in the 2015 ABVM endovascular examination by COR and LOE as detailed in the current guideline statements. We added additional categories for items that were not eligible for assignment using traditional current evidence-based metrics: 'COR X', cannot be determined, not applicable, or simple recognition; and 'LOE X', cannot be determined or not applicable. COR classifications were assigned in the following proportion: Class I=15%, Class II=40%, Class III=3%, COR X=42%. LOE classifications were assigned in the following proportion: Level A=12%, Level B=34%, Level C=32%, LOE X=22%. Our analysis showed that nearly half of the 2015 ABVM endovascular examination items were supported by strong scientific evidence or fact-based knowledge. COR and LOE analysis yielded notably different results. Use of alternate classification schema may be powerful tools for improving certification exams in healthcare.
Drive: Theory and Construct Validation

PubMed Central

Petrides, K. V.

2016-01-01

This article explicates the theory of drive and describes the development and validation of two measures. A representative set of drive facets was derived from an extensive corpus of human attributes (Study 1). Operationalised using an International Personality Item Pool version (the Drive:IPIP), a three-factor model was extracted from the facets in two samples and confirmed on a third sample (Study 2). The multi-item IPIP measure showed congruence with a short form, based on single-item ratings of the facets, and both demonstrated cross-informant reliability. Evidence also supported the measures’ convergent, discriminant, concurrent, and incremental validity (Study 3). Based on very promising findings, the authors hope to initiate a stream of research in what is argued to be a rather neglected niche of individual differences and non-cognitive assessment. PMID:27409773
Validation of the PedsQL Epilepsy Module: A pediatric epilepsy-specific health-related quality of life measure.

PubMed

Modi, Avani C; Junger, Katherine F; Mara, Constance A; Kellermann, Tanja; Barrett, Lauren; Wagner, Janelle; Mucci, Grace A; Bailey, Laurie; Almane, Dace; Guilfoyle, Shanna M; Urso, Lauryn; Hater, Brooke; Hustzi, Heather; Smith, Gigi; Herrmann, Bruce; Perry, M Scott; Zupanc, Mary; Varni, James W

2017-11-01

To validate a brief and reliable epilepsy-specific, health-related quality of life (HRQOL) measure in children with various seizure types, treatments, and demographic characteristics. This national validation study was conducted across five epilepsy centers in the United States. Youth 5-18 years and caregivers of youth 2-18 years diagnosed with epilepsy completed the PedsQL Epilepsy Module and additional questionnaires to establish reliability and validity of the epilepsy-specific HRQOL instrument. Demographic and medical data were collected through chart reviews. Factor analysis was conducted, and internal consistency (Cronbach's alphas), test-retest reliability, and construct validity were assessed. Questionnaires were analyzed from 430 children with epilepsy (M age = 9.9 years; range 2-18 years; 46% female; 62% white: non-Hispanic; 76% monotherapy, 54% active seizures) and their caregivers. The final PedsQL Epilepsy Module is a 29-item measure with five subscales (i.e., Impact, Cognitive, Sleep, Executive Functioning, and Mood/Behavior) with parallel child and caregiver reports. Internal consistency coefficients ranged from 0.70-0.94. Construct validity and convergence was demonstrated in several ways, including strong relationships with seizure outcomes, antiepileptic drug (AED) side effects, and well-established measures of executive, cognitive, and emotional/behavioral functioning. The PedsQL Epilepsy Module is a reliable measure of HRQOL with strong evidence of its validity across the epilepsy spectrum in both clinical and research settings. Wiley Periodicals, Inc. © 2017 International League Against Epilepsy.
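The internal consistency coefficients mentioned in the abstract above are Cronbach's alphas. As a rough illustration of how such a coefficient is computed, the sketch below applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to made-up item scores. The function name and data are hypothetical and have no connection to the PedsQL data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per item, each holding that item's
    scores for the same respondents in the same order.
    """
    k = len(items)
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(scores) for scores in items)
    # Total scale score per respondent (sum across items).
    totals = [sum(resp) for resp in zip(*items)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical 3-item scale answered by 5 respondents.
scores = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 5, 1],
]
print(round(cronbach_alpha(scores), 3))
```

Alpha rises toward 1 as items covary more strongly, which is why values in the 0.70 to 0.94 range are usually read as acceptable to excellent internal consistency.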
Validation of GC and HPLC systems for residue studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, M.

1995-12-01

For residue studies, GC and HPLC system performance must be validated prior to and during use. One excellent measure of system performance is the standard curve and associated chromatograms used to construct that curve. The standard curve is a model of system response to an analyte over a specific time period, and is prima facie evidence of system performance beginning at the auto sampler and proceeding through the injector, column, detector, electronics, data-capture device, and printer/plotter. This tool measures the performance of the entire chromatographic system; its power negates most of the benefits associated with costly and time-consuming validation of individual system components. Other measures of instrument and method validation will be discussed, including quality control charts and experimental designs for method validation.
Reynolds Adolescent Depression Scale - Second Edition: initial validation of the Korean version.

PubMed

Hyun, Myung-Sun; Nam, Kyoung-A; Kang, Hee Sun; Reynolds, William M

2009-03-01

This paper is a report of a study conducted to test the validity and reliability of the Reynolds Adolescent Depression Scale - Second Edition in Korean culture. Depression is a significant mental health problem in adolescents. The Reynolds Adolescent Depression Scale - Second Edition has been shown to be a useful tool to assess depression in adolescents, with extensive research on this measure having been conducted in western cultures. Measures developed in western cultures need to be tested and validated before being used in Asian cultures. The participants were a convenience sample of 440 Korean adolescents with a mean age of 13.78 years (sd = 0.95) from grades 7 to 9 in three public middle schools in South Korea. A cross-sectional design was used. Back-translation was used to create the Korean version, with additional testing for cultural meaning and comprehension. The data were collected at the end of 2004. Internal consistency reliability for the Korean version of the Reynolds Adolescent Depression Scale - Second Edition was 0.89, with subscale reliability ranging from 0.66 to 0.81. Evidence for criterion-related, convergent and discriminant validity for the Korean version of the Reynolds Adolescent Depression Scale - Second Edition was found. Confirmatory factor analysis supported the 4-factor structure of Reynolds Adolescent Depression Scale - Second Edition. Our results support the validity and reliability for the Korean version of the Reynolds Adolescent Depression Scale - Second Edition as a measure of depression and suggest that it can be used to screen students and to evaluate the effectiveness of preventive interventions in school settings.
Validity of the Foot and Ankle Ability Measure in athletes with chronic ankle instability.

PubMed

Carcia, Christopher R; Martin, RobRoy L; Drouin, Joshua M

2008-01-01

The Foot and Ankle Ability Measure (FAAM) is a region-specific, non-disease-specific outcome instrument that possesses many of the clinimetric qualities recommended for an outcome instrument. Evidence of validity to support the use of the FAAM is available in individuals with a wide array of ankle and foot disorders. However, additional evidence to support the use of the FAAM for those with chronic ankle instability (CAI) is needed. To provide evidence of construct validity for the FAAM based on hypothesis testing in athletes with CAI. Between-groups comparison. Athletic training room. Thirty National Collegiate Athletic Association Division II athletes (16 men, 14 women) from one university. The FAAM including activities of daily living (ADL) and sports subscales and the global and categorical ratings of function. For both the ADL and sports subscales, FAAM scores were greater in healthy participants (100 +/- 0.0 and 99 +/- 3.5, respectively) than in subjects with CAI (88 +/- 7.7 and 76 +/- 12.7, respectively; P < .001). Similarly, for both ADL and sports subscales, FAAM scores were greater in athletes who indicated that their ankles were normal (98 +/- 6.3 and 96 +/- 6.9, respectively) than in those who classified their ankles as either nearly normal or abnormal (87 +/- 6.6 and 71 +/- 11.1, respectively; P < .001). We found relationships between FAAM scores and self-reported global ratings of function for both ADL and sports subscales. Relationships were stronger when all athletes, rather than just those with CAI, were included in the analyses. The FAAM may be used to detect self-reported functional deficits related to CAI.
A Confirmatory Factor Analysis of the Student Evidence-Based Practice Questionnaire (S-EBPQ) in an Australian sample.

PubMed

Beccaria, Lisa; Beccaria, Gavin; McCosker, Catherine

2018-03-01

It is crucial that nursing students develop skills and confidence in using Evidence-Based Practice principles early in their education. This should be assessed with valid tools; however, to date, few measures have been developed and applied to the student population. To examine the structural validity of the Student Evidence-Based Practice Questionnaire (S-EBPQ), with an Australian online nursing student cohort. A cross-sectional study for constructing validity. Three hundred and forty-five undergraduate nursing students from an Australian regional university were recruited across two semesters. Confirmatory Factor Analysis was used to examine the structural validity. Confirmatory Factor Analysis was applied which resulted in a good fitting model, based on a revised 20-item tool. The S-EBPQ tool remains a psychometrically robust measure of evidence-based practice use, attitudes, and knowledge and skills and can be applied in an online Australian student context. The findings of this study provided further evidence of the reliability and four factor structure of the S-EBPQ. Opportunities for further refinement of the tool may result in improvements in structural validity. Copyright © 2018 Elsevier Ltd. All rights reserved.
Development and Validation of the Organizational Dissent Scale.

ERIC Educational Resources Information Center

Kassing, Jeffrey W.

1998-01-01

Develops a measure for operationalizing how employees verbally express their contradictory opinions and disagreements about organizational phenomena. Tests the Organizational Dissent Scale (ODS) in a series of studies designed to generate evidence of validity/reliability for the measure. Indicates that the scale measures how employees express…
Valid methods: the quality assurance of test method development, validation, approval, and transfer for veterinary testing laboratories.

PubMed

Wiegers, Ann L

2003-07-01

Third-party accreditation is a valuable tool to demonstrate a laboratory's competence to conduct testing. Accreditation, internationally and in the United States, has been discussed previously. However, accreditation is only 1 part of establishing data credibility. A validated test method is the first component of a valid measurement system. Validation is defined as confirmation by examination and the provision of objective evidence that the particular requirements for a specific intended use are fulfilled. The international and national standard ISO/IEC 17025 recognizes the importance of validated methods and requires that laboratory-developed methods or methods adopted by the laboratory be appropriate for the intended use. Validated methods are therefore required and their use agreed to by the client (i.e., end users of the test results such as veterinarians, animal health programs, and owners). ISO/IEC 17025 also requires that the introduction of methods developed by the laboratory for its own use be a planned activity conducted by qualified personnel with adequate resources. This article discusses considerations and recommendations for the conduct of veterinary diagnostic test method development, validation, evaluation, approval, and transfer to the user laboratory in the ISO/IEC 17025 environment. These recommendations are based on those of nationally and internationally accepted standards and guidelines, as well as those of reputable and experienced technical bodies. They are also based on the author's experience in the evaluation of method development and transfer projects, validation data, and the implementation of quality management systems in the area of method development.
Assessing Attachment Security With the Attachment Q Sort: Meta-Analytic Evidence for the Validity of the Observer AQS

ERIC Educational Resources Information Center

van IJzendoorn, Marinus H.; Vereijken, Carolus M.J.L.; Bakermans-Kranenburg, Marian J.; Riksen-Walraven, Marianne J.

2004-01-01

The reliability and validity of the Attachment Q Sort (AQS; Waters & Deane, 1985) were tested in a series of meta-analyses on 139 studies with 13,835 children. The observer AQS security score showed convergent validity with Strange Situation procedure (SSP) security (r = .31) and excellent predictive validity with sensitivity measures (r = .39). Its…
Validation and Estimation of Additive Genetic Variation Associated with DNA Tests for Quantitative Beef Cattle Traits

USDA-ARS's Scientific Manuscript database

The U.S. National Beef Cattle Evaluation Consortium (NBCEC) has been involved in the validation of commercial DNA tests for quantitative beef quality traits since their first appearance on the U.S. market in the early 2000s. The NBCEC Advisory Council initially requested that the NBCEC set up a syst...
Extensive validation of the pain disability index in 3 groups of patients with musculoskeletal pain.

PubMed

Soer, Remko; Köke, Albère J A; Vroomen, Patrick C A J; Stegeman, Patrick; Smeets, Rob J E M; Coppes, Maarten H; Reneman, Michiel F

2013-04-20

A cross-sectional study was performed to validate the pain disability index (PDI) extensively in 3 groups of patients with musculoskeletal pain. The PDI is a widely used and studied instrument for disability related to various pain syndromes, although there is conflicting evidence concerning factor structure, test-retest reliability, and missing items. Additionally, an official translation of the Dutch language version has never been performed. For reliability, internal consistency, factor structure, test-retest reliability, and measurement error were calculated. Validity was tested with hypothesized correlations with pain intensity, kinesiophobia, Rand-36 subscales, Depression, Roland-Morris Disability Questionnaire, Quality of Life, and Work Status. Structural validity was tested with independent backward translation and approval from the original authors. One hundred seventy-eight patients with acute back pain, 425 patients with chronic low back pain, and 365 with widespread pain were included. Internal consistency of the PDI was good. One factor was identified with factor analyses. Test-retest reliability was good for the PDI (intraclass correlation coefficient, 0.76). Standard error of measurement was 6.5 points and smallest detectable change was 17.9 points. Little correlation was observed between the PDI and kinesiophobia and depression; fair correlations were observed with pain intensity, work status, and vitality, and moderate correlations with the Rand-36 subscales and the Roland-Morris Disability Questionnaire. The PDI-Dutch language version is internally consistent as a 1-factor structure, and test-retest reliable. Missing items seem high in sexual and professional items. Using the PDI as a 2-factor questionnaire has no additional value and is unreliable.
Evidence-based surgery: barriers, solutions, and the role of evidence synthesis.

PubMed

Garas, George; Ibrahim, Amel; Ashrafian, Hutan; Ahmed, Kamran; Patel, Vanash; Okabayashi, Koji; Skapinakis, Petros; Darzi, Ara; Athanasiou, Thanos

2012-08-01

Surgery is a rapidly evolving field, making the rigorous testing of emerging innovations vital. However, most surgical research fails to employ randomized controlled trials (RCTs) and has particularly been based on low-quality study designs. Subsequently, the analysis of data through meta-analysis and evidence synthesis is particularly difficult. Through a systematic review of the literature, this article explores the barriers to achieving a strong evidence base in surgery and offers potential solutions to overcome the barriers. Many barriers exist to evidence-based surgical research. They include enabling factors, such as funding, time, infrastructure, patient preference, ethical issues, and additionally barriers associated with specific attributes related to researchers, methodologies, or interventions. Novel evidence synthesis techniques in surgery are discussed, including graphics synthesis, treatment networks, and network meta-analyses that help overcome many of the limitations associated with existing techniques. They offer the opportunity to assess gaps and quantitatively present inconsistencies within the existing evidence of RCTs. Poorly or inadequately performed RCTs and meta-analyses can give rise to incorrect results and thus fail to inform clinical practice or revise policy. The above barriers can be overcome by providing academic leadership and good organizational support to ensure that adequate personnel, resources, and funding are allocated to the researcher. Training in research methodology and data interpretation can ensure that trials are conducted correctly and evidence is adequately synthesized and disseminated. The ultimate goal of overcoming the barriers to evidence-based surgery includes the improved quality of patient care in addition to enhanced patient outcomes.
A Validity Agenda for Growth Models: One Size Doesn't Fit All!

ERIC Educational Resources Information Center

Patelis, Thanos

2012-01-01

This is a keynote presentation given at AERA on developing a validity agenda for growth models in a large scale (e.g., state) setting. The emphasis of this presentation was to indicate that growth models and the validity agenda designed to provide evidence in supporting the claims to be made need to be personalized to meet the local or…
Analytical Validation of a Portable Mass Spectrometer Featuring Interchangeable, Ambient Ionization Sources for High Throughput Forensic Evidence Screening

NASA Astrophysics Data System (ADS)

Lawton, Zachary E.; Traub, Angelica; Fatigante, William L.; Mancias, Jose; O'Leary, Adam E.; Hall, Seth E.; Wieland, Jamie R.; Oberacher, Herbert; Gizzi, Michael C.; Mulligan, Christopher C.

2017-06-01

Forensic evidentiary backlogs are indicative of the growing need for cost-effective, high-throughput instrumental methods. One such emerging technology that shows high promise in meeting this demand while also allowing on-site forensic investigation is portable mass spectrometric (MS) instrumentation, particularly that which enables the coupling to ambient ionization techniques. While the benefits of rapid, on-site screening of contraband can be anticipated, the inherent legal implications of field-collected data necessitate that the analytical performance of technology employed be commensurate with accepted techniques. To this end, comprehensive analytical validation studies are required before broad incorporation by forensic practitioners can be considered, and are the focus of this work. Pertinent performance characteristics such as throughput, selectivity, accuracy/precision, method robustness, and ruggedness have been investigated. Reliability in the form of false positive/negative response rates is also assessed, examining the effect of variables such as user training and experience level. To provide flexibility toward broad chemical evidence analysis, a suite of rapidly-interchangeable ion sources has been developed and characterized through the analysis of common illicit chemicals and emerging threats like substituted phenethylamines.
Validation in the clinical process: four settings for objectification of the subjectivity of understanding.

PubMed

Beland, H

1994-12-01

Clinical material is presented for discussion with the aim of exemplifying the author's conceptions of validation in a number of sessions and in psychoanalytic research and of making them verifiable, susceptible to consensus and/or falsifiable. Since Freud's postscript to the Dora case, the first clinical validation in the history of psychoanalysis, validation has been group-related and society-related, that is to say, it combines the evidence of subjectivity with the consensus of the research community (the scientific community). Validation verifies the conformity of the unconscious transference meaning with the analyst's understanding. The deciding criterion is the patient's reaction to the interpretation. In terms of the theory of science, validation in the clinical process corresponds to experimental testing of truth in the sphere of inanimate nature. Four settings of validation can be distinguished: the analyst's self-supervision during the process of understanding, which goes from incomprehension to comprehension (container-contained, PS-->D, selected fact); the patient's reaction to the interpretation (insight) and the analyst's assessment of the reaction; supervision and second thoughts; and discussion in groups and publications leading to consensus. It is a peculiarity of psychoanalytic research that in the event of positive validation the three criteria of truth (evidence, consensus and utility) coincide.
