Sample records for valid assessment system

  1. Validation of a National Teacher Assessment and Improvement System

    ERIC Educational Resources Information Center

    Taut, Sandy; Santelices, Maria Veronica; Stecher, Brian

    2012-01-01

    The task of validating a teacher assessment and improvement system is similar whether the system operates in the United States or in another country. Chile has a national teacher evaluation system (NTES) that is standards based, uses multiple instruments, and is intended to serve both formative and summative purposes. For the past 6 years the…

  2. Five-level emergency triage systems: variation in assessment of validity.

    PubMed

    Kuriyama, Akira; Urushidani, Seigo; Nakayama, Takeo

    2017-11-01

    Triage systems are scales developed to rate the degree of urgency among patients who arrive at EDs. A number of different scales are in use; however, the way in which they have been validated is inconsistent. Also, it is difficult to define a surrogate that accurately predicts urgency. This systematic review described reference standards and measures used in previous validation studies of five-level triage systems. We searched PubMed, EMBASE and CINAHL to identify studies that had assessed the validity of five-level triage systems and described the reference standards and measures applied in these studies. Studies were divided into those using criterion validity (reference standards developed by expert panels or triage systems already in use) and those using construct validity (prognosis, costs and resource use). A total of 57 studies examined criterion and construct validity of 14 five-level triage systems. Criterion validity was examined by evaluating (1) agreement between the assigned degree of urgency with objective standard criteria (12 studies), (2) overtriage and undertriage (9 studies) and (3) sensitivity and specificity of triage systems (7 studies). Construct validity was examined by looking at (4) the associations between the assigned degree of urgency and measures gauged in EDs (48 studies) and (5) the associations between the assigned degree of urgency and measures gauged after hospitalisation (13 studies). Particularly, among 46 validation studies of the most commonly used triages (Canadian Triage and Acuity Scale, Emergency Severity Index and Manchester Triage System), 13 and 39 studies examined criterion and construct validity, respectively. Previous studies applied various reference standards and measures to validate five-level triage systems. They either created their own reference standard or used a combination of severity/resource measures. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All

  3. Assessment of capabilities in persons with advanced stage of dementia: Validation of The Montessori Assessment System (MAS).

    PubMed

    Erkes, Jérôme; Camp, Cameron J; Raffard, Stéphane; Gély-Nargeot, Marie-Christine; Bayard, Sophie

    2017-01-01

    This study evaluated the validity and reliability of the Montessori Assessment System. The Montessori Assessment System assesses preserved abilities in persons with moderate to severe dementia. In this respect, this instrument provides crucial information for the development of effective person-centered care plans. A total of 196 persons with a diagnosis of dementia in the moderate to severe stages of dementia were recruited in 10 long-term care facilities in France. All participants completed the Montessori Assessment System, the Clinical Dementia Rating Scale and/or the Mini Mental State Examination and the Severe Impairment Battery-short form. The internal consistency and temporal stability of the Montessori Assessment System were high. Additionally, good construct and divergent validity were demonstrated. Factor analysis showed a one-factor structure. The Montessori Assessment System demonstrated satisfactory psychometric properties while being a useful instrument to assess capabilities in persons with advanced stages of dementia and hence to develop person-centered plans of care.

  4. Temporal Stability and Convergent Validity of the Behavior Assessment System for Children.

    ERIC Educational Resources Information Center

    Merydith, Scott P.

    2001-01-01

    Assesses the temporal stability and convergent validity of the Behavior Assessment System for Children (BASC). Teachers and parents rated kindergarten and first-grade students using the BASC. Teachers were more stable in rating children's externalizing behaviors and attention problems. Discusses results in terms of the accuracy of information…

  5. RELIABILITY AND VALIDITY OF AN ACCELEROMETRIC SYSTEM FOR ASSESSING VERTICAL JUMPING PERFORMANCE

    PubMed Central

    Laffaye, G.; Taiar, R.

    2014-01-01

    The validity of an accelerometric system (Myotest©) for assessing vertical jump height, vertical force and power, leg stiffness and reactivity index was examined. 20 healthy males performed 3 × "5 hops in place", 3 × "1 squat jump" and 3 × "1 countermovement jump" during 2 test-retest sessions. The variables were simultaneously assessed using an accelerometer and a force platform at a frequency of 0.5 and 1 kHz, respectively. Both reliability and validity of the accelerometric system were studied. No significant differences between test and retest data were found (p < 0.05), showing a high level of reliability. Besides, moderate to high intraclass correlation coefficients (ICCs) (from 0.74 to 0.96) were obtained for all variables whereas weak to moderate ICCs (from 0.29 to 0.79) were obtained for force and power during the countermovement jump. With regards to validity, the difference between the two devices was not significant for 5 hops in place height (1.8 cm), force during squat (-1.4 N · kg−1) and countermovement (0.1 N · kg−1) jumps, leg stiffness (7.8 kN · m−1) and reactivity index (0.4). So, the measurements of these variables with this accelerometer are valid, which is not the case for the other variables. The main causes of non-validity for velocity, power and contact time assessment are temporal biases of the takeoff and touchdown moments detection. PMID:24917690

  6. Reliability and validity of an accelerometric system for assessing vertical jumping performance.

    PubMed

    Choukou, M-A; Laffaye, G; Taiar, R

    2014-03-01

    The validity of an accelerometric system (Myotest©) for assessing vertical jump height, vertical force and power, leg stiffness and reactivity index was examined. 20 healthy males performed 3×"5 hops in place", 3×"1 squat jump" and 3× "1 countermovement jump" during 2 test-retest sessions. The variables were simultaneously assessed using an accelerometer and a force platform at a frequency of 0.5 and 1 kHz, respectively. Both reliability and validity of the accelerometric system were studied. No significant differences between test and retest data were found (p < 0.05), showing a high level of reliability. Besides, moderate to high intraclass correlation coefficients (ICCs) (from 0.74 to 0.96) were obtained for all variables whereas weak to moderate ICCs (from 0.29 to 0.79) were obtained for force and power during the countermovement jump. With regards to validity, the difference between the two devices was not significant for 5 hops in place height (1.8 cm), force during squat (-1.4 N · kg(-1)) and countermovement (0.1 N · kg(-1)) jumps, leg stiffness (7.8 kN · m(-1)) and reactivity index (0.4). So, the measurements of these variables with this accelerometer are valid, which is not the case for the other variables. The main causes of non-validity for velocity, power and contact time assessment are temporal biases of the takeoff and touchdown moments detection.

  7. Assessment of bachelor's theses in a nursing degree with a rubrics system: Development and validation study.

    PubMed

    González-Chordá, Víctor M; Mena-Tudela, Desirée; Salas-Medina, Pablo; Cervera-Gasch, Agueda; Orts-Cortés, Isabel; Maciá-Soler, Loreto

    2016-02-01

    Writing a bachelor thesis (BT) is the last step to obtain a nursing degree. In order to perform an effective assessment of a nursing BT, certain reliable and valid tools are required. The aim was to develop and validate a 3-rubric system (drafting process, dissertation, and viva) to assess final year nursing students' BT. A multi-disciplinary study of content validity and psychometric properties. The study was carried out between December 2014 and July 2015. Nursing Degree at Universitat Jaume I, Spain. Eleven experts (9 nursing professors and 2 education professors from 6 different universities) took part in the development and content validity stages. Fifty-two theses presented during the 2014-2015 academic year were included by consecutive sampling of cases in order to study the psychometric properties. First, a group of experts was created to validate the content of the assessment system based on three rubrics (drafting process, dissertation, and viva). Subsequently, a reliability and validity study of the rubrics was carried out on the 52 theses presented during the 2014-2015 academic year. The BT drafting process rubric has 8 criteria (S-CVI=0.93; α=0.837; ICC=0.614), the dissertation rubric has 7 criteria (S-CVI=0.9; α=0.893; ICC=0.74), and the viva rubric has 4 criteria (S-CVI=0.86; α=0.816; ICC=0.895). A nursing BT assessment system based on three rubrics (drafting process, dissertation, and viva) has been validated. This system may be transferred to other nursing degrees or degrees from other academic areas. It is necessary to continue with the validation process taking into account factors that may affect the results obtained. Copyright © 2015 Elsevier Ltd. All rights reserved.
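
    The α values reported for each rubric are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is computed, assuming a score matrix with one row per thesis and one column per rubric criterion (the `ratings` data and the function name below are hypothetical, not taken from the study):

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per thesis, one score per criterion)."""
    k = len(scores[0])  # number of criteria (items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each criterion across theses.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total score per thesis.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 theses scored on 3 criteria.
ratings = [[3, 4, 3], [2, 2, 3], [4, 4, 4], [1, 2, 2], [3, 3, 4]]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.3f}")
```

    Values closer to 1 indicate that the criteria of a rubric vary together across theses, which is the sense in which the abstract reports internal consistency.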

  8. A Validation of the Classroom Assessment Scoring System in Finnish Kindergartens

    ERIC Educational Resources Information Center

    Pakarinen, Eija; Lerkkanen, Marja-Kristiina; Poikkeus, Anna-Maija; Kiuru, Noona; Siekkinen, Martti; Rasku-Puttonen, Helena; Nurmi, Jari-Erik

    2010-01-01

    Research Findings: This study examined the validity and reliability of the Classroom Assessment Scoring System (CLASS; R. C. Pianta, K. M. La Paro, & B. K. Hamre, 2008) in Finnish kindergartens. A pair of trained observers used the CLASS to observe 49 kindergarten teachers (47 female, 2 male) on two different days. Questionnaires measuring…

  9. Multi-institutional validation of a web-based core competency assessment system.

    PubMed

    Tabuenca, Arnold; Welling, Richard; Sachdeva, Ajit K; Blair, Patrice G; Horvath, Karen; Tarpley, John; Savino, John A; Gray, Richard; Gulley, Julie; Arnold, Teresa; Wolfe, Kevin; Risucci, Donald A

    2007-01-01

    The Association of Program Directors in Surgery and the Division of Education of the American College of Surgeons developed and implemented a web-based system for end-of-rotation faculty assessment of ACGME core competencies of residents. This study assesses its reliability and validity across multiple programs. Each assessment included ratings (1-5 scale) on 23 items reflecting the 6 core competencies. A total of 4241 end-of-rotation assessments were completed for 332 general surgery residents (≥5 evaluations each) at 5 sites during the 2004-2005 and 2005-2006 academic years. The mean rating for each resident on each item was computed for each academic year. The mean rating of items representing each competency was computed for each resident. Additional data included USMLE and ABSITE scores, PGY, and status in program (categorical, designated preliminary, and undesignated preliminary). Coefficient alpha was greater than 0.90 for each competency score. Mean ratings for each competency increased significantly (p < 0.01) as a function of PGY. Mean ratings for professionalism and interpersonal/communication skills (IPC) were significantly higher than all other competencies at all PGY levels. Competency ratings of PGY 1 residents correlated significantly with USMLE Step I, ranging from (r = 0.26, p < 0.01) for Professionalism to (r = 0.41, p < 0.001) for Systems-Based Practice. Ratings of Knowledge (r = 0.31, p < 0.01), Practice-Based Learning & Improvement (PBLI; r = 0.22, p < 0.05), and Systems-Based Practice (r = 0.20, p < 0.05) correlated significantly with 2005 ABSITE Total Percentile. Ratings of all competencies correlated significantly with the 2006 ABSITE Total Percentile Score (range: r = 0.20, p < 0.05 for professionalism to r = 0.35, p < 0.001 for knowledge). Categorical and designated preliminary residents received significantly higher ratings (p < 0.05) than nondesignated preliminaries for knowledge, patient care, PBLI, and systems-based practice only

  10. Validation of a Computerized Cognitive Assessment System for Persons with Stroke: A Pilot Study

    ERIC Educational Resources Information Center

    Yip, Chi Kwong; Man, David W. K.

    2009-01-01

    This study investigates the validity of a newly developed computerized cognitive assessment system (CCAS) that is equipped with rich multimedia to generate simulated testing situations and considers both test item difficulty and the test taker's ability. It is also hypothesized that better predictive validity of the CCAS in self-care of persons…

  11. Development and validation of an automated delirium risk assessment system (Auto-DelRAS) implemented in the electronic health record system.

    PubMed

    Moon, Kyoung-Ja; Jin, Yinji; Jin, Taixian; Lee, Sun-Mi

    2018-01-01

    A key component of the delirium management is prevention and early detection. To develop an automated delirium risk assessment system (Auto-DelRAS) that automatically alerts health care providers of an intensive care unit (ICU) patient's delirium risk based only on data collected in an electronic health record (EHR) system, and to evaluate the clinical validity of this system. Cohort and system development designs were used. Medical and surgical ICUs in two university hospitals in Seoul, Korea. A total of 3284 patients for the development of Auto-DelRAS, 325 for external validation, 694 for validation after clinical applications. The 4211 data items were extracted from the EHR system and delirium was measured using CAM-ICU (Confusion Assessment Method for Intensive Care Unit). The potential predictors were selected and a logistic regression model was established to create a delirium risk scoring algorithm to construct the Auto-DelRAS. The Auto-DelRAS was evaluated at three months and one year after its application to clinical practice to establish the predictive validity of the system. Eleven predictors were finally included in the logistic regression model. The results of the Auto-DelRAS risk assessment were shown as high/moderate/low risk on a Kardex screen. The predictive validity, analyzed after the clinical application of Auto-DelRAS after one year, showed a sensitivity of 0.88, specificity of 0.72, positive predictive value of 0.53, negative predictive value of 0.94, and a Youden index of 0.59. A relatively high level of predictive validity was maintained with the Auto-DelRAS system, even one year after it was applied to clinical practice. Copyright © 2017. Published by Elsevier Ltd.
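
    The predictive-validity figures above all derive from a 2×2 table of predicted risk against observed delirium. A minimal sketch of those definitions follows; the counts are hypothetical and chosen only so that sensitivity and specificity match the rounded values in the abstract (they are not the study's data, so PPV and NPV will differ from the reported ones):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)  # share of true cases flagged
    specificity = tn / (tn + fp)  # share of non-cases not flagged
    ppv = tp / (tp + fp)          # positive predictive value
    npv = tn / (tn + fn)          # negative predictive value
    youden = sensitivity + specificity - 1  # Youden index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "youden": youden,
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=88, fp=28, fn=12, tn=72)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

    With the rounded sensitivity and specificity from the abstract, J = 0.88 + 0.72 − 1 = 0.60, close to the reported 0.59; the small gap is presumably due to rounding of the published inputs.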

  12. British isles lupus assessment group 2004 index is valid for assessment of disease activity in systemic lupus erythematosus

    PubMed Central

    Yee, Chee-Seng; Farewell, Vernon; Isenberg, David A; Rahman, Anisur; Teh, Lee-Suan; Griffiths, Bridget; Bruce, Ian N; Ahmad, Yasmeen; Prabu, Athiveeraramapandian; Akil, Mohammed; McHugh, Neil; D'Cruz, David; Khamashta, Munther A; Maddison, Peter; Gordon, Caroline

    2007-01-01

    Objective To determine the construct and criterion validity of the British Isles Lupus Assessment Group 2004 (BILAG-2004) index for assessing disease activity in systemic lupus erythematosus (SLE). Methods Patients with SLE were recruited into a multicenter cross-sectional study. Data on SLE disease activity (scores on the BILAG-2004 index, Classic BILAG index, and Systemic Lupus Erythematosus Disease Activity Index 2000 [SLEDAI-2K]), investigations, and therapy were collected. Overall BILAG-2004 and overall Classic BILAG scores were determined by the highest score achieved in any of the individual systems in the respective index. Erythrocyte sedimentation rates (ESRs), C3 levels, C4 levels, anti–double-stranded DNA (anti-dsDNA) levels, and SLEDAI-2K scores were used in the analysis of construct validity, and increase in therapy was used as the criterion for active disease in the analysis of criterion validity. Statistical analyses were performed using ordinal logistic regression for construct validity and logistic regression for criterion validity. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Results Of the 369 patients with SLE, 92.7% were women, 59.9% were white, 18.4% were Afro-Caribbean and 18.4% were South Asian. Their mean ± SD age was 41.6 ± 13.2 years and mean disease duration was 8.8 ± 7.7 years. More than 1 assessment was obtained on 88.6% of the patients, and a total of 1,510 assessments were obtained. Increasing overall scores on the BILAG-2004 index were associated with increasing ESRs, decreasing C3 levels, decreasing C4 levels, elevated anti-dsDNA levels, and increasing SLEDAI-2K scores (all P < 0.01). Increase in therapy was observed more frequently in patients with overall BILAG-2004 scores reflecting higher disease activity. Scores indicating active disease (overall BILAG-2004 scores of A and B) were significantly associated with increase in therapy (odds ratio [OR] 19.3, P

  13. Validating the Use of Performance Risk Indices for System-Level Risk and Maturity Assessments

    NASA Astrophysics Data System (ADS)

    Holloman, Sherrica S.

    With pressure on the U.S. Defense Acquisition System (DAS) to reduce cost overruns and schedule delays, system engineers' performance is only as good as their tools. Recent literature details a need for 1) objective, analytical risk quantification methodologies over traditional subjective qualitative methods, such as expert judgment, and 2) mathematically rigorous system-level maturity assessments. The Mahafza, Componation, and Tippett (2005) Technology Performance Risk Index (TPRI) ties the assessment of technical performance to the quantification of risk of unmet performance; however, it is structured for component-level data as input. This study's aim is to establish a modified TPRI with systems-level data as model input, and then validate the modified index with actual system-level data from the Department of Defense's (DoD) Major Defense Acquisition Programs (MDAPs). This work's contribution is the establishment and validation of the System-level Performance Risk Index (SPRI). With the introduction of the SPRI, system-level metrics are better aligned, allowing for better assessment, tradeoff and balance of time, performance and cost constraints. This will allow system engineers and program managers to ultimately make better-informed system-level technical decisions throughout the development phase.

  14. A Reliability and Validity of an Instrument to Evaluate the School-Based Assessment System: A Pilot Study

    ERIC Educational Resources Information Center

    Ghazali, Nor Hasnida Md

    2016-01-01

    A valid, reliable and practical instrument is needed to evaluate the implementation of the school-based assessment (SBA) system. The aim of this study is to develop and assess the validity and reliability of an instrument to measure the perception of teachers towards the SBA implementation in schools. The instrument is developed based on a…

  15. Assessing Attachment in Psychotherapy: Validation of the Patient Attachment Coding System (PACS).

    PubMed

    Talia, Alessandro; Miller-Bottome, Madeleine; Daniel, Sarah I F

    2017-01-01

    The authors present and validate the Patient Attachment Coding System (PACS), a transcript-based instrument that assesses clients' in-session attachment based on any session of psychotherapy, in multiple treatment modalities. One-hundred and sixty clients in different types of psychotherapy (cognitive-behavioural, cognitive-behavioural-enhanced, psychodynamic, relational, supportive) and from three different countries were administered the Adult Attachment Interview (AAI) prior to treatment, and one session for each client was rated with the PACS by independent coders. Results indicate strong inter-rater reliability, and high convergent validity of the PACS scales and classifications with the AAI. These results present the PACS as a practical alternative to the AAI in psychotherapy research and suggest that clinicians using the PACS can assess clients' attachment status on an ongoing basis by monitoring clients' verbal activity. These results also provide information regarding the ways in which differences in attachment status play out in therapy sessions and further the study of attachment in psychotherapy from a pre-treatment client factor to a process variable. Copyright © 2015 John Wiley & Sons, Ltd. The Patient Attachment Coding System is a valid measure of attachment that can classify clients' attachment based on any single psychotherapy transcript, in many therapeutic modalities. Client differences in attachment manifest in part independently of the therapist's contributions. Client adult attachment patterns are likely to affect psychotherapeutic processes.

  16. Validity and reliability of the Myotest accelerometric system for the assessment of vertical jump height.

    PubMed

    Casartelli, Nicola; Müller, Roland; Maffiuletti, Nicola A

    2010-11-01

    The aim of the present study was to verify the validity and reliability of the Myotest accelerometric system (Myotest SA, Sion, Switzerland) for the assessment of vertical jump height. Forty-four male basketball players (age range: 9-25 years) performed series of squat, countermovement and repeated jumps during 2 identical test sessions separated by 2-15 days. Flight height was simultaneously quantified with the Myotest system and validated photoelectric cells (Optojump). Two calculation methods were used to estimate the jump height from Myotest recordings: flight time (Myotest-T) and vertical takeoff velocity (Myotest-V). Concurrent validity was investigated comparing Myotest-T and Myotest-V to the criterion method (Optojump), and test-retest reliability was also examined. As regards validity, Myotest-T overestimated jumping height compared to Optojump (p < 0.001) with a systematic bias of approximately 7 cm, even though random errors were low (2.7 cm) and intraclass correlation coefficients (ICCs) were high (>0.98), that is, excellent validity. Myotest-V overestimated jumping height compared to Optojump (p < 0.001), with high random errors (>12 cm), high limits of agreement ratios (>36%), and low ICCs (<0.75), that is, poor validity. As regards reliability, Myotest-T showed high ICCs (range: 0.92-0.96), whereas Myotest-V showed low ICCs (range: 0.56-0.89), and high random errors (>9 cm). In conclusion, Myotest-T is a valid and reliable method for the assessment of vertical jump height, and its use is legitimate for field-based evaluations, whereas Myotest-V is neither valid nor reliable.
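
    The abstract names two ways of estimating jump height, from flight time and from vertical takeoff velocity, without giving the formulas. A minimal sketch, assuming the standard projectile-motion relations h = g·t²/8 and h = v²/(2g) (the device's actual processing is not described in the abstract, and the input values below are hypothetical):

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed standard value)


def height_from_flight_time(flight_time_s):
    """Jump height in metres from total flight time: h = g * t^2 / 8."""
    return G * flight_time_s ** 2 / 8


def height_from_takeoff_velocity(velocity_m_s):
    """Jump height in metres from vertical takeoff velocity: h = v^2 / (2 * g)."""
    return velocity_m_s ** 2 / (2 * G)


# Hypothetical jump with 0.5 s of flight. For an ideal jump the takeoff
# velocity is g * t / 2, so both estimates coincide.
flight_time = 0.5
takeoff_velocity = G * flight_time / 2
print(f"from flight time:      {height_from_flight_time(flight_time):.3f} m")
print(f"from takeoff velocity: {height_from_takeoff_velocity(takeoff_velocity):.3f} m")
```

    Under these idealized relations the two estimates agree exactly, so a divergence between them in practice points to measurement error in the flight time or the takeoff velocity rather than to the formulas.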

  17. Assessment of validity with polytrauma Veteran populations.

    PubMed

    Bush, Shane S; Bass, Carmela

    2015-01-01

    Veterans with polytrauma have suffered injuries to multiple body parts and organ systems, including the brain. The injuries can generate a triad of physical, neurologic/cognitive, and emotional symptoms. Accurate diagnosis is essential for the treatment of these conditions and for fair allocation of benefits. To accurately diagnose polytrauma disorders and their related problems, clinicians take into account the validity of reported history and symptoms, as well as clinical presentations. The purpose of this article is to describe the assessment of validity with polytrauma Veteran populations. Review of scholarly and other relevant literature and clinical experience are utilized. A multimethod approach to validity assessment that includes objective, standardized measures increases the confidence that can be placed in the accuracy of self-reported symptoms and physical, cognitive, and emotional test results. Due to the multivariate nature of polytrauma and the multiple disciplines that play a role in diagnosis and treatment, an ideal model of validity assessment with polytrauma Veteran populations utilizes neurocognitive, neurological, neuropsychiatric, and behavioral measures of validity. An overview of these validity assessment approaches as applied to polytrauma Veteran populations is presented. Veterans, the VA, and society are best served when accurate diagnoses are made.

  18. Geographic Information Systems to Assess External Validity in Randomized Trials.

    PubMed

    Savoca, Margaret R; Ludwig, David A; Jones, Stedman T; Clodfelter, K Jason; Sloop, Joseph B; Bollhalter, Linda Y; Bertoni, Alain G

    2017-08-01

    To support claims that RCTs can reduce health disparities (i.e., are translational), it is imperative that methodologies exist to evaluate the tenability of external validity in RCTs when probabilistic sampling of participants is not employed. Typically, attempts at establishing post hoc external validity are limited to a few comparisons across convenience variables, which must be available in both sample and population. A Type 2 diabetes RCT was used as an example of a method that uses a geographic information system to assess external validity in the absence of a priori probabilistic community-wide diabetes risk sampling strategy. A geographic information system, 2009-2013 county death certificate records, and 2013-2014 electronic medical records were used to identify community-wide diabetes prevalence. Color-coded diabetes density maps provided visual representation of these densities. Chi-square goodness of fit statistic/analysis tested the degree to which distribution of RCT participants varied across density classes compared to what would be expected, given simple random sampling of the county population. Analyses were conducted in 2016. Diabetes prevalence areas as represented by death certificate and electronic medical records were distributed similarly. The simple random sample model was not a good fit for death certificate record (chi-square, 17.63; p=0.0001) and electronic medical record data (chi-square, 28.92; p<0.0001). Generally, RCT participants were oversampled in high-diabetes density areas. Location is a highly reliable "principal variable" associated with health disparities. It serves as a directly measurable proxy for high-risk underserved communities, thus offering an effective and practical approach for examining external validity of RCTs. Copyright © 2017 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
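
    The goodness-of-fit test described above compares the number of trial participants observed in each diabetes-density class with the number expected if participants had been drawn at random from the county population. A minimal sketch of the statistic, using hypothetical counts and population shares (the abstract does not report the underlying counts or the number of density classes):

```python
def chi_square_gof(observed, expected_props):
    """Pearson chi-square goodness-of-fit statistic and its degrees of freedom."""
    if len(observed) != len(expected_props):
        raise ValueError("observed and expected_props must have the same length")
    n = sum(observed)
    # Expected count per class under simple random sampling of the population.
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Hypothetical example: participants per density class (low, medium, high)
# versus the share of the county population living in each class.
observed_counts = [10, 30, 60]
population_shares = [0.3, 0.3, 0.4]
stat, df = chi_square_gof(observed_counts, population_shares)
print(f"chi-square = {stat:.2f}, df = {df}")
```

    A large statistic relative to the chi-square distribution with the given degrees of freedom indicates that the sample is not distributed across density classes the way a simple random sample would be, which is how the abstract reaches its oversampling conclusion.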

  19. Development and validation of a composite scoring system for robot-assisted surgical training--the Robotic Skills Assessment Score.

    PubMed

    Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A

    2013-12-01

    A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in operative field, critical error, economy, bimanual dexterity, and time. Subsequently, the RSA-Score was further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.

  20. Validity and reliability of a pilot scale for assessment of multiple system atrophy symptoms.

    PubMed

    Matsushima, Masaaki; Yabe, Ichiro; Takahashi, Ikuko; Hirotani, Makoto; Kano, Takahiro; Horiuchi, Kazuhiro; Houzen, Hideki; Sasaki, Hidenao

    2017-01-01

    Multiple system atrophy (MSA) is a rare progressive neurodegenerative disorder for which a brief yet sensitive scale is required for use in clinical trials and general screening. We previously compared several scales for the assessment of MSA symptoms and devised an eight-item pilot scale with large standardized response mean [handwriting, finger taps, transfers, standing with feet together, turning trunk, turning 360°, gait, body sway]. The aim of the present study is to investigate the validity and reliability of a simple pilot scale for assessment of multiple system atrophy symptoms. Thirty-two patients with MSA (15 male/17 female; 20 cerebellar subtype [MSA-C]/12 parkinsonian subtype [MSA-P]) were prospectively registered between January 1, 2014 and February 28, 2015. Patients were evaluated by two independent raters using the Unified MSA Rating Scale (UMSARS), Scale for Assessment and Rating of Ataxia (SARA), and the pilot scale. Correlations between UMSARS, SARA, pilot scale scores, intraclass correlation coefficients (ICCs), and Cronbach's alpha coefficients were calculated. Pilot scale scores significantly correlated with scores for UMSARS Parts I, II, and IV as well as with SARA scores. Intra-rater and inter-rater ICCs and Cronbach's alpha coefficients remained high (> 0.94) for all measures. The results of the present study indicate the validity and reliability of the eight-item pilot scale, particularly for the assessment of symptoms in patients with early-stage multiple system atrophy.

  1. Validity of retrospective disease activity assessment in systemic lupus erythematosus.

    PubMed

    Arce-Salinas, A; Cardiel, M H; Guzmán, J; Alcocer-Varela, J

    1996-05-01

    To evaluate the validity of retrospective disease activity assessment derived from clinical charts. We prospectively evaluated 37 patients with systemic lupus erythematosus (SLE) in 90 visits using the SLE Disease Activity Index (SLEDAI), the Mexican SLEDAI (Mex-SLEDAI), and the Lupus Activity Criteria Count (LACC) indices. Routine clinical observations were written by rheumatologists blind to index scores. These notes were reviewed 2 years later to obtain retrospective index scores and their validity was assessed using prospective scores as the standard. Statistical analysis was by Spearman's rank correlation coefficient (rs), Wilcoxon matched pairs test, kappa statistic, and intraclass correlation coefficient (ri). We calculated the sensitivity and specificity of retrospective indices to detect active disease. Median retrospective scores were lower in all indices: SLEDAI (4 vs 2, p = 0.004, rs = 0.68, ri = 0.30); Mex-SLEDAI (2 vs 1, p < 0.0003, rs = 0.79, ri = 0.31); and LACC (1 vs 1, p = 0.007, rs = 0.65, ri = 0.21). Used to detect active SLE, the retrospective SLEDAI had a sensitivity of 0.68 and a specificity of 0.86; corresponding values for the Mex-SLEDAI were 0.72 and 0.91, and for the LACC, 0.77 and 0.76. Retrospective disease activity indices tended to provide lower scores than prospective evaluations. They often missed patients with mildly active disease, but when positive they were good predictors of disease activity.

  2. Validation, Edits, and Application Processing System Report: Phase I.

    ERIC Educational Resources Information Center

    Gray, Susan; And Others

    Findings of phase 1 of a study of the 1979-1980 Basic Educational Opportunity Grants validation, edits, and application processing system are presented. The study was designed to: assess the impact of the validation effort and processing system edits on the correct award of Basic Grants; and assess the characteristics of students most likely to…

  3. Validity Issues in Clinical Assessment.

    ERIC Educational Resources Information Center

    Foster, Sharon L.; Cone, John D.

    1995-01-01

    Validation issues that arise with measures of constructs and behavior are addressed with reference to general reasons for using assessment procedures in clinical psychology. A distinction is made between the representational phase of validity assessment and the elaborative validity phase in which the meaning and utility of scores are examined.…

  4. The Individualized Classroom Assessment Scoring System (inCLASS): Preliminary Reliability and Validity of a System for Observing Preschoolers’ Competence in Classroom Interactions

    PubMed Central

    Downer, Jason T.; Booren, Leslie M.; Lima, Olivia K.; Luckner, Amy E.; Pianta, Robert C.

    2012-01-01

    This paper introduces the Individualized Classroom Assessment Scoring System (inCLASS), an observation tool that targets children’s interactions in preschool classrooms with teachers, peers, and tasks. In particular, initial evidence is reported of the extent to which the inCLASS meets the following psychometric criteria: inter-rater reliability, normal distributions and adequate range, construct validity, and criterion-related validity. These initial findings suggest that the inCLASS has the potential to provide an authentic, contextualized assessment of young children’s classroom behaviors. Future directions for research with the inCLASS are discussed. PMID:23175598

  5. Assessing preschoolers interactive behaviour: A validation study of the "Coding System for Mother-Child Interaction".

    PubMed

    Baiao, R; Baptista, J; Carneiro, A; Pinto, R; Toscano, C; Fearon, P; Soares, I; Mesquita, A R

    2018-07-01

    The preschool years are a period of great developmental achievements, which impact critically on a child's interactive skills. Having valid and reliable measures to assess interactive behaviour at this stage is therefore crucial. The aim of this study was to describe the adaptation and validation of the child coding of the Coding System for Mother-Child Interactions and discuss its applications and implications in future research and practice. Two hundred twenty Portuguese preschoolers and their mothers were videotaped during a structured task. Child and mother interactive behaviours were coded based on the task. Maternal reports on the child's temperament and emotional and behaviour problems were also collected, along with family psychosocial information. Interrater agreement was confirmed. The use of child Cooperation, Enthusiasm, and Negativity as subscales was supported by their correlations across tasks. Moreover, these subscales were correlated with each other, which supports the use of a global child interactive behaviour score. Convergent validity with a measure of emotional and behavioural problems (Child Behaviour Checklist 1 ½-5) was established, as well as divergent validity with a measure of temperament (Children's Behaviour Questionnaire-Short Form). Regarding associations with family variables, child interactive behaviour was only associated with maternal behaviour. Findings suggest that this coding system is a valid and reliable measure for assessing child interactive behaviour in preschool age children. It therefore represents an important alternative to this area of research and practice, with reduced costs and with more flexible training requirements. Attention should be given in future research to expanding this work to clinical populations and different age groups. © 2018 John Wiley & Sons Ltd.

  6. Validity and validation of expert (Q)SAR systems.

    PubMed

    Hulzebos, E; Sijm, D; Traas, T; Posthumus, R; Maslankiewicz, L

    2005-08-01

    At a recent workshop in Setubal (Portugal) principles were drafted to assess the suitability of (quantitative) structure-activity relationships ((Q)SARs) for assessing the hazards and risks of chemicals. In the present study we applied some of the Setubal principles to test the validity of three (Q)SAR expert systems and validate the results. These principles include a mechanistic basis, the availability of a training set and validation. ECOSAR, BIOWIN and DEREK for Windows have a mechanistic or empirical basis. ECOSAR has a training set for each QSAR. For half of the structural fragments the number of chemicals in the training set is >4. Based on structural fragments and log Kow, ECOSAR uses linear regression to predict ecotoxicity. Validating ECOSAR for three 'valid' classes results in predictivity of ≥ 64%. BIOWIN uses (non-)linear regressions to predict the probability of biodegradability based on fragments and molecular weight. It has a large training set and predicts non-ready biodegradability well. DEREK for Windows predictions are supported by a mechanistic rationale and literature references. The structural alerts in this program have been developed with a training set of positive and negative toxicity data. However, to support the prediction only a limited number of chemicals in the training set is presented to the user. DEREK for Windows predicts effects by 'if-then' reasoning. The program predicts best for mutagenicity and carcinogenicity. Each structural fragment in ECOSAR and DEREK for Windows needs to be evaluated and validated separately.

  7. Development and validation of a patient-reported questionnaire assessing systemic therapy induced diarrhea in oncology patients.

    PubMed

    Lui, Michelle; Gallo-Hershberg, Daniela; DeAngelis, Carlo

    2017-12-22

    Systemic therapy-induced diarrhea (STID) is a common side effect experienced by more than half of cancer patients. Despite STID-associated complications and poorer quality of life (QoL), no validated assessment tools exist to accurately assess STID occurrence and severity to guide clinical management. Therefore, we developed and validated a patient-reported questionnaire (STIDAT). The STIDAT was developed using the FDA iterative process for patient-reported outcomes. A literature search uncovered potential items and questions for questionnaire construction used by oncology clinicians to develop questions for the preliminary instrument. The instrument was evaluated on its face validity and content validity by patient interviews. Repetitive, similar and different themes uncovered from patient interviews were implemented to revise the instrument to the version used for validation. Patients starting high-risk STID treatments were monitored using the STIDAT, bowel diaries and EORTC QLQ-C30. The STIDAT was evaluated for construct validity using exploratory factor analysis (EFA) using minimal residual method with Promax rotation, reliability and consistency. A weighted scoring system was developed and a receiver-operating characteristic (ROC) curve evaluated the tool's ability to detect STID occurrence. Median scores and variability were analysed to determine how well it differentiates between diarrhea severities. A post-hoc analysis determined how diarrhea severity impacted QoL of cancer patients. Patients defined diarrhea based on presence of watery stool. The STIDAT assessed patient's perception of having diarrhea, daily number of bowel movements, daily number of diarrhea episodes, antidiarrheal medication use, the presence of urgency, abdominal pain, abdominal spasms or fecal incontinence, patient's perception of diarrhea severity, and QoL. These dimensions were sorted into four clusters using EFA - patient's perception of diarrhea, frequency of diarrhea, fecal

  8. Translation, Cross-Cultural Adaptation, and Validation of the Malay Version of the System Usability Scale Questionnaire for the Assessment of Mobile Apps.

    PubMed

    Mohamad Marzuki, Muhamad Fadhil; Yaacob, Nor Azwany; Yaacob, Najib Majdi

    2018-05-14

    A mobile app is a programmed system designed to be used by a target user on a mobile device. The usability of such a system refers not only to the extent to which the product can be used to achieve the task that it was designed for, but also to its effectiveness and efficiency, as well as user satisfaction. The System Usability Scale is one of the most commonly used questionnaires for assessing the usability of a system. The original 10-item version of the System Usability Scale was developed in English and thus needs to be adapted into local languages to assess the usability of mobile apps developed in other languages. The aim of this study is to translate and validate (with cross-cultural adaptation) the English System Usability Scale questionnaire into Malay, the main language spoken in Malaysia. The development of a translated version will allow the usability of mobile apps to be assessed in Malay. Forward and backward translation of the questionnaire was conducted by groups of Malay native speakers who spoke English as their second language. The final version was obtained after reconciliation and cross-cultural adaptation. The content of the Malay System Usability Scale questionnaire for mobile apps was validated by 10 experts in mobile app development. The efficacy of the questionnaire was further probed by testing the face validity on 10 mobile phone users, followed by reliability testing involving 54 mobile phone users. The content validity index was determined to be 0.91, indicating good relevancy of the 10 items used to assess the usability of a mobile app. Calculation of the face validity index resulted in a value of 0.94, indicating that the questionnaire was easily understood by the users. Reliability testing showed a Cronbach alpha value of 0.85 (95% CI 0.79-0.91), indicating that the translated System Usability Scale questionnaire is a reliable tool for the assessment of usability of a mobile app. The Malay System Usability Scale questionnaire is a

  9. Validation of learning assessments: A primer.

    PubMed

    Peeters, Michael J; Martin, Beth A

    2017-09-01

    The Accreditation Council for Pharmacy Education's Standards 2016 has placed greater emphasis on validating educational assessments. In this paper, we describe validity, reliability, and validation principles, drawing attention to the conceptual change that highlights one validity with multiple evidence sources; to this end, we recommend abandoning historical (confusing) terminology associated with the term validity. Further, we describe and apply Kane's framework (scoring, generalization, extrapolation, and implications) for the process of validation, with its inferences and conclusions from varied uses of assessment instruments by different colleges and schools of pharmacy. We then offer five practical recommendations that can improve reporting of validation evidence in pharmacy education literature. We describe application of these recommendations, including examples of validation evidence in the context of pharmacy education. After reading this article, the reader should be able to understand the current concept of validation, and use a framework as they validate and communicate their own institution's learning assessments. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Assessing the validity and reliability of three indicators self-reported on the pregnancy risk assessment monitoring system survey.

    PubMed

    Ahluwalia, Indu B; Helms, Kristen; Morrow, Brian

    2013-01-01

    We investigated the reliability and validity of three self-reported indicators from the Pregnancy Risk Assessment Monitoring System (PRAMS) survey. We used 2008 PRAMS (n=15,646) data from 12 states that had implemented the 2003 revised U.S. Certificate of Live Birth. We estimated reliability by kappa coefficient and validity by sensitivity and specificity using the birth certificate data as the reference for the following: prenatal participation in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC); Medicaid payment for delivery; and breastfeeding initiation. These indicators were examined across several demographic subgroups. The reliability was high for all three measures: 0.81 for WIC participation, 0.67 for Medicaid payment of delivery, and 0.72 for breastfeeding initiation. The validity of PRAMS indicators was also high: WIC participation (sensitivity = 90.8%, specificity = 90.6%), Medicaid payment for delivery (sensitivity = 82.4%, specificity = 85.6%), and breastfeeding initiation (sensitivity = 94.3%, specificity = 76.0%). The prevalence estimates were higher on PRAMS than the birth certificate for each of the indicators except Medicaid-paid delivery among non-Hispanic black women. Kappa values within most subgroups remained in the moderate range (0.40-0.80). Sensitivity and specificity values were lower for Hispanic women who responded to the PRAMS survey in Spanish and for breastfeeding initiation among women who delivered very low birthweight and very preterm infants. The validity and reliability of the PRAMS data for measures assessed were high. Our findings support the use of PRAMS data for epidemiological surveillance, research, and planning.
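
    The agreement and accuracy statistics named in this record can be sketched from a single 2x2 table that cross-classifies a survey response against a reference source. The function below is a minimal illustration using the standard definitions of sensitivity, specificity, and Cohen's kappa. The function name and the cell counts are hypothetical and are not PRAMS data.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, kappa) for a 2x2 table.

    tp: survey yes / reference yes    fp: survey yes / reference no
    fn: survey no  / reference yes    tn: survey no  / reference no
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Observed agreement: proportion of cases where the two sources agree.
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return sensitivity, specificity, kappa


# Hypothetical counts (illustration only).
sens, spec, kappa = two_by_two_metrics(tp=40, fp=10, fn=10, tn=40)
```

    Kappa corrects raw agreement for the agreement expected by chance, so it can fall well below sensitivity and specificity when one category dominates.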

  11. Assessing Procedural Competence: Validity Considerations.

    PubMed

    Pugh, Debra M; Wood, Timothy J; Boulet, John R

    2015-10-01

    Simulation-based medical education (SBME) offers opportunities for trainees to learn how to perform procedures and to be assessed in a safe environment. However, SBME research studies often lack robust evidence to support the validity of the interpretation of the results obtained from tools used to assess trainees' skills. The purpose of this paper is to describe how a validity framework can be applied when reporting and interpreting the results of a simulation-based assessment of skills related to performing procedures. The authors discuss various sources of validity evidence as they relate to SBME. A case study is presented.

  12. IV&V Project Assessment Process Validation

    NASA Technical Reports Server (NTRS)

    Driskell, Stephen

    2012-01-01

    The Space Launch System (SLS) will launch NASA's Multi-Purpose Crew Vehicle (MPCV). This launch vehicle will provide American launch capability for human exploration and travelling beyond Earth orbit. SLS is designed to be flexible for crew or cargo missions. The first test flight is scheduled for December 2017. The SLS SRR/SDR provided insight into the project development life cycle. NASA IV&V ran the standard Risk Based Assessment and Portfolio Based Risk Assessment to identify analysis tasking for the SLS program. This presentation examines the SLS System Requirements Review/System Definition Review (SRR/SDR), IV&V findings for IV&V process validation correlation to/from the selected IV&V tasking and capabilities. It also provides a reusable IEEE 1012 scorecard for programmatic completeness across the software development life cycle.

  13. Design for validation: An approach to systems validation

    NASA Technical Reports Server (NTRS)

    Carter, William C.; Dunham, Janet R.; Laprie, Jean-Claude; Williams, Thomas; Howden, William; Smith, Brian; Lewis, Carl M. (Editor)

    1989-01-01

    Every complex system built is validated in some manner. Computer validation begins with review of the system design. As systems became too complicated for one person to review, validation began to rely on the application of ad hoc methods by many individuals. As the cost of the changes mounted and the expense of failure increased, more organized procedures became essential. Attempts at devising and carrying out those procedures showed that validation is indeed a difficult technical problem. The successful transformation of the validation process into a systematic series of formally sound, integrated steps is necessary if the liability inherent in future digital-system-based avionic and space systems is to be minimized. A suggested framework and timetable for the transformation are presented. Basic working definitions of two pivotal ideas (validation and system life-cycle) are provided, showing how the two concepts interact. Many examples are given of past and present validation activities by NASA and others. A conceptual framework is presented for the validation process. Finally, important areas are listed for ongoing development of the validation process at NASA Langley Research Center.

  14. Validity evidence for an OSCE to assess competency in systems-based practice and practice-based learning and improvement: a preliminary investigation.

    PubMed

    Varkey, Prathibha; Natt, Neena; Lesnick, Timothy; Downing, Steven; Yudkowsky, Rachel

    2008-08-01

    To determine the psychometric properties and validity of an OSCE to assess the competencies of Practice-Based Learning and Improvement (PBLI) and Systems-Based Practice (SBP) in graduate medical education. An eight-station OSCE was piloted at the end of a three-week Quality Improvement elective for nine preventive medicine and endocrinology fellows at Mayo Clinic. The stations assessed performance in quality measurement, root cause analysis, evidence-based medicine, insurance systems, team collaboration, prescription errors, Nolan's model, and negotiation. Fellows' performance in each of the stations was assessed by three faculty experts using checklists and a five-point global competency scale. A modified Angoff procedure was used to set standards. Evidence for the OSCE's validity, feasibility, and acceptability was gathered. Evidence for content and response process validity was judged as excellent by institutional content experts. Interrater reliability of scores ranged from 0.85 to 1 for most stations. Interstation correlation coefficients ranged from -0.62 to 0.99, reflecting case specificity. Implementation cost was approximately $255 per fellow. All faculty members agreed that the OSCE was realistic and capable of providing accurate assessments. The OSCE provides an opportunity to systematically sample the different subdomains of Quality Improvement. Furthermore, the OSCE provides an opportunity for the demonstration of skills rather than the testing of knowledge alone, thus making it a potentially powerful assessment tool for SBP and PBLI. The study OSCE was well suited to assess SBP and PBLI. The evidence gathered through this study lays the foundation for future validation work.

  15. Quadruplex digital flight control system assessment

    NASA Technical Reports Server (NTRS)

    Mulcare, D. B.; Downing, L. E.; Smith, M. K.

    1988-01-01

    Described are the development and validation of a double fail-operational digital flight control system architecture for critical pitch axis functions. Architectural tradeoffs are assessed, system simulator modifications are described, and demonstration testing results are critiqued. Assessment tools and their application are also illustrated. Ultimately, the vital role of system simulation, tailored to digital mechanization attributes, is shown to be essential to validating the airworthiness of full-time critical functions such as augmented fly-by-wire systems for relaxed static stability airplanes.

  16. When Assessment Data Are Words: Validity Evidence for Qualitative Educational Assessments.

    PubMed

    Cook, David A; Kuper, Ayelet; Hatala, Rose; Ginsburg, Shiphra

    2016-10-01

    Quantitative scores fail to capture all important features of learner performance. This awareness has led to increased use of qualitative data when assessing health professionals. Yet the use of qualitative assessments is hampered by incomplete understanding of their role in forming judgments, and lack of consensus in how to appraise the rigor of judgments therein derived. The authors articulate the role of qualitative assessment as part of a comprehensive program of assessment, and translate the concept of validity to apply to judgments arising from qualitative assessments. They first identify standards for rigor in qualitative research, and then use two contemporary assessment validity frameworks to reorganize these standards for application to qualitative assessment. Standards for rigor in qualitative research include responsiveness, reflexivity, purposive sampling, thick description, triangulation, transparency, and transferability. These standards can be reframed using Messick's five sources of validity evidence (content, response process, internal structure, relationships with other variables, and consequences) and Kane's four inferences in validation (scoring, generalization, extrapolation, and implications). Evidence can be collected and evaluated for each evidence source or inference. The authors illustrate this approach using published research on learning portfolios. The authors advocate a "methods-neutral" approach to assessment, in which a clearly stated purpose determines the nature of and approach to data collection and analysis. Increased use of qualitative assessments will necessitate more rigorous judgments of the defensibility (validity) of inferences and decisions. Evidence should be strategically sought to inform a coherent validity argument.

  17. Reliability and Validity of the Arthroscopic International Cartilage Repair Society Classification System: Correlation With Histological Assessment of Depth.

    PubMed

    Dwyer, Tim; Martin, C Ryan; Kendra, Rita; Sermer, Corey; Chahal, Jaskarndip; Ogilvie-Harris, Darrell; Whelan, Daniel; Murnaghan, Lucas; Nauth, Aaron; Theodoropoulos, John

    2017-06-01

    To determine the interobserver reliability of the International Cartilage Repair Society (ICRS) grading system of chondral lesions in cadavers, to determine the intraobserver reliability of the ICRS grading system comparing arthroscopy and video assessment, and to compare the arthroscopic ICRS grading system with histological grading of lesion depth. Eighteen lesions in 5 cadaveric knee specimens were arthroscopically graded by 7 fellowship-trained arthroscopic surgeons using the ICRS classification system. The arthroscopic video of each lesion was sent to the surgeons 6 weeks later for repeat grading and determination of intraobserver reliability. Lesions were biopsied, and the depth of the cartilage lesion was assessed. Reliability was calculated using intraclass correlations. The interobserver reliability was 0.67 (95% confidence interval, 0.5-0.89) for the arthroscopic grading, and the intraobserver reliability with the video grading was 0.8 (95% confidence interval, 0.67-0.9). A high correlation was seen between the arthroscopic grading of depth and the histological grading of depth (0.91); on average, surgeons graded lesions using arthroscopy a mean of 0.37 (range, 0-0.86) deeper than the histological grade. The arthroscopic ICRS classification system has good interobserver and intraobserver reliability. A high correlation with histological assessment of depth provides evidence of validity for this classification system. As cartilage lesions are treated on the basis of the arthroscopic ICRS classification, it is important to ascertain the reliability and validity of this method. Copyright © 2016 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.

  18. Institutional Effectiveness: A Model for Planning, Assessment & Validation.

    ERIC Educational Resources Information Center

    Truckee Meadows Community Coll., Sparks, NV.

    The report presents Truckee Meadows Community College's (Nevada) model for assessing institutional effectiveness and validating the College's mission and vision, and the strategic plan for carrying out the institutional effectiveness model. It also outlines strategic goals for the years 1999-2001. From the system-wide directive that education…

  19. Assessing Meritorious Teacher Performance: A Differential Validity Study.

    ERIC Educational Resources Information Center

    Ellett, Chad D; Capie, William

    The Teacher Assessment and Development System (TADS) - Meritorious Teacher Program (MTP) FORM instrument is used in the Dade County Public Schools, Miami, Florida, to evaluate teachers. Its validity for decisions concerning merit pay for master teachers was examined in this study. Specifically, its ability to discriminate between high performing…

  20. Structural Validation of the Holistic Wellness Assessment

    ERIC Educational Resources Information Center

    Brown, Charlene; Applegate, E. Brooks; Yildiz, Mustafa

    2015-01-01

    The Holistic Wellness Assessment (HWA) is a relatively new assessment instrument based on an emergent transdisciplinary model of wellness. This study validated the factor structure identified via exploratory factor analysis (EFA), assessed test-retest reliability, and investigated concurrent validity of the HWA in three separate samples. The…

  1. The Modified Cognitive Constructions Coding System: Reliability and Validity Assessments

    ERIC Educational Resources Information Center

    Moran, Galia S.; Diamond, Gary M.

    2006-01-01

    The cognitive constructions coding system (CCCS) was designed for coding client's expressed problem constructions on four dimensions: intrapersonal-interpersonal, internal-external, responsible-not responsible, and linear-circular. This study introduces, and examines the reliability and validity of, a modified version of the CCCS--a version that…

  2. Validity and reliability of balance assessment software using the Nintendo Wii balance board: usability and validation.

    PubMed

    Park, Dae-Sung; Lee, GyuChang

    2014-06-10

    A balance test provides important information, such as a standard for judging an individual's functional recovery or predicting falls. An inexpensive and widely available tool for balance testing is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but little software is available for balance tests, and there are few studies on reliability and validity. Thus, we developed balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Twenty healthy adults participated in our study. The participants completed tests of inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with the balance assessment software using the Nintendo Wii Balance Board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed using intraclass correlation coefficient (ICC) values and the standard error of measurement (SEM). The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. The balance assessment software incorporating the Nintendo Wii Balance Board was found to be a reliable assessment tool. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for balance assessment.
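
    The record reports a standard error of measurement (SEM) alongside each ICC but does not state how the SEM was derived. A commonly used relationship is SEM = SD * sqrt(1 - ICC), where SD is the standard deviation of the observed scores. The sketch below assumes that formula. The function name and the input values are hypothetical and are not taken from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the score standard deviation and a reliability coefficient.

    Assumes the common formula SEM = SD * sqrt(1 - ICC); the study itself
    may have used a different derivation.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


# Hypothetical inputs (illustration only): SD of 10 units, ICC of 0.75.
sem = standard_error_of_measurement(sd=10.0, icc=0.75)
```

    Under this formula the SEM shrinks toward zero as the ICC approaches 1 and equals the score SD when the ICC is 0.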

  3. Validity of the Microsoft Kinect for assessment of postural control.

    PubMed

    Clark, Ross A; Pua, Yong-Hao; Fortin, Karine; Ritchie, Callan; Webster, Kate E; Denehy, Linda; Bryant, Adam L

    2012-07-01

    Clinically feasible methods of assessing postural control such as timed standing balance and functional reach tests provide important information; however, they cannot accurately quantify specific postural control mechanisms. The Microsoft Kinect™ system provides real-time anatomical landmark position data in three dimensions (3D), and given that it is inexpensive, portable and simple to set up, it may bridge this gap. This study assessed the concurrent validity of the Microsoft Kinect™ against a benchmark reference, a multiple-camera 3D motion analysis system, in 20 healthy subjects during three postural control tests: (i) forward reach, (ii) lateral reach, and (iii) single-leg eyes-closed standing balance. For the reach tests, the outcome measures consisted of distance reached and trunk flexion angle in the sagittal (forward reach) and coronal (lateral reach) planes. For the standing balance test the range and deviation of movement in the anatomical landmark positions for the sternum, pelvis, knee and ankle and the lateral and anterior trunk flexion angle were assessed. The Microsoft Kinect™ and 3D motion analysis systems had comparable inter-trial reliability (ICC difference=0.06±0.05; range, 0.00-0.16) and excellent concurrent validity, with Pearson's r-values >0.90 for the majority of measurements (r=0.96±0.04; range, 0.84-0.99). However, ordinary least products analyses demonstrated proportional biases for some outcome measures associated with the pelvis and sternum. These findings suggest that the Microsoft Kinect™ can validly assess kinematic strategies of postural control. Given the potential benefits it could therefore become a useful tool for assessing postural control in the clinical setting. Copyright © 2012 Elsevier B.V. All rights reserved.

  4. Automated Pressure Injury Risk Assessment System Incorporated Into an Electronic Health Record System.

    PubMed

    Jin, Yinji; Jin, Taixian; Lee, Sun-Mi

    Pressure injury risk assessment is the first step toward preventing pressure injuries, but traditional assessment tools are time-consuming, resulting in work overload and fatigue for nurses. The objectives of the study were to build an automated pressure injury risk assessment system (Auto-PIRAS) that can assess pressure injury risk using data, without requiring nurses to collect or input additional data, and to evaluate the validity of this assessment tool. A retrospective case-control study and a system development study were conducted in a 1,355-bed university hospital in Seoul, South Korea. A total of 1,305 pressure injury patients and 5,220 nonpressure injury patients participated for the development of a risk scoring algorithm: 687 and 2,748 for the validation of the algorithm and 237 and 994 for validation after clinical implementation, respectively. A total of 4,211 pressure injury-related clinical variables were extracted from the electronic health record (EHR) systems to develop a risk scoring algorithm, which was validated and incorporated into the EHR. That program was further evaluated for predictive and concurrent validity. Auto-PIRAS, incorporated into the EHR system, assigned a risk assessment score of high, moderate, or low and displayed this on the Kardex nursing record screen. Risk scores were updated nightly according to 10 predetermined risk factors. The predictive validity measures of the algorithm validation stage were as follows: sensitivity = .87, specificity = .90, positive predictive value = .68, negative predictive value = .97, Youden index = .77, and the area under the receiver operating characteristic curve = .95. The predictive validity measures of the Braden Scale were as follows: sensitivity = .77, specificity = .93, positive predictive value = .72, negative predictive value = .95, Youden index = .70, and the area under the receiver operating characteristic curve = .85. The kappa of the Auto-PIRAS and Braden Scale risk
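
    The Youden index values quoted in this record are consistent with the usual definition J = sensitivity + specificity - 1. The minimal sketch below recomputes J from the sensitivity and specificity pairs reported above. The function name is hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity pairs as reported in the record above.
auto_piras_j = round(youden_index(0.87, 0.90), 2)
braden_j = round(youden_index(0.77, 0.93), 2)
```

    J ranges from -1 to 1. A value of 0 means the tool performs no better than chance, and a value of 1 means perfect classification.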

  5. Technical skills assessment toolbox: a review using the unitary framework of validity.

    PubMed

    Ghaderi, Iman; Manji, Farouq; Park, Yoon Soo; Juul, Dorthea; Ott, Michael; Harris, Ilene; Farrell, Timothy M

    2015-02-01

    The purpose of this study was to create a technical skills assessment toolbox for 35 basic and advanced skills/procedures that comprise the American College of Surgeons (ACS)/Association of Program Directors in Surgery (APDS) surgical skills curriculum and to provide a critical appraisal of the included tools, using the contemporary framework of validity. Competency-based training has become the predominant model in surgical education, and assessment of performance is an essential component. Assessment methods must produce valid results to accurately determine the level of competency. A search was performed, using PubMed and Google Scholar, to identify tools that have been developed for assessment of the targeted technical skills. A total of 23 assessment tools for the 35 ACS/APDS skills modules were identified. Some tools, such as Operative Performance Rating System (OPRS) and Objective Structured Assessment of Technical Skill (OSATS), have been tested for more than 1 procedure. Therefore, 30 modules had at least 1 assessment tool, with some common surgical procedures being addressed by several tools. Five modules had none. Only 3 studies used Messick's framework to design their validity studies. The remaining studies used an outdated framework on the basis of "types of validity." When analyzed using the contemporary framework, few of these studies demonstrated validity for content, internal structure, and relationship to other variables. This study provides an assessment toolbox for common surgical skills/procedures. Our review shows that few authors have used the contemporary unitary concept of validity for development of their assessment tools. As we progress toward competency-based training, future studies should provide evidence for various sources of validity using the contemporary framework.

  6. Select Methodology for Validating Advanced Satellite Measurement Systems

    NASA Technical Reports Server (NTRS)

    Larar, Allen M.; Zhou, Daniel K.; Liu, Xi; Smith, William L.

    2008-01-01

    Advanced satellite sensors are tasked with improving global measurements of the Earth's atmosphere, clouds, and surface to enable enhancements in weather prediction, climate monitoring capability, and environmental change detection. Measurement system validation is crucial to achieving this goal and maximizing research and operational utility of resultant data. Field campaigns including satellite under-flights with well calibrated FTS sensors aboard high-altitude aircraft are an essential part of the validation task. This presentation focuses on an overview of validation methodology developed for assessment of high spectral resolution infrared systems, and includes results of preliminary studies performed to investigate the performance of the Infrared Atmospheric Sounding Interferometer (IASI) instrument aboard the MetOp-A satellite.

  7. Validation of a scenario-based assessment of critical thinking using an externally validated tool.

    PubMed

    Buur, Jennifer L; Schmidt, Peggy; Smylie, Dean; Irizarry, Kris; Crocker, Carlos; Tyler, John; Barr, Margaret

    2012-01-01

    With medical education transitioning from knowledge-based curricula to competency-based curricula, critical thinking skills have emerged as a major competency. While there are validated external instruments for assessing critical thinking, many educators have created their own custom assessments of critical thinking. However, the face validity of these assessments has not been challenged. The purpose of this study was to compare results from a custom assessment of critical thinking with the results from a validated external instrument of critical thinking. Students from the College of Veterinary Medicine at Western University of Health Sciences were administered a custom assessment of critical thinking (ACT) examination and the externally validated instrument, California Critical Thinking Skills Test (CCTST), in the spring of 2011. Total scores and sub-scores from each exam were analyzed for significant correlations using Pearson correlation coefficients. Significant correlations between ACT Blooms 2 and deductive reasoning and total ACT score and deductive reasoning were demonstrated with correlation coefficients of 0.24 and 0.22, respectively. No other statistically significant correlations were found. The lack of significant correlation between the two examinations illustrates the need in medical education to externally validate internal custom assessments. Ultimately, the development and validation of custom assessments of non-knowledge-based competencies will produce higher quality medical professionals.

  8. The accuracy of Internet search engines to predict diagnoses from symptoms can be assessed with a validated scoring system.

    PubMed

    Shenker, Bennett S

    2014-02-01

    To validate a scoring system that evaluates the ability of Internet search engines to correctly predict diagnoses when symptoms are used as search terms. We developed a five point scoring system to evaluate the diagnostic accuracy of Internet search engines. We identified twenty diagnoses common to a primary care setting to validate the scoring system. One investigator entered the symptoms for each diagnosis into three Internet search engines (Google, Bing, and Ask) and saved the first five webpages from each search. Other investigators reviewed the webpages and assigned a diagnostic accuracy score. They rescored a random sample of webpages two weeks later. To validate the five point scoring system, we calculated convergent validity and test-retest reliability using Kendall's W and Spearman's rho, respectively. We used the Kruskal-Wallis test to look for differences in accuracy scores for the three Internet search engines. A total of 600 webpages were reviewed. Kendall's W for the raters was 0.71 (p<0.0001). Spearman's rho for test-retest reliability was 0.72 (p<0.0001). There was no difference in scores based on Internet search engine. We found a significant difference in scores based on the webpage's order on the Internet search engine webpage (p=0.007). Pairwise comparisons revealed higher scores in the first webpages vs. the fourth (corr p=0.009) and fifth (corr p=0.017). However, this significance was lost when creating composite scores. The five point scoring system to assess diagnostic accuracy of Internet search engines is a valid and reliable instrument. The scoring system may be used in future Internet research. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
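As a rough illustration of the rater-agreement statistic named in this abstract, Kendall's W for m raters ranking n items without ties is W = 12S / (m^2 (n^3 - n)), where S is the sum of squared deviations of the per-item rank sums from their mean. The sketch below is not the author's analysis code: the function name `kendalls_w` and the example rankings are hypothetical, and the tie correction that five-point scores would normally require is omitted.

```python
def kendalls_w(ranks):
    """Kendall's coefficient of concordance for untied rankings.

    ranks: list of m lists, one per rater; each is a ranking (1..n) of n items.
    Assumption: no tied ranks, so no tie correction is applied.
    """
    m = len(ranks)
    n = len(ranks[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 raters and 2 items")
    if any(sorted(r) != list(range(1, n + 1)) for r in ranks):
        raise ValueError("each rater must supply an untied ranking 1..n")
    # Sum of the ranks each item received across all raters.
    rank_sums = [sum(r[i] for r in ranks) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    # Squared deviations of the rank sums from their mean.
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of four webpages by three raters.
agreeing = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
opposed = [[1, 2, 3], [3, 2, 1]]
```

W runs from 0 (no agreement) to 1 (identical rankings); a value such as the reported 0.71 sits between these extremes.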

  9. A Content Validity Study of AIMIT (Assessing Interpersonal Motivation in Transcripts).

    PubMed

    Fassone, Giovanni; Lo Reto, Floriana; Foggetti, Paola; Santomassimo, Chiara; D'Onofrio, Maria Rita; Ivaldi, Antonella; Liotti, Giovanni; Trincia, Valeria; Picardi, Angelo

    2016-07-01

    Multi-motivational theories of human relatedness state that different motivational systems with an evolutionary basis modulate interpersonal relationships. The reliable assessment of their dynamics may usefully inform the understanding of the therapeutic relationship. The coding system of the Assessing Interpersonal Motivation in Transcripts (AIMIT) makes it possible to identify in the clinical dialogue the activity of five main interpersonal motivational systems (IMSs): attachment (care-seeking), caregiving, ranking, sexuality and peer cooperation. To assess whether the criteria currently used to score the AIMIT are consistently correlated with the conceptual formulation of the interpersonal multi-motivational theory, two different studies were designed. Study 1: Content validity as assessed by highly qualified independent raters. Study 2: Content validity as assessed by unqualified raters. Results of study 1 show that out of the total 60 AIMIT verbal criteria, 52 (86.7%) met the required minimum degree of correspondence. The average semantic correspondence scores between these items and the related IMSs were quite good (overall mean: 3.74, standard deviation: 0.61). In study 2, a group of 20 naïve raters had to identify each prevalent motivation (IMS) in a random sequence of 1000 utterances drawn from therapy sessions. Cohen's Kappa coefficient was calculated for each rater with reference to each IMS, and the average Kappa across all raters was then calculated for each IMS. All average Kappa values were satisfactory (>0.60) and ranged between 0.63 (ranking system) and 0.83 (sexuality system). Data confirmed the overall soundness of AIMIT's theoretical-applicative approach. Results are discussed, corroborating the hypothesis that the AIMIT possesses the required criteria for content validity. Copyright © 2015 John Wiley & Sons, Ltd.
    Assessing Interpersonal Motivations in psychotherapy transcripts as a useful tool to better understand links between motivational systems and intersubjectivity
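The per-rater Kappa and the averaged Kappa described in study 2 can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each rater's labels are compared against a single reference coding (the abstract does not spell this out), and the names `cohens_kappa` and `mean_kappa` as well as the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each coder's marginal frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def mean_kappa(reference, raters):
    """Average kappa of several raters against one reference coding (assumed design)."""
    return sum(cohens_kappa(reference, r) for r in raters) / len(raters)


# Hypothetical binary coding: 1 = the IMS is active in the utterance, 0 = not.
reference = [1, 1, 0, 0]
rater_1 = [1, 1, 0, 0]
rater_2 = [1, 0, 0, 0]
```

With these toy labels, rater 1 agrees perfectly with the reference and rater 2 agrees on three of four utterances, so the mean falls between the two individual values.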

  10. Quantitative model validation of manipulative robot systems

    NASA Astrophysics Data System (ADS)

    Kartowisastro, Iman Herwidiana

    This thesis is concerned with applying the distortion quantitative validation technique to a robot manipulative system with revolute joints. When the distortion technique is used to validate a model quantitatively, the model parameter uncertainties are taken into account in assessing the faithfulness of the model, and this approach is more objective than the commonly used visual comparison method. The industrial robot is represented by the TQ MA2000 robot arm. Details of the mathematical derivation of the distortion technique are given, which explain the required distortion of the constant parameters within the model and the assessment of model adequacy. Due to the complexity of a robot model, only the first three degrees of freedom are considered, and all links are assumed rigid. The modelling involves the Newton-Euler approach to obtain the dynamics model, and the Denavit-Hartenberg convention is used throughout the work. The conventional feedback control system is used in developing the model. The response of the system to parameter changes is investigated, as some parameters are redundant. This work is important because it allows the most important parameters to be distorted to be selected, which leads to a new term called the fundamental parameters. The transfer function approach has been chosen to validate an industrial robot quantitatively against the measured data due to its practicality. Initially, the assessment of the model fidelity criterion indicated that the model was not capable of explaining the transient record in terms of the model parameter uncertainties. Further investigations led to significant improvements of the model and better understanding of the model properties. After several improvements in the model, the fidelity criterion obtained was almost satisfied. Although the fidelity criterion is slightly less than unity, it has been shown that the distortion technique can be applied in a robot manipulative system.
Using the validated model, the importance of

  11. PRA (Probabilistic Risk Assessments) Participation versus Validation

    NASA Technical Reports Server (NTRS)

    DeMott, Diana; Banke, Richard

    2013-01-01

    Probabilistic Risk Assessments (PRAs) are performed for projects or programs where the consequences of failure are highly undesirable. PRAs primarily address the level of risk those projects or programs pose during operations. PRAs are often developed after the design has been completed. Design and operational details used to develop models include approved and accepted design information regarding equipment, components, systems and failure data. This methodology basically validates the risk parameters of the project or system design. For high risk or high dollar projects, using PRA methodologies during the design process provides new opportunities to influence the design early in the project life cycle to identify, eliminate or mitigate potential risks. Identifying risk drivers before the design has been set allows the design engineers to understand the inherent risk of their current design and consider potential risk mitigation changes. This can become an iterative process where the PRA model can be used to determine if the mitigation technique is effective in reducing risk. This can result in more efficient and cost effective design changes. PRA methodology can be used to assess the risk of design alternatives and can demonstrate how major design changes or program modifications impact the overall program or project risk. PRA has been used for the last two decades to validate risk predictions and acceptability. By providing risk information that can positively influence final system and equipment design, the PRA tool can also participate in design development, providing a safe and cost effective product.

  12. A validation study of an alternate state science assessment: Alignment of the Pennsylvania Alternate System of Assessment (PASA) science assessment

    NASA Astrophysics Data System (ADS)

    Heh, Peter

    The current study examined the validation and alignment of the PASA-Science by determining whether the alternate science assessment anchors linked to the regular education science anchors; whether the PASA-Science assessment items are science; whether the PASA-Science assessment items linked to the alternate science eligible content, and what PASA-Science assessment content was considered important by parents and teachers. Special education and science education university faculty determined all but one alternate science assessment anchor linked to the regular science assessment anchors. Special education and science education teachers determined that the PASA-Science assessment items are indeed science and linked to the alternate science eligible content. Finally, parents and teachers indicated the most important science content assessed in the PASA-Science involved safety and independence.

  13. Personality disorder assessment: the challenge of construct validity.

    PubMed

    Clark, L A; Livesley, W J; Morey, L

    1997-01-01

    We begin with a review of the data that challenge the current categorical system for classifying personality disorder, focusing on the central assessment issues of convergent and discriminant validity. These data indicate that while there is room for improvement in assessment, even greater change is needed in conceptualization than in instrumentation. Accordingly, we then refocus the categorical-dimensional debate in assessment terms, and place it in the broader context of such issues as the hierarchical structure of personality, overlap and distinctions between normal and abnormal personality, sources of information in personality disorder assessment, and overlap and discrimination of trait and state assessment. We conclude that more complex conceptual models that can incorporate both biological and environmental influences on the development of adaptive and maladaptive personality are needed.

  14. Elbow-specific clinical rating systems: extent of established validity, reliability, and responsiveness.

    PubMed

    The, Bertram; Reininga, Inge H F; El Moumni, Mostafa; Eygendaal, Denise

    2013-10-01

    The modern standard of evaluating treatment results includes the use of rating systems. Elbow-specific rating systems are frequently used in studies aiming at elbow-specific pathology. However, proper validation studies seem to be relatively sparse. In addition, these scoring systems might not always be used for appropriate populations of interest. Both of these issues might give rise to invalid conclusions being reported in the literature. Our aim was to investigate the extent to which the available elbow-specific outcome measurement tools have been validated and the quality of the validation itself. We also aimed to provide characteristics of the populations used for validation of these scales to enable clinicians to use them appropriately. A literature search identified 17 studies of 12 different elbow-specific scoring systems. These were assessed for validity, reliability, and responsiveness characteristics. The quality of these assessments was rated according to the Consensus Based Standards for the Selection of Health Measurement Instruments (COSMIN) checklist criteria, a standardized and validated tool developed specifically for this purpose. Currently, the only elbow-specific rating system that is validated using high-quality methodology is the Oxford Elbow Score, a patient-administered outcome measure tool that has been validated on heterogeneous study populations. Other rating systems still have to be proven in the future to be as good as the Oxford Elbow Score for clinical or research purposes. Additional validation studies are needed. Copyright © 2013 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.

  15. Validity and reliability of balance assessment software using the Nintendo Wii balance board: usability and validation

    PubMed Central

    2014-01-01

    Background: A balance test provides important information such as the standard to judge an individual’s functional recovery or make the prediction of falls. The development of a tool for a balance test that is inexpensive and widely available is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but there is little software used in balance tests, and there are few studies on reliability and validity. Thus, we developed a balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Methods: Twenty healthy adults participated in our study. The participants participated in the test for inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with balance assessment software using the Nintendo Wii balance board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed by an intraclass correlation coefficient (ICC) value and a standard error of measurement (SEM). Results: The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. Conclusion: The balance assessment software incorporating the Nintendo Wii balance board was used in our study and was found to be a reliable assessment device. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for the balance assessment. PMID:24912769

  16. Rethinking Validation in Complex High-Stakes Assessment Contexts

    ERIC Educational Resources Information Center

    Koch, Martha J.; DeLuca, Christopher

    2012-01-01

    In this article we rethink validation within the complex contexts of high-stakes assessment. We begin by considering the utility of existing models for validation and argue that these models tend to overlook some of the complexities inherent to assessment use, including the multiple interpretations of assessment purposes and the potential…

  17. Validity in work-based assessment: expanding our horizons.

    PubMed

    Govaerts, Marjan; van der Vleuten, Cees P M

    2013-12-01

    Although work-based assessments (WBA) may come closest to assessing habitual performance, their use for summative purposes is not undisputed. Most criticism of WBA stems from approaches to validity consistent with the quantitative psychometric framework. However, there is increasing research evidence that indicates that the assumptions underlying the predictive, deterministic framework of psychometrics may no longer hold. In this discussion paper we argue that meaningfulness and appropriateness of current validity evidence can be called into question and that we need alternative strategies to assessment and validity inquiry that build on current theories of learning and performance in complex and dynamic workplace settings. Drawing from research in various professional fields we outline key issues within the mechanisms of learning, competence and performance in the context of complex social environments and illustrate their relevance to WBA. In reviewing recent socio-cultural learning theory and research on performance and performance interpretations in work settings, we demonstrate that learning, competence (as inferred from performance) as well as performance interpretations are to be seen as inherently contextualised, and can only be understood 'in situ'. Assessment in the context of work settings may, therefore, be more usefully viewed as a socially situated interpretive act. We propose constructivist-interpretivist approaches towards WBA in order to capture and understand contextualised learning and performance in work settings. Theoretical assumptions underlying interpretivist assessment approaches call for a validity theory that provides the theoretical framework and conceptual tools to guide the validation process in the qualitative assessment inquiry. Basic principles of rigour specific to qualitative research have been established, and they can and should be used to determine validity in interpretivist assessment approaches. If used properly, these

  18. Validation of educational assessments: a primer for simulation and beyond.

    PubMed

    Cook, David A; Hatala, Rose

    2016-01-01

    Simulation plays a vital role in health professions assessment. This review provides a primer on assessment validation for educators and education researchers. We focus on simulation-based assessment of health professionals, but the principles apply broadly to other assessment approaches and topics. Validation refers to the process of collecting validity evidence to evaluate the appropriateness of the interpretations, uses, and decisions based on assessment results. Contemporary frameworks view validity as a hypothesis, and validity evidence is collected to support or refute the validity hypothesis (i.e., that the proposed interpretations and decisions are defensible). In validation, the educator or researcher defines the proposed interpretations and decisions, identifies and prioritizes the most questionable assumptions in making these interpretations and decisions (the "interpretation-use argument"), empirically tests those assumptions using existing or newly-collected evidence, and then summarizes the evidence as a coherent "validity argument." A framework proposed by Messick identifies potential evidence sources: content, response process, internal structure, relationships with other variables, and consequences. Another framework proposed by Kane identifies key inferences in generating useful interpretations: scoring, generalization, extrapolation, and implications/decision. We propose an eight-step approach to validation that applies to either framework: Define the construct and proposed interpretation, make explicit the intended decision(s), define the interpretation-use argument and prioritize needed validity evidence, identify candidate instruments and/or create/adapt a new instrument, appraise existing evidence and collect new evidence as needed, keep track of practical issues, formulate the validity argument, and make a judgment: does the evidence support the intended use? Rigorous validation first prioritizes and then empirically evaluates key

  19. Validation of a method for assessing resident physicians' quality improvement proposals.

    PubMed

    Leenstra, James L; Beckman, Thomas J; Reed, Darcy A; Mundell, William C; Thomas, Kris G; Krajicek, Bryan J; Cha, Stephen S; Kolars, Joseph C; McDonald, Furman S

    2007-09-01

    Residency programs involve trainees in quality improvement (QI) projects to evaluate competency in systems-based practice and practice-based learning and improvement. Valid approaches to assess QI proposals are lacking. We developed an instrument for assessing resident QI proposals, the Quality Improvement Proposal Assessment Tool (QIPAT-7), and determined its validity and reliability. QIPAT-7 content was initially obtained from a national panel of QI experts. Through an iterative process, the instrument was refined, pilot-tested, and revised. Seven raters used the instrument to assess 45 resident QI proposals. Principal factor analysis was used to explore the dimensionality of instrument scores. Cronbach's alpha and intraclass correlations were calculated to determine internal consistency and interrater reliability, respectively. QIPAT-7 items comprised a single factor (eigenvalue = 3.4) suggesting a single assessment dimension. Interrater reliability for each item (range 0.79 to 0.93) and internal consistency reliability among the items (Cronbach's alpha = 0.87) were high. This method for assessing resident physician QI proposals is supported by content and internal structure validity evidence. QIPAT-7 is a useful tool for assessing resident QI proposals. Future research should determine the reliability of QIPAT-7 scores in other residency and fellowship training programs. Correlations should also be made between assessment scores and criteria for QI proposal success such as implementation of QI proposals, resident scholarly productivity, and improved patient outcomes.
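For readers unfamiliar with the internal-consistency statistic reported here, Cronbach's alpha for k items is alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score). The sketch below illustrates that standard formula only; it is not the authors' analysis, and the function name `cronbach_alpha` and the toy scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha from item scores.

    items: list of k lists; each inner list holds one item's scores across
    the same n rated objects (for example, proposals), in the same order.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least 2 items")
    n = len(items[0])
    if n < 2 or any(len(scores) != n for scores in items):
        raise ValueError("every item needs the same number (>= 2) of scores")
    # Total score per rated object, summed over items.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical scores for two items over three proposals.
consistent = [[1, 2, 3], [1, 2, 3]]
partly_consistent = [[1, 2, 3], [1, 3, 2]]
```

Items that rise and fall together push alpha toward 1, while items that disagree pull it down, which is why a value such as 0.87 is read as high internal consistency.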

  20. Validity and reliability of the infant breastfeeding assessment tool, the mother baby assessment tool, and the LATCH scoring system.

    PubMed

    Altuntas, Nilgun; Turkyilmaz, Canan; Yildiz, Havva; Kulali, Ferit; Hirfanoglu, Ibrahim; Onal, Esra; Ergenekon, Ebru; Koç, Esin; Atalay, Yıldız

    2014-05-01

    We aimed to evaluate the validity and reliability of the Infant Breastfeeding Assessment Tool (IBFAT), the Mother Baby Assessment (MBA) Tool, and the LATCH scoring system. Mothers who delivered healthy, full-term infants in the Obstetrics & Gynecology Service of Gazi University, Ankara, Turkey, between December 2013 and January 2014 and their infants were included in the study. Forty-six randomly selected breastfeeding sessions were monitored and scored simultaneously by three researchers (Raters 1, 2, and 3) using LATCH, IBFAT, and the MBA Tool. Researchers put the score sheets in an envelope in order to hide them from each other. The compatibility of the scores given by three researchers was assessed by statistical methods. We found positive and significant correlation coefficients ranging from 0.81 to 0.88 for the total MBA score, from 0.90 to 0.95 for the total IBFAT score, and from 0.85 to 0.91 for the total LATCH score. Correlation coefficients testing these three tools ranged from 0.71 to 0.88, with the minimum value being noted for the correlation between LATCH and IBFAT scores and the maximum value being noted for the correlation between LATCH and MBA scores. We found positive and significant correlations between researchers' scores for 46 observations using the three assessment tools. This study showed that these above-mentioned tools were compatible for the assessment of the efficiency of breastfeeding.

  1. Operational calibration and validation of landsat data continuity mission (LDCM) sensors using the image assessment system (IAS)

    USGS Publications Warehouse

    Micijevic, Esad; Morfitt, Ron

    2010-01-01

    Systematic characterization and calibration of the Landsat sensors and the assessment of image data quality are performed using the Image Assessment System (IAS). The IAS was first introduced as an element of the Landsat 7 (L7) Enhanced Thematic Mapper Plus (ETM+) ground segment and recently extended to Landsat 4 (L4) and 5 (L5) Thematic Mappers (TM) and Multispectral Scanners (MSS) on-board the Landsat 1-5 satellites. In preparation for the Landsat Data Continuity Mission (LDCM), the IAS was developed for the Earth Observing-1 (EO-1) Advanced Land Imager (ALI) with a capability to assess pushbroom sensors. This paper describes the LDCM version of the IAS and how it relates to unique calibration and validation attributes of its on-board imaging sensors. The LDCM IAS will have to handle a significantly larger number of detectors and the associated database than the previous IAS versions. An additional challenge is that the LDCM IAS must handle data from two sensors, as the LDCM products will combine the Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS) spectral bands.

  2. LAnd surface remote sensing Products VAlidation System (LAPVAS) and its preliminary application

    NASA Astrophysics Data System (ADS)

    Lin, Xingwen; Wen, Jianguang; Tang, Yong; Ma, Mingguo; Dou, Baocheng; Wu, Xiaodan; Meng, Lumin

    2014-11-01

    The long-term record of remote sensing products shows the spatial and temporal change of land surface parameters and widely supports regional and global scientific research. Remote sensing products from different sensors and different algorithms need to be validated to ensure their high quality. Investigation of remote sensing product validation shows that it is a complex process in terms of both the quality requirements for in situ data and the methods of precision assessment. A comprehensive validation requires long time series and multiple land surface types. A system named the LAnd surface remote sensing Products VAlidation System (LAPVAS) is therefore designed in this paper to assess the uncertainty of remote sensing products based on a collection of in situ data and validation techniques. The designed validation system platform consists of three parts: the validation databases, the precision analysis subsystem, and the inter-external interface of the system. These three parts are built from essential service modules, such as Data-Read service modules, Data-Insert service modules, Data-Associated service modules, Precision-Analysis service modules, Scale-Change service modules and so on. To run the validation system platform, users can order these service modules, choreograph them interactively, and then complete the validation tasks of remote sensing products (such as LAI, albedo, VI, etc.). The system takes an SOA-based architecture as its framework. The benefit of this architecture is that the service modules can be independent of any development environment through standards such as the Web Services Description Language (WSDL). The standard languages C++ and Java will be used as the primary programming languages to create service modules. One of the key land surface parameters, albedo, is selected as an example of the system application. It is illustrated that the LAPVAS has a good performance to implement the land surface remote sensing product

  3. Assessing students' communication skills: validation of a global rating.

    PubMed

    Scheffer, Simone; Muehlinghaus, Isabel; Froehmel, Annette; Ortwein, Heiderose

    2008-12-01

    Communication skills training is an accepted part of undergraduate medical programs nowadays. In addition to learning experiences its importance should be emphasised by performance-based assessment. As detailed checklists have been shown to be not well suited for the assessment of communication skills for different reasons, this study aimed to validate a global rating scale. A Canadian instrument was translated to German and adapted to assess students' communication skills during an end-of-semester-OSCE. Subjects were second and third year medical students at the reformed track of the Charité-Universitaetsmedizin Berlin. Different groups of raters were trained to assess students' communication skills using the global rating scale. Validity testing included concurrent validity and construct validity: Judgements of different groups of raters were compared to expert ratings as a defined gold standard. Furthermore, the amount of agreement between scores obtained with this global rating scale and a different instrument for assessing communication skills was determined. Results show that communication skills can be validly assessed by trained non-expert raters as well as standardised patients using this instrument.

  4. Assessing validity of observational intervention studies - the Benchmarking Controlled Trials.

    PubMed

    Malmivaara, Antti

    2016-09-01

    Benchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations. To create and pilot test a checklist for appraising methodological validity of a BCT. The checklist was created by extracting the most essential elements from the comprehensive set of criteria in the previous paper on BCTs. Also checklists and scientific papers on observational studies and respective systematic reviews were utilized. Ten BCTs published in the Lancet and in the New England Journal of Medicine were used to assess feasibility of the created checklist. The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies. The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies. However, the piloted checklist should be validated in further studies. Key messages Benchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations. This paper presents a checklist for appraising methodological validity of BCTs and pilot-tests the checklist with ten BCTs published in leading medical journals. The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies. The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies.

  5. Corporate Entrepreneurship Assessment Instrument (CEAI): Systematic Validation of a Measure

    DTIC Science & Technology

    2006-03-01

    CORPORATE ENTREPRENEURSHIP ASSESSMENT INSTRUMENT (CEAI): SYSTEMATIC VALIDATION OF A MEASURE THESIS...the United States Government. AFIT/GIR/ENV/06M-05 CORPORATE ENTREPRENEURSHIP ASSESSMENT INSTRUMENT (CEAI): SYSTEMATIC VALIDATION...DISTRIBUTION UNLIMITED. AFIT/GIR/ENV/06M-05 CORPORATE ENTREPRENEURSHIP ASSESSMENT INSTRUMENT (CEAI): SYSTEMATIC VALIDATION OF A MEASURE

  6. Absolute fracture risk assessment using lumbar spine and femoral neck bone density measurements: derivation and validation of a hybrid system.

    PubMed

    Leslie, William D; Lix, Lisa M

    2011-03-01

    The World Health Organization (WHO) Fracture Risk Assessment Tool (FRAX) computes 10-year probability of major osteoporotic fracture from multiple risk factors, including femoral neck (FN) T-scores. Lumbar spine (LS) measurements are not currently part of the FRAX formulation but are used widely in clinical practice, and this creates confusion when there is spine-hip discordance. Our objective was to develop a hybrid 10-year absolute fracture risk assessment system in which nonvertebral (NV) fracture risk was assessed from the FN and clinical vertebral (V) fracture risk was assessed from the LS. We identified 37,032 women age 45 years and older undergoing baseline FN and LS dual-energy X-ray absorptiometry (DXA; 1990-2005) from a population database that contains all clinical DXA results for the Province of Manitoba, Canada. Results were linked to longitudinal health service records for physician billings and hospitalizations to identify nontrauma vertebral and nonvertebral fracture codes after bone mineral density (BMD) testing. The population was randomly divided into equal-sized derivation and validation cohorts. Using the derivation cohort, three fracture risk prediction systems were created from Cox proportional hazards models (adjusted for age and multiple FRAX risk factors): FN to predict combined all fractures, FN to predict nonvertebral fractures, and LS to predict vertebral (without nonvertebral) fractures. The hybrid system was the sum of nonvertebral risk from the FN model and vertebral risk from the LS model. The FN and hybrid systems were both strongly predictive of overall fracture risk (p < .001). In the validation cohort, ROC analysis showed marginally better performance of the hybrid system versus the FN system for overall fracture prediction (p = .24) and significantly better performance for vertebral fracture prediction (p < .001). In a discordance subgroup with FN and LS T-score differences greater than 1 SD, there was a significant

  7. Validation of Bioreactor and Human-on-a-Chip Devices for Chemical Safety Assessment.

    PubMed

    Rebelo, Sofia P; Dehne, Eva-Maria; Brito, Catarina; Horland, Reyk; Alves, Paula M; Marx, Uwe

    2016-01-01

    Equipment and device qualification and test assay validation in the field of tissue engineered human organs for substance assessment remain formidable tasks with only a few successful examples so far. The hurdles seem to increase with the growing complexity of the biological systems, emulated by the respective models. Controlled single tissue or organ culture in bioreactors improves the organ-specific functions and maintains their phenotypic stability for longer periods of time. The reproducibility attained with bioreactor operations is, per se, an advantage for the validation of safety assessment. Regulatory agencies have gradually altered the validation concept from exhaustive "product" to rigorous and detailed process characterization, valuing reproducibility as a standard for validation. "Human-on-a-chip" technologies applying micro-physiological systems to the in vitro combination of miniaturized human organ equivalents into functional human micro-organisms are nowadays thought to be the most elaborate solution created to date. They target the replacement of the current most complex models, laboratory animals. Therefore, we provide here a road map towards the validation of such "human-on-a-chip" models and qualification of their respective bioreactor and microchip equipment along a path currently used for the respective animal models.

  8. Reliability and validity of the Microsoft Kinect for assessment of manual wheelchair propulsion.

    PubMed

    Milgrom, Rachel; Foreman, Matthew; Standeven, John; Engsberg, Jack R; Morgan, Kerri A

    2016-01-01

    Concurrent validity and test-retest reliability of the Microsoft Kinect in quantification of manual wheelchair propulsion were examined. Data were collected from five manual wheelchair users on a roller system. Three Kinect sensors were used to assess test-retest reliability with a still pose. Three systems were used to assess concurrent validity of the Kinect to measure propulsion kinematics (joint angles, push loop characteristics): Kinect, Motion Analysis, and Dartfish ProSuite (Dartfish joint angles were limited to shoulder and elbow flexion). Intraclass correlation coefficients revealed good reliability (0.87-0.99) between five of the six joint angles (neck flexion, shoulder flexion, shoulder abduction, elbow flexion, wrist flexion). ICCs suggested good concurrent validity for elbow flexion between the Kinect and Dartfish and between the Kinect and Motion Analysis. Good concurrent validity was revealed for maximum height, hand-axle relationship, and maximum area (0.92-0.95) between the Kinect and Dartfish and maximum height and hand-axle relationship (0.89-0.96) between the Kinect and Motion Analysis. Analysis of variance revealed significant differences (p < 0.05) in maximum length between Dartfish (mean 58.76 cm) and the Kinect (40.16 cm). Results pose promising research and clinical implications for propulsion assessment and overuse injury prevention with the application of current findings to future technology.

  9. Valid and Reliable Science Content Assessments for Science Teachers

    NASA Astrophysics Data System (ADS)

    Tretter, Thomas R.; Brown, Sherri L.; Bush, William S.; Saderholm, Jon C.; Holmes, Vicki-Lynn

    2013-03-01

    Science teachers' content knowledge is an important influence on student learning, highlighting an ongoing need for programs, and assessments of those programs, designed to support teacher learning of science. Valid and reliable assessments of teacher science knowledge are needed for direct measurement of this crucial variable. This paper describes multiple sources of validity and reliability (Cronbach's alpha greater than 0.8) evidence for physical, life, and earth/space science assessments—part of the Diagnostic Teacher Assessments of Mathematics and Science (DTAMS) project. Validity was strengthened by systematic synthesis of relevant documents, extensive use of external reviewers, and field tests with 900 teachers during assessment development process. Subsequent results from 4,400 teachers, analyzed with Rasch IRT modeling techniques, offer construct and concurrent validity evidence.

  10. Construct Validity of the Behavior Assessment System for Children (BASC) Self-Report of Personality: Evidence from Adolescents Referred to Residential Treatment

    ERIC Educational Resources Information Center

    Weis, Robert; Smenner, Lindsey

    2007-01-01

    The authors investigate the construct validity of the Behavior Assessment System for Children Self-Report of Personality (BASC-SRP; Reynolds & Kamphaus, 1998). A sample of 970 adolescents (16-18 years) with histories of disruptive behavior problems and truancy completed the SRP; a subsample of 290 adolescents also completed the Minnesota…

  11. Development and Validity of Western University's On-Road Assessment.

    PubMed

    Classen, Sherrilene; Krasniuk, Sarah; Alvarez, Liliana; Monahan, Miriam; Morrow, Sarah A; Danter, Tim

    2017-01-01

    Although used across North America, many on-road studies do not explicitly document the content and metrics of on-road courses and accompanying assessments. This article discusses the development of the University of Western Ontario's on-road course, and elucidates the validity of its accompanying on-road assessment. We identified main components for developing an on-road course and used measurement theory to establish face, content, and initial construct validity. Five adult volunteer drivers and 30 drivers with multiple sclerosis participated in the study. The road course had face and content validity, representing 100% of roadway components determined through a content validity matrix and index. The known-groups method showed that debilitated drivers (vs. not debilitated) made more driving errors (W = 463.50, p = .03) and failed the on-road course, indicating preliminary construct validity of the on-road assessment. This research guides and empirically supports a process for developing a road course and its assessment.

  12. Validation of the Behavioral Risk Factor Surveillance System Sleep Questions

    PubMed Central

    Jungquist, Carla R.; Mund, Jaime; Aquilina, Alan T.; Klingman, Karen; Pender, John; Ochs-Balcom, Heather; van Wijngaarden, Edwin; Dickerson, Suzanne S.

    2016-01-01

    Study Objective: Sleep problems may constitute a risk for health problems, including cardiovascular disease, depression, diabetes, poor work performance, and motor vehicle accidents. The primary purpose of this study was to assess the validity of the current Behavioral Risk Factor Surveillance System (BRFSS) sleep questions by establishing the sensitivity and specificity for detection of sleep/wake disturbance. Methods: Repeated cross-sectional assessment of 300 community dwelling adults over the age of 18 who did not wear CPAP or oxygen during sleep. Reliability and validity testing of the BRFSS sleep questions was performed by comparing BRFSS responses to data from home sleep study, actigraphy for 14 days, Insomnia Severity Index, Epworth Sleepiness Scale, and PROMIS-57. Results: Only two of the five BRFSS sleep questions were found valid and reliable in determining total sleep time and excessive daytime sleepiness. Conclusions: Refinement of the BRFSS questions is recommended. Citation: Jungquist CR, Mund J, Aquilina AT, Klingman K, Pender J, Ochs-Balcom H, van Wijngaarden E, Dickerson SS. Validation of the behavioral risk factor surveillance system sleep questions. J Clin Sleep Med 2016;12(3):301–310. PMID:26446246
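
    As an illustration of the sensitivity and specificity named in the abstract above, the following is a minimal sketch, not the study's analysis. The function name and the paired yes/no data are hypothetical; `questionnaire` stands in for a screening answer and `sleep_study` for the reference standard.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) for paired boolean results.

    predicted: screening result per person (True = disturbance flagged)
    reference: reference-standard result per person (True = disturbance present)
    """
    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)              # flagged, truly present
    tn = sum((not p) and (not r) for p, r in pairs)  # not flagged, truly absent
    fn = sum((not p) and r for p, r in pairs)        # missed cases
    fp = sum(p and (not r) for p, r in pairs)        # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data for eight people (illustration only).
questionnaire = [True, True, False, False, True, False, False, True]
sleep_study = [True, False, False, False, True, True, False, True]

sens, spec = sensitivity_specificity(questionnaire, sleep_study)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    Sensitivity is the share of true cases the question detects; specificity is the share of non-cases it correctly leaves unflagged.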

  13. A systematic review of publications assessing reliability and validity of the Behavioral Risk Factor Surveillance System (BRFSS), 2004–2011

    PubMed Central

    2013-01-01

    Background In recent years response rates on telephone surveys have been declining. Rates for the behavioral risk factor surveillance system (BRFSS) have also declined, prompting the use of new methods of weighting and the inclusion of cell phone sampling frames. A number of scholars and researchers have conducted studies of the reliability and validity of the BRFSS estimates in the context of these changes. As the BRFSS makes changes in its methods of sampling and weighting, a review of reliability and validity studies of the BRFSS is needed. Methods In order to assess the reliability and validity of prevalence estimates taken from the BRFSS, scholarship published from 2004–2011 dealing with tests of reliability and validity of BRFSS measures was compiled and presented by topics of health risk behavior. Assessments of the quality of each publication were undertaken using a categorical rubric. Higher rankings were achieved by authors who conducted reliability tests using repeated test/retest measures, or who conducted tests using multiple samples. A similar rubric was used to rank validity assessments. Validity tests which compared the BRFSS to physical measures were ranked higher than those comparing the BRFSS to other self-reported data. Literature which undertook more sophisticated statistical comparisons was also ranked higher. Results Overall findings indicated that BRFSS prevalence rates were comparable to other national surveys which rely on self-reports, although specific differences are noted for some categories of response. BRFSS prevalence rates were less similar to surveys which utilize physical measures in addition to self-reported data. There is very little research on reliability and validity for some health topics, but a great deal of information supporting the validity of the BRFSS data for others. Conclusions Limitations of the examination of the BRFSS were due to question differences among surveys used as comparisons, as well as mode of data

  14. [Computerized system validation of clinical researches].

    PubMed

    Yan, Charles; Chen, Feng; Xia, Jia-lai; Zheng, Qing-shan; Liu, Daniel

    2015-11-01

    Validation is a documented process that provides a high degree of assurance that the computer system does exactly and consistently what it is designed to do, in a controlled manner, throughout its life. The validation process begins with the system proposal/requirements definition, and continues through application and maintenance until system retirement and retention of the e-records based on regulatory rules. The objective in doing so is to clearly specify that each application of information technology fulfills its purpose. Computer system validation (CSV) is essential in clinical studies according to the GCP standard, meeting the product's predetermined attributes of specification, quality, safety and traceability. This paper describes how to perform the validation process and determine relevant stakeholders within an organization in the light of validation SOPs. Although specific accountability in the implementation of the validation process might be outsourced, the ultimate responsibility for the CSV remains with the business process owner, the sponsor. In order to show that compliance of the system validation has been properly attained, it is essential to set up comprehensive validation procedures and maintain adequate documentation as well as training records. Quality of the system validation should be controlled using both QC and QA means.

  15. Reliability and Validity of Ambulatory Cognitive Assessments

    PubMed Central

    Sliwinski, Martin J.; Mogle, Jacqueline A.; Hyun, Jinshil; Munoz, Elizabeth; Smyth, Joshua M.; Lipton, Richard B.

    2017-01-01

    Mobile technologies are increasingly used to measure cognitive function outside of traditional clinic and laboratory settings. Although ambulatory assessments of cognitive function conducted in people’s natural environments offer potential advantages over traditional assessment approaches, the psychometrics of cognitive assessment procedures have been understudied. We evaluated the reliability and construct validity of ambulatory assessments of working memory and perceptual speed administered via smartphones as part of an ecological momentary assessment (EMA) protocol in a diverse adult sample (N=219). Results indicated excellent between-person reliability (≥.97) for average scores, and evidence of reliable within-person variability across measurement occasions (.41–.53). The ambulatory tasks also exhibited construct validity, as evidenced by their loadings on working memory and perceptual speed factors defined by the in-lab assessments. Our findings demonstrate that averaging across brief cognitive assessments made in uncontrolled naturalistic settings provides measurements that are comparable in reliability to assessments made in controlled laboratory environments. PMID:27084835

  16. Validation of risk assessment scoring systems for an audit of elective surgery for gastrointestinal cancer in elderly patients: an audit.

    PubMed

    Wakabayashi, Hisao; Sano, Takanori; Yachida, Shinichi; Okano, Keiichi; Izuishi, Kunihiko; Suzuki, Yasuyuki

    2007-10-01

    The goal of this study was to validate the usefulness of risk assessment scoring systems for a surgical audit in elective digestive surgery for elderly patients. The validated scoring systems used were the Physiological and Operative Severity Score for enUmeration of Mortality and morbidity (POSSUM) and the Portsmouth predictor equation for mortality (P-POSSUM). This study involved 153 consecutive patients aged 75 years and older who underwent elective gastric or colorectal surgery between July 2004 and June 2006. A retrospective analysis was performed on data collected prior to each surgery. The predicted mortality and morbidity risks were calculated using each of the scoring systems and were used to obtain the observed/predicted (O/E) mortality and morbidity ratios. New logistic regression equations for morbidity and mortality were then calculated using the scores from the POSSUM system and applied retrospectively. The O/E ratio for morbidity obtained from POSSUM score was 0.23. The O/E ratios for mortality from the POSSUM score and the P-POSSUM were 0.15 and 0.38, respectively. Utilizing the new equations using scores from the POSSUM, the O/E ratio increased to 0.88. Both the POSSUM and P-POSSUM over-predicted the morbidity and mortality in elective gastrointestinal surgery for malignant tumors in elderly patients. However, if a surgical unit makes appropriate calculations using its own patient series and updates these equations, the POSSUM system can be useful in the risk assessment for surgery in elderly patients.
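
    The observed/predicted ratio used in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the ten-patient data are hypothetical, and the per-patient risks are taken as given rather than derived from the POSSUM equations.

```python
def observed_expected_ratio(outcomes, predicted_risks):
    """Ratio of observed events to the number of events a model expected.

    outcomes: 1 if the event (e.g. death or complication) occurred, else 0
    predicted_risks: model-predicted probability of the event per patient
    """
    observed = sum(outcomes)
    expected = sum(predicted_risks)  # expected count = sum of individual risks
    return observed / expected


# Hypothetical cohort of ten patients (illustration only).
outcomes = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [0.10, 0.20, 0.60, 0.30, 0.10, 0.20, 0.40, 0.30, 0.20, 0.10]

ratio = observed_expected_ratio(outcomes, predicted)
print(f"O/E ratio = {ratio:.2f}")
```

    A ratio below 1 means fewer events occurred than the model expected, which is the over-prediction the abstract reports; a ratio near 1 indicates good calibration for the cohort.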

  17. Reliable and valid assessment of Lichtenstein hernia repair skills.

    PubMed

    Carlsen, C G; Lindorff-Larsen, K; Funch-Jensen, P; Lund, L; Charles, P; Konge, L

    2014-08-01

    Lichtenstein hernia repair is a common surgical procedure and one of the first procedures performed by a surgical trainee. However, formal assessment tools developed for this procedure are few and sparsely validated. The aim of this study was to determine the reliability and validity of an assessment tool designed to measure surgical skills in Lichtenstein hernia repair. Key issues were identified through a focus group interview. On this basis, an assessment tool with eight items was designed. Ten surgeons and surgical trainees were video recorded while performing Lichtenstein hernia repair, (four experts, three intermediates, and three novices). The videos were blindly and individually assessed by three raters (surgical consultants) using the assessment tool. Based on these assessments, validity and reliability were explored. The internal consistency of the items was high (Cronbach's alpha = 0.97). The inter-rater reliability was very good with an intra-class correlation coefficient (ICC) = 0.93. Generalizability analysis showed a coefficient above 0.8 even with one rater. The coefficient improved to 0.92 if three raters were used. One-way analysis of variance found a significant difference between the three groups which indicates construct validity, p < 0.001. Lichtenstein hernia repair skills can be assessed blindly by a single rater in a reliable and valid fashion with the new procedure-specific assessment tool. We recommend this tool for future assessment of trainees performing Lichtenstein hernia repair to ensure that the objectives of competency-based surgical training are met.
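
    For readers unfamiliar with the internal-consistency statistic reported above, here is a minimal sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the score matrix are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one row per assessed performance,
            one column per item of the assessment tool.
    """
    k = len(scores[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*scores)]  # variance of each item
    total_var = variance([sum(row) for row in scores])   # variance of row totals
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: five performances scored on three items.
ratings = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]

alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

    Alpha approaches 1 when the items rise and fall together across performances, which is what a value such as 0.97 in the abstract indicates.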

  18. Assessment of abdominal muscle function using the Biodex System-4. Validity and reliability in healthy volunteers and patients with giant ventral hernia.

    PubMed

    Gunnarsson, U; Johansson, M; Strigård, K

    2011-08-01

    The decrease in recurrence rates in ventral hernia surgery has led to a redirection of focus towards other important patient-related endpoints. One such endpoint is abdominal wall function. The aim of the present study was to evaluate the reliability and external validity of abdominal wall strength measurement using the Biodex System-4 with a back abdomen unit. Ten healthy volunteers and ten patients with ventral hernias exceeding 10 cm were recruited. Test-retest reliability, both with and without girdle, was evaluated by comparison of measurements at two test occasions 1 week apart. Reliability was calculated by the intraclass correlation coefficient (ICC) method. Validity was evaluated by correlation with the well-established International Physical Activity Questionnaire (IPAQ) and a self-assessment of abdominal wall strength. One person in the healthy group was excluded after the first test due to neck problems following minor trauma. The reliability was excellent (>0.75), with ICC values between 0.92 and 0.97 for the different modalities tested. No differences were seen between testing with and without a girdle. Validity was also excellent both when calculated as correlation to self-assessment of abdominal wall strength, and to IPAQ, giving Kendall tau values of 0.51 and 0.47, respectively, and corresponding P values of 0.002 and 0.004. Measurement of abdominal muscle function using the Biodex System-4 is a reliable and valid method to assess this important patient-related endpoint. Further investigations will be made to explore the potential of this technique in the evaluation of the results of ventral hernia surgery, and to compare muscle function after different abdominal wall reconstruction techniques.

  19. The Validation of a Case-Based, Cumulative Assessment and Progressions Examination

    PubMed Central

    Coker, Adeola O.; Copeland, Jeffrey T.; Gottlieb, Helmut B.; Horlen, Cheryl; Smith, Helen E.; Urteaga, Elizabeth M.; Ramsinghani, Sushma; Zertuche, Alejandra; Maize, David

    2016-01-01

    Objective. To assess content and criterion validity, as well as reliability of an internally developed, case-based, cumulative, high-stakes third-year Annual Student Assessment and Progression Examination (P3 ASAP Exam). Methods. Content validity was assessed through the writing-reviewing process. Criterion validity was assessed by comparing student scores on the P3 ASAP Exam with the nationally validated Pharmacy Curriculum Outcomes Assessment (PCOA). Reliability was assessed with psychometric analysis comparing student performance over four years. Results. The P3 ASAP Exam showed content validity through representation of didactic courses and professional outcomes. Similar scores on the P3 ASAP Exam and PCOA with Pearson correlation coefficient established criterion validity. Consistent student performance using Kuder-Richardson coefficient (KR-20) since 2012 reflected reliability of the examination. Conclusion. Pharmacy schools can implement internally developed, high-stakes, cumulative progression examinations that are valid and reliable using a robust writing-reviewing process and psychometric analyses. PMID:26941435

  20. Development and initial validity of the in-hand manipulation assessment.

    PubMed

    Klymenko, Gabrielle; Liu, Karen P Y; Bissett, Michelle; Fong, Kenneth N K; Welage, Nandana; Wong, Rebecca S M

    2018-04-01

    A review of the literature related to in-hand manipulation (IHM) revealed that there is no assessment which specifically measures this construct in the adult population. This study reports the face and content validity of an IHM assessment for adults with impaired hand function based on expert opinion. The definition of IHM skills, assessment tasks and scoring methods identified from literature was discussed in a focus group (n = 4) to establish face validity. An expert panel (n = 16) reviewed the content validity of the proposed assessment; evaluating the representativeness and relevance of encompassing the IHM skills in the proposed assessment tasks, the clarity and importance to daily life of the task and the clarity and applicability to clinical environment of the scoring method. The content validity was calculated using the content validity index for both the individual task and all tasks together (I-CVI and S-CVI). Feedback was incorporated to create the assessment. The focus group members agreed to include 10 assessment tasks that covered all IHM skills. In the expert panel review, all tasks received an I-CVI above 0.78 and S-CVI above 0.80 in representativeness and relevance ratings, representing good content validity. With the comments from the expert panel, tasks were modified to improve the clarity and importance to daily life. A four-point Likert scale was identified for assessing both the completion of the assessment tasks and the quality of IHM skills within the task performance. Face and content validity were established in this new IHM assessment. Further studies to examine psychometric properties and use within clinical practice are recommended. © 2018 Occupational Therapy Australia.
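
    The content validity indices named above can be sketched as follows. This is a minimal illustration, assuming the common convention that each expert rates each task on a four-point scale and that ratings of 3 or 4 count as agreement, with the scale-level index taken as the average of the item-level indices. The abstract does not state which variant was used, and the function names and ratings are hypothetical.

```python
def item_cvi(ratings, threshold=3):
    """I-CVI: share of experts rating the item at or above the threshold."""
    return sum(r >= threshold for r in ratings) / len(ratings)


def scale_cvi_average(ratings_by_item):
    """S-CVI (averaging variant): mean of the item-level indices."""
    i_cvis = [item_cvi(r) for r in ratings_by_item]
    return sum(i_cvis) / len(i_cvis)


# Hypothetical relevance ratings from eight experts for three tasks.
ratings_by_task = [
    [4, 4, 3, 4, 3, 4, 4, 3],  # every expert agrees
    [4, 3, 2, 4, 4, 3, 4, 4],  # one expert rates below 3
    [3, 4, 4, 2, 4, 3, 2, 4],  # two experts rate below 3
]

i_cvis = [item_cvi(r) for r in ratings_by_task]
s_cvi = scale_cvi_average(ratings_by_task)
print(i_cvis, s_cvi)
```

    Under this convention, a task is usually retained when its I-CVI reaches a preset cut-off such as the 0.78 quoted in the abstract.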

  1. A Metric-Based Validation Process to Assess the Realism of Synthetic Power Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Birchfield, Adam; Schweitzer, Eran; Athari, Mir

    Public power system test cases that are of high quality benefit the power systems research community with expanded resources for testing, demonstrating, and cross-validating new innovations. Building synthetic grid models for this purpose is a relatively new problem, for which a challenge is to show that created cases are sufficiently realistic. This paper puts forth a validation process based on a set of metrics observed from actual power system cases. These metrics follow the structure, proportions, and parameters of key power system elements, which can be used in assessing and validating the quality of synthetic power grids. Though wide diversity exists in the characteristics of power systems, the paper focuses on an initial set of common quantitative metrics to capture the distribution of typical values from real power systems. The process is applied to two new public test cases, which are shown to meet the criteria specified in the metrics of this paper.

  2. A Metric-Based Validation Process to Assess the Realism of Synthetic Power Grids

    DOE PAGES

    Birchfield, Adam; Schweitzer, Eran; Athari, Mir; ...

    2017-08-19

    Public power system test cases that are of high quality benefit the power systems research community with expanded resources for testing, demonstrating, and cross-validating new innovations. Building synthetic grid models for this purpose is a relatively new problem, for which a challenge is to show that created cases are sufficiently realistic. This paper puts forth a validation process based on a set of metrics observed from actual power system cases. These metrics follow the structure, proportions, and parameters of key power system elements, which can be used in assessing and validating the quality of synthetic power grids. Though wide diversity exists in the characteristics of power systems, the paper focuses on an initial set of common quantitative metrics to capture the distribution of typical values from real power systems. The process is applied to two new public test cases, which are shown to meet the criteria specified in the metrics of this paper.

  3. Validity threats: overcoming interference with proposed interpretations of assessment data.

    PubMed

    Downing, Steven M; Haladyna, Thomas M

    2004-03-01

    Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity. The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided. The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged. There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.

  4. System verification and validation: a fundamental systems engineering task

    NASA Astrophysics Data System (ADS)

    Ansorge, Wolfgang R.

    2004-09-01

    Systems Engineering (SE) is the discipline in a project management team that transfers the user's operational needs and justifications for an Extremely Large Telescope (ELT), or any other telescope, into a set of validated required system performance characteristics. It subsequently transfers these validated required system performance characteristics into a validated system configuration, and eventually into the assembled, integrated telescope system with verified performance characteristics, and provides it with "objective evidence that the particular requirements for the specified intended use are fulfilled". The latter is the ISO Standard 8402 definition for "Validation". This presentation describes the verification and validation processes of an ELT Project and outlines the key role System Engineering plays in these processes throughout all project phases. If these processes are implemented correctly into the project execution and are started at the proper time, namely at the very beginning of the project, and if all capabilities of experienced system engineers are used, the project costs and the life-cycle costs of the telescope system can be reduced by between 25 and 50%. The intention of this article is to motivate and encourage project managers of astronomical telescopes and scientific instruments to involve the entire spectrum of Systems Engineering capabilities, performed by trained and experienced SYSTEM engineers, for the benefit of the project, by explaining to them the importance of Systems Engineering in the AIV and validation processes.

  5. THE VALIDITY OF HUMAN AND COMPUTERIZED WRITING ASSESSMENT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ronald L. Boring

    2005-09-01

    This paper summarizes an experiment designed to assess the validity of essay grading between holistic and analytic human graders and a computerized grader based on latent semantic analysis. The validity of the grade was gauged by the extent to which the student’s knowledge of the topic correlated with the grader’s expert knowledge. To assess knowledge, Pathfinder networks were generated by the student essay writers, the holistic and analytic graders, and the computerized grader. It was found that the computer generated grades more closely matched the definition of valid grading than did human generated grades.

  6. Knowledge-based system verification and validation

    NASA Technical Reports Server (NTRS)

    Johnson, Sally C.

    1990-01-01

    The objective of this task is to develop and evaluate a methodology for verification and validation (V&V) of knowledge-based systems (KBS) for space station applications with high reliability requirements. The approach consists of three interrelated tasks. The first task is to evaluate the effectiveness of various validation methods for space station applications. The second task is to recommend requirements for KBS V&V for Space Station Freedom (SSF). The third task is to recommend modifications to the SSF to support the development of KBS using effective software engineering and validation techniques. To accomplish the first task, three complementary techniques will be evaluated: (1) Sensitivity Analysis (Worcester Polytechnic Institute); (2) Formal Verification of Safety Properties (SRI International); and (3) Consistency and Completeness Checking (Lockheed AI Center). During FY89 and FY90, each contractor will independently demonstrate the use of its technique on the fault detection, isolation, and reconfiguration (FDIR) KBS for the manned maneuvering unit (MMU), a rule-based system implemented in LISP. During FY91, the application of each of the techniques to other knowledge representations and KBS architectures will be addressed. After evaluation of the results of the first task and examination of Space Station Freedom V&V requirements for conventional software, a comprehensive KBS V&V methodology will be developed and documented. Development of highly reliable KBSs cannot be accomplished without effective software engineering methods. Using the results of current in-house research to develop and assess software engineering methods for KBSs, as well as assessment of techniques being developed elsewhere, an effective software engineering methodology for space station KBSs will be developed, and modification of the SSF to support these tools and methods will be addressed.

  7. Validation of selected analytical methods using accuracy profiles to assess the impact of a Tobacco Heating System on indoor air quality.

    PubMed

    Mottier, Nicolas; Tharin, Manuel; Cluse, Camille; Crudo, Jean-René; Lueso, María Gómez; Goujon-Ginglinger, Catherine G; Jaquier, Anne; Mitova, Maya I; Rouget, Emmanuel G R; Schaller, Mathieu; Solioz, Jennifer

    2016-09-01

    Studies in environmentally controlled rooms have been used over the years to assess the impact of environmental tobacco smoke on indoor air quality. As new tobacco products are developed, it is important to determine their impact on air quality when used indoors. Before such an assessment can take place it is essential that the analytical methods used to assess indoor air quality are validated and shown to be fit for their intended purpose. Consequently, for this assessment, an environmentally controlled room was built and seven analytical methods, representing eighteen analytes, were validated. The validations were carried out with smoking machines using a matrix-based approach applying the accuracy profile procedure. The performances of the methods were compared for all three matrices under investigation: background air samples, the environmental aerosol of Tobacco Heating System THS 2.2, a heat-not-burn tobacco product developed by Philip Morris International, and the environmental tobacco smoke of a cigarette. The environmental aerosol generated by the THS 2.2 device did not have any appreciable impact on the performances of the methods. The comparison between the background and THS 2.2 environmental aerosol samples generated by smoking machines showed that only five compounds were higher when THS 2.2 was used in the environmentally controlled room. Regarding environmental tobacco smoke from cigarettes, the yields of all analytes were clearly above those obtained with the other two air sample types. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  8. Validation of a Self-Administered Computerized System to Detect Cognitive Impairment in Older Adults

    PubMed Central

    Brinkman, Samuel D.; Reese, Robert J.; Norsworthy, Larry A.; Dellaria, Donna K.; Kinkade, Jacob W.; Benge, Jared; Brown, Kimberly; Ratka, Anna; Simpkins, James W.

    2015-01-01

    There is increasing interest in the development of economical and accurate approaches to identifying persons in the community who have mild, undetected cognitive impairments. Computerized assessment systems have been suggested as a viable approach to identifying these persons. The validity of a computerized assessment system for identification of memory and executive deficits in older individuals was evaluated in the current study. Volunteers (N = 235) completed a 3-hr battery of neuropsychological tests and a computerized cognitive assessment system. Participants were classified as impaired (n = 78) or unimpaired (n = 157) on the basis of the Mini Mental State Exam, Wechsler Memory Scale-III and the Trail Making Test (TMT), Part B. All six variables (three memory variables and three executive variables) derived from the computerized assessment differed significantly between groups in the expected direction. There was also evidence of temporal stability and concurrent validity. Application of computerized assessment systems for clinical practice and for identification of research participants is discussed in this article. PMID:25332303

  9. Issues in developing valid assessments of speech pathology students' performance in the workplace.

    PubMed

    McAllister, Sue; Lincoln, Michelle; Ferguson, Alison; McAllister, Lindy

    2010-01-01

    Workplace-based learning is a critical component of professional preparation in speech pathology. A validated assessment of this learning is seen to be 'the gold standard', but it is difficult to develop because of design and validation issues. These issues include the role and nature of judgement in assessment, challenges in measuring quality, and the relationship between assessment and learning. Valid assessment of workplace-based performance needs to capture the development of competence over time and account for both occupation-specific and generic competencies. This paper reviews important conceptual issues in the design of valid and reliable workplace-based assessments of competence including assessment content, process, impact on learning, measurement issues, and validation strategies. It then goes on to share what has been learned about quality assessment and validation of a workplace-based performance assessment using competency-based ratings. The outcomes of a four-year national development and validation of an assessment tool are described. A literature review of issues in conceptualizing, designing, and validating workplace-based assessments was conducted. Key factors to consider in the design of a new tool were identified and built into the cycle of design, trialling, and data analysis in the validation stages of the development process. This paper provides an accessible overview of factors to consider in the design and validation of workplace-based assessment tools. It presents strategies used in the development and national validation of a tool, COMPASS, used in every speech pathology programme in Australia, New Zealand, and Singapore. The paper also describes Rasch analysis, a model-based statistical approach which is useful for establishing validity and reliability of assessment tools. Through careful attention to conceptual and design issues in the development and trialling of workplace-based assessments, it has been possible to develop the

  10. Measuring Quality in Rural Kindergarten Classrooms: Reliability and Validity Evidence for the Classroom Assessment Scoring System, Kindergarten-Third Grade (CLASS K-3)

    ERIC Educational Resources Information Center

    Sandilos, Lia E.

    2012-01-01

    The purpose of the current study was to evaluate the structural validity and stability of scores on a measure of global classroom quality, the Classroom Assessment Scoring System, Kindergarten-Third Grade (CLASS K-3; Pianta, La Paro, & Hamre, 2008). Using data from a sample of 417 kindergarten classrooms in the rural Southern and Mid-Atlantic…

  11. Validation of the American Board of Orthodontics Objective Grading System for assessing the treatment outcomes of Chinese patients.

    PubMed

    Song, Guang-Ying; Baumrind, Sheldon; Zhao, Zhi-He; Ding, Yin; Bai, Yu-Xing; Wang, Lin; He, Hong; Shen, Gang; Li, Wei-Ran; Wu, Wei-Zi; Ren, Chong; Weng, Xuan-Rong; Geng, Zhi; Xu, Tian-Min

    2013-09-01

    Orthodontics in China has developed rapidly, but there is no standard index of treatment outcomes. We assessed the validity of the American Board of Orthodontics Objective Grading System (ABO-OGS) for the classification of treatment outcomes in Chinese patients. We randomly selected 108 patients who completed treatment between July 2005 and September 2008 in 6 orthodontic treatment centers across China. Sixty-nine experienced Chinese orthodontists made subjective assessments of the end-of-treatment casts for each patient. Three examiners then used the ABO-OGS to measure the casts. Pearson correlation analysis and receiver operating characteristic curve analysis were conducted to evaluate the correspondence between the ABO-OGS cast measurements and the orthodontists' subjective assessments. The average subjective grading scores were highly correlated with the ABO-OGS scores (r = 0.7042). Four of the 7 study cast components of the ABO-OGS score (occlusal relationship, overjet, interproximal contact, and alignment) were statistically significantly correlated with the judges' subjective assessments. Together, these 4 accounted for 58% of the variability in the average subjective grading scores. The ABO-OGS cutoff score for cases that the judges deemed satisfactory was 16 points; the corresponding cutoff score for cases that the judges considered acceptable was 21 points. The ABO-OGS is a valid index for the assessment of treatment outcomes in Chinese patients. By comparing the objective scores on this modification of the ABO-OGS with the mean subjective assessment of a panel of highly qualified Chinese orthodontists, a cutoff point for satisfactory treatment outcome was defined as 16 points or fewer, with scores of 16 to 21 points denoting less than satisfactory but still acceptable treatment. Cases that scored greater than 21 points were considered unacceptable. Copyright © 2013 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.

  12. Seeking Empirical Validity in an Assurance of Learning System

    ERIC Educational Resources Information Center

    Avery, Sherry L.; McWhorter, Rochell R.; Lirely, Roger; Doty, H. Harold

    2014-01-01

    Business schools have established measurement tools to support their assurance of learning (AoL) systems and to assess student achievement of learning objectives. However, business schools have not required their tools to be empirically validated, thus ensuring that they measure what they are intended to measure. The authors propose confirmatory…

  13. Validity assessment and the neurological physical examination.

    PubMed

    Zasler, Nathan D

    2015-01-01

    The assessment of any patient or examinee with neurological impairment, whether acquired or congenital, provides a key set of data points in the context of developing accurate diagnostic impressions and implementing an appropriate neurorehabilitation program. As part of that assessment, the neurological physical exam is an extremely important component of the overall neurological assessment. In this context, clinicians are often confounded by unusual, atypical or unexplainable physical exam findings that bring into question the organicity, veracity, and/or underlying cause of the observed clinical presentation. The purpose of this review is to provide readers with general directions and specific caveats regarding validity assessment in the context of the neurological physical exam. It is of utmost importance for health care practitioners to be aware of assessment methodologies that may assist in determining the validity of the neurological physical exam and differentiating organic from non-organic/functional impairments. Perhaps more importantly, the limitations of many commonly used strategies for assessment of non-organicity should be recognized and considered prior to labeling observed physical findings on neurological exam as non-organic or functional.

  14. Assessing validity of observational intervention studies – the Benchmarking Controlled Trials

    PubMed Central

    Malmivaara, Antti

    2016-01-01

    Background: Benchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations. Aims: To create and pilot test a checklist for appraising methodological validity of a BCT. Methods: The checklist was created by extracting the most essential elements from the comprehensive set of criteria in the previous paper on BCTs. Also checklists and scientific papers on observational studies and respective systematic reviews were utilized. Ten BCTs published in the Lancet and in the New England Journal of Medicine were used to assess feasibility of the created checklist. Results: The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies. Conclusions: The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies. However, the piloted checklist should be validated in further studies. Key messages: Benchmarking Controlled Trial (BCT) is a concept which covers all observational studies aiming to assess impact of interventions or health care system features to patients and populations. This paper presents a checklist for appraising methodological validity of BCTs and pilot-tests the checklist with ten BCTs published in leading medical journals. The appraised studies seem to have several methodological limitations, some of which could be avoided in planning, conducting and reporting phases of the studies. The checklist can be used for planning, conducting, reporting, reviewing, and critical reading of observational intervention studies. PMID:27238631

  15. Mental State Assessment and Validation Using Personalized Physiological Biometrics

    PubMed Central

    Patel, Aashish N.; Howard, Michael D.; Roach, Shane M.; Jones, Aaron P.; Bryant, Natalie B.; Robinson, Charles S. H.; Clark, Vincent P.; Pilly, Praveen K.

    2018-01-01

    Mental state monitoring is a critical component of current and future human-machine interfaces, including semi-autonomous driving and flying, air traffic control, decision aids, training systems, and will soon be integrated into ubiquitous products like cell phones and laptops. Current mental state assessment approaches supply quantitative measures, but their only frame of reference is generic population-level ranges. What is needed are physiological biometrics that are validated in the context of task performance of individuals. Using curated intake experiments, we are able to generate personalized models of three key biometrics as useful indicators of mental state; namely, mental fatigue, stress, and attention. We demonstrate improvements to existing approaches through the introduction of new features. Furthermore, addressing the current limitations in assessing the efficacy of biometrics for individual subjects, we propose and employ a multi-level validation scheme for the biometric models by means of k-fold cross-validation for discrete classification and regression testing for continuous prediction. The paper not only provides a unified pipeline for extracting a comprehensive mental state evaluation from a parsimonious set of sensors (only EEG and ECG), but also demonstrates the use of validation techniques in the absence of empirical data. Furthermore, as an example of the application of these models to novel situations, we evaluate the significance of correlations of personalized biometrics to the dynamic fluctuations of accuracy and reaction time on an unrelated threat detection task using a permutation test. Our results provide a path toward integrating biometrics into augmented human-machine interfaces in a judicious way that can help to maximize task performance.

  16. Mental State Assessment and Validation Using Personalized Physiological Biometrics.

    PubMed

    Patel, Aashish N; Howard, Michael D; Roach, Shane M; Jones, Aaron P; Bryant, Natalie B; Robinson, Charles S H; Clark, Vincent P; Pilly, Praveen K

    2018-01-01

    Mental state monitoring is a critical component of current and future human-machine interfaces, including semi-autonomous driving and flying, air traffic control, decision aids, training systems, and will soon be integrated into ubiquitous products like cell phones and laptops. Current mental state assessment approaches supply quantitative measures, but their only frame of reference is generic population-level ranges. What is needed are physiological biometrics that are validated in the context of task performance of individuals. Using curated intake experiments, we are able to generate personalized models of three key biometrics as useful indicators of mental state; namely, mental fatigue, stress, and attention. We demonstrate improvements to existing approaches through the introduction of new features. Furthermore, addressing the current limitations in assessing the efficacy of biometrics for individual subjects, we propose and employ a multi-level validation scheme for the biometric models by means of k-fold cross-validation for discrete classification and regression testing for continuous prediction. The paper not only provides a unified pipeline for extracting a comprehensive mental state evaluation from a parsimonious set of sensors (only EEG and ECG), but also demonstrates the use of validation techniques in the absence of empirical data. Furthermore, as an example of the application of these models to novel situations, we evaluate the significance of correlations of personalized biometrics to the dynamic fluctuations of accuracy and reaction time on an unrelated threat detection task using a permutation test. Our results provide a path toward integrating biometrics into augmented human-machine interfaces in a judicious way that can help to maximize task performance.
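
    The abstract above mentions using a permutation test to judge whether a correlation between a biometric and task performance is significant. As a rough illustration of that general technique (not the authors' pipeline), the following Python sketch shuffles one variable many times and counts how often the shuffled correlation is at least as strong as the observed one. The function names, the number of permutations, and the sample data are all assumptions made for this example.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y.

    Shuffling y breaks any real association, so the shuffled correlations
    approximate the null distribution. The +1 terms keep the estimate
    from ever being exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            extreme += 1
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical data: a fatigue score and reaction times (seconds).
    fatigue = [0.1, 0.3, 0.2, 0.6, 0.5, 0.8, 0.7, 0.9]
    reaction_time = [0.41, 0.45, 0.40, 0.52, 0.55, 0.60, 0.58, 0.66]
    print("r =", pearson_r(fatigue, reaction_time))
    print("p =", permutation_p_value(fatigue, reaction_time))
```

    A fixed seed is used only so that the sketch gives repeatable results; in practice the number of permutations would be chosen according to the precision needed for the p-value.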

  17. Bioculture System Validation

    NASA Technical Reports Server (NTRS)

    Sato, Kevin Y.

    2012-01-01

    The first flight of the Bioculture System will validate the performance of the hardware and its automated and manual operational capabilities in the space flight environment of the International Space Station. Biology, Engineering, and Operations tests will be conducted in the Bioculture System to fully characterize its automated and manual functions to support cell culturing for short and long durations. No hypothesis-driven research will be conducted with biological samples, and the science leads have all provided their concurrence that none of the data they collect will be considered proprietary and that it can be freely distributed to the science community. The outcome of the validation flight will be to commission the hardware for use by the science community. This presentation will provide non-proprietary details about the Bioculture System and information about the activities for the first flight.

  18. Validity of three clinical performance assessments of internal medicine clerks.

    PubMed

    Hull, A L; Hodder, S; Berger, B; Ginsberg, D; Lindheim, N; Quan, J; Kleinhenz, M E

    1995-06-01

    To analyze the construct validity of three methods to assess the clinical performances of internal medicine clerks. A multitrait-multimethod (MTMM) study was conducted at the Case Western Reserve University School of Medicine to determine the convergent and divergent validity of a clinical evaluation form (CEF) completed by faculty and residents, an objective structured clinical examination (OSCE), and the medicine subject test of the National Board of Medical Examiners. Three traits were involved in the analysis: clinical skills, knowledge, and personal characteristics. A correlation matrix was computed for 410 third-year students who completed the clerkship between August 1988 and July 1991. There was a significant (p < .01) convergence of the four correlations that assessed the same traits by using different methods. However, the four convergent correlations were of moderate magnitude (ranging from .29 to .47). Divergent validity was assessed by comparing the magnitudes of the convergence correlations with the magnitudes of correlations among unrelated assessments (i.e., different traits by different methods). Seven of nine possible coefficients were smaller than the convergent coefficients, suggesting evidence of divergent validity. A significant CEF method effect was identified. There was convergent validity and some evidence of divergent validity with a significant method effect. The findings were similar for correlations corrected for attenuation. Four conclusions were reached: (1) the reliability of the OSCE must be improved, (2) the CEF ratings must be redesigned to further discriminate among the specific traits assessed, (3) additional methods to assess personal characteristics must be instituted, and (4) several assessment methods should be used to evaluate individual student performances.

  19. Development of an instrument to measure medical students' perceptions of the assessment environment: initial validation.

    PubMed

    Sim, Joong Hiong; Tong, Wen Ting; Hong, Wei-Han; Vadivelu, Jamuna; Hassan, Hamimah

    2015-01-01

    Assessment environment, synonymous with climate or atmosphere, is multifaceted. Although there are valid and reliable instruments for measuring the educational environment, there is no validated instrument for measuring the assessment environment in medical programs. This study aimed to develop an instrument for measuring students' perceptions of the assessment environment in an undergraduate medical program and to examine the psychometric properties of the new instrument. The Assessment Environment Questionnaire (AEQ), a 40-item, four-point (1=Strongly Disagree to 4=Strongly Agree) Likert scale instrument designed by the authors, was administered to medical undergraduates from the authors' institution. The response rate was 626/794 (78.84%). To establish construct validity, exploratory factor analysis (EFA) with principal component analysis and varimax rotation was conducted. To examine the internal consistency reliability of the instrument, Cronbach's α was computed. Mean scores for the entire AEQ and for each factor/subscale were calculated. Mean AEQ scores of students from different academic years and sex were examined. Six hundred and eleven completed questionnaires were analysed. EFA extracted four factors: feedback mechanism (seven items), learning and performance (five items), information on assessment (five items), and assessment system/procedure (three items), which together explained 56.72% of the variance. Based on the four extracted factors/subscales, the AEQ was reduced to 20 items. Cronbach's α for the 20-item AEQ was 0.89, whereas Cronbach's α for the four factors/subscales ranged from 0.71 to 0.87. Mean score for the AEQ was 2.68/4.00. The factor/subscale of 'feedback mechanism' recorded the lowest mean (2.39/4.00), whereas the factor/subscale of 'assessment system/procedure' scored the highest mean (2.92/4.00). Significant differences were found among the AEQ scores of students from different academic years. The AEQ is a valid and reliable

  20. Performance Evaluation of a Data Validation System

    NASA Technical Reports Server (NTRS)

    Wong, Edmond (Technical Monitor); Sowers, T. Shane; Santi, L. Michael; Bickford, Randall L.

    2005-01-01

    Online data validation is a performance-enhancing component of modern control and health management systems. It is essential that performance of the data validation system be verified prior to its use in a control and health management system. A new Data Qualification and Validation (DQV) Test-bed application was developed to provide a systematic test environment for this performance verification. The DQV Test-bed was used to evaluate a model-based data validation package known as the Data Quality Validation Studio (DQVS). DQVS was employed as the primary data validation component of a rocket engine health management (EHM) system developed under NASA's NGLT (Next Generation Launch Technology) program. In this paper, the DQVS and DQV Test-bed software applications are described, and the DQV Test-bed verification procedure for this EHM system application is presented. Test-bed results are summarized and implications for EHM system performance improvements are discussed.

  1. The German Version of the Manchester Triage System and Its Quality Criteria – First Assessment of Validity and Reliability

    PubMed Central

    Gräff, Ingo; Goldschmidt, Bernd; Glien, Procula; Bogdanow, Manuela; Fimmers, Rolf; Hoeft, Andreas; Kim, Se-Chan; Grigutsch, Daniel

    2014-01-01

    Background: The German version of the Manchester Triage System (MTS) has found widespread use in EDs across German-speaking Europe. Studies about the quality criteria validity and reliability of the MTS currently only exist for the English-language version. Most importantly, the content of the German version differs from the English version with respect to presentation diagrams and change indicators, which have a significant impact on the category assigned. This investigation offers a preliminary assessment in terms of validity and inter-rater reliability of the German MTS. Methods: Construct validity of assigned MTS level was assessed based on comparisons to hospitalization (general / intensive care), mortality, ED and hospital length of stay, level of prehospital care and number of invasive diagnostics. A sample of 45,469 patients was used. Inter-rater agreement between an expert and triage nurses (reliability) was calculated separately for a subset group of 167 emergency patients. Results: For general hospital admission the area under the curve (AUC) of the receiver operating characteristic was 0.749; for admission to ICU it was 0.871. An examination of MTS level and number of deceased patients showed that the higher the priority derived from MTS, the higher the number of deaths (p<0.0001 / χ2 test). There was a substantial difference in the 30-day survival among the 5 MTS categories (p<0.0001 / log-rank test). The AUC for predicting 30-day mortality was 0.613. Categories orange and red had the highest numbers of heart catheter and endoscopy procedures. Categories red and orange were mostly accompanied by an emergency physician, whereas categories blue and green were walk-in patients. Inter-rater agreement between the expert and triage nurses was almost perfect (κ = 0.954). Conclusion: The German version of the MTS is a reliable and valid instrument for a first assessment of emergency patients in the emergency department. PMID:24586477
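
    The inter-rater agreement above is reported as a kappa statistic (κ). The abstract does not say which variant was computed, so the following Python sketch shows only the simplest one, unweighted Cohen's kappa for two raters, as an assumption for illustration. The function name and the example ratings are hypothetical; for ordinal triage levels a weighted kappa may be more appropriate.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters over the same cases.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e is the agreement expected by chance from each
    rater's marginal category frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category for every case.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical triage levels (1 = most urgent, 5 = least urgent).
    expert = [1, 2, 3, 3, 4, 5, 2, 3]
    nurse = [1, 2, 3, 4, 4, 5, 2, 3]
    print("kappa =", cohen_kappa(expert, nurse))
```

    Unlike raw percent agreement, kappa discounts the agreement two raters would reach by chance, which is why it is commonly reported for triage reliability.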

  2. Development and evaluation of an automated fall risk assessment system.

    PubMed

    Lee, Ju Young; Jin, Yinji; Piao, Jinshi; Lee, Sun-Mi

    2016-04-01

    Fall risk assessment is the first step toward prevention, and a risk assessment tool with high validity should be used. This study aimed to develop and validate an automated fall risk assessment system (Auto-FallRAS) to assess fall risks based on electronic medical records (EMRs) without additional data collected or entered by nurses. This study was conducted in a 1335-bed university hospital in Seoul, South Korea. The Auto-FallRAS was developed using 4211 fall-related clinical data extracted from EMRs. Participants included fall patients and non-fall patients (868 and 3472 for the development study; 752 and 3008 for the validation study; and 58 and 232 for validation after clinical application, respectively). The system was evaluated for predictive validity and concurrent validity. The final 10 predictors were included in the logistic regression model for the risk-scoring algorithm. The results of the Auto-FallRAS were shown as high/moderate/low risk on the EMR screen. The predictive validity analyzed after clinical application of the Auto-FallRAS was as follows: sensitivity = 0.95, NPV = 0.97 and Youden index = 0.44. The validity of the Morse Fall Scale assessed by nurses was as follows: sensitivity = 0.68, NPV = 0.88 and Youden index = 0.28. This study found that the Auto-FallRAS results were better than were the nurses' predictions. The advantage of the Auto-FallRAS is that it automatically analyzes information and shows patients' fall risk assessment results without requiring additional time from nurses. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
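
    The abstract above reports sensitivity, negative predictive value (NPV) and the Youden index. As a reminder of how these measures follow from a 2x2 confusion table, here is a small Python sketch. The function name and the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fn, fp, tn):
    """Compute common screening measures from confusion-table counts.

    tp: patients who fell and were flagged as at risk
    fn: patients who fell but were not flagged
    fp: patients who did not fall but were flagged
    tn: patients who did not fall and were not flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    # Youden index J = sensitivity + specificity - 1 (0 = chance, 1 = perfect).
    youden = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "youden": youden,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    for name, value in screening_metrics(tp=90, fn=10, fp=30, tn=70).items():
        print(f"{name}: {value:.3f}")
```

    Because the Youden index combines sensitivity and specificity, a tool can have very high sensitivity and still a modest Youden index if its specificity is low.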

  3. Clinical audit project in undergraduate medical education curriculum: an assessment validation study

    PubMed Central

    Steketee, Carole; Mak, Donna

    2016-01-01

    Objectives: To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. Methods: A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011-2014). Results: The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students’ and examiners’ response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built into the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach’s alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation > 0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. Conclusions: This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole. PMID:27716612

  4. Clinical audit project in undergraduate medical education curriculum: an assessment validation study.

    PubMed

    Tor, Elina; Steketee, Carole; Mak, Donna

    2016-09-24

    To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011-2014). The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students' and examiners' response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built into the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach's alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation > 0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole.

  5. Validation of the organizational culture assessment instrument.

    PubMed

    Heritage, Brody; Pollock, Clare; Roberts, Lynne

    2014-01-01

    Organizational culture is a commonly studied area in industrial/organizational psychology due to its important role in workplace behaviour, cognitions, and outcomes. Jung et al.'s [1] review of the psychometric properties of organizational culture measurement instruments noted many instruments have limited validation data despite frequent use in both theoretical and applied situations. The Organizational Culture Assessment Instrument (OCAI) has had conflicting data regarding its psychometric properties, particularly regarding its factor structure. Our study examined the factor structure and criterion validity of the OCAI using robust analysis methods on data gathered from 328 (females = 226, males = 102) Australian employees. Confirmatory factor analysis supported a four factor structure of the OCAI for both ideal and current organizational culture perspectives. Current organizational culture data demonstrated expected reciprocally-opposed relationships between three of the four OCAI factors and the outcome variable of job satisfaction but ideal culture data did not, thus indicating possible weak criterion validity when the OCAI is used to assess ideal culture. Based on the mixed evidence regarding the measure's properties, further examination of the factor structure and broad validity of the measure is encouraged.

  6. Validation of the Organizational Culture Assessment Instrument

    PubMed Central

    Heritage, Brody; Pollock, Clare; Roberts, Lynne

    2014-01-01

    Organizational culture is a commonly studied area in industrial/organizational psychology due to its important role in workplace behaviour, cognitions, and outcomes. Jung et al.'s [1] review of the psychometric properties of organizational culture measurement instruments noted many instruments have limited validation data despite frequent use in both theoretical and applied situations. The Organizational Culture Assessment Instrument (OCAI) has had conflicting data regarding its psychometric properties, particularly regarding its factor structure. Our study examined the factor structure and criterion validity of the OCAI using robust analysis methods on data gathered from 328 (females = 226, males = 102) Australian employees. Confirmatory factor analysis supported a four factor structure of the OCAI for both ideal and current organizational culture perspectives. Current organizational culture data demonstrated expected reciprocally-opposed relationships between three of the four OCAI factors and the outcome variable of job satisfaction but ideal culture data did not, thus indicating possible weak criterion validity when the OCAI is used to assess ideal culture. Based on the mixed evidence regarding the measure's properties, further examination of the factor structure and broad validity of the measure is encouraged. PMID:24667839

  7. Statistical methodology: II. Reliability and validity assessment in study design, Part B.

    PubMed

    Karras, D J

    1997-02-01

    Validity measures the correspondence between a test and other purported measures of the same or similar qualities. When a reference standard exists, a criterion-based validity coefficient can be calculated. If no such standard is available, the concepts of content and construct validity may be used, but quantitative analysis may not be possible. The Pearson and Spearman tests of correlation are often used to assess the correspondence between tests, but do not account for measurement biases and may yield misleading results. Techniques that measure intertest differences may be more meaningful in validity assessment, and the kappa statistic is useful for analyzing categorical variables. Questionnaires often can be designed to allow quantitative assessment of reliability and validity, although this may be difficult. Inclusion of homogeneous questions is necessary to assess reliability. Analysis is enhanced by using Likert scales or similar techniques that yield ordinal data. Validity assessment of questionnaires requires careful definition of the scope of the test and comparison with previously validated tools.
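
    The point about correlation hiding measurement bias can be illustrated with a small sketch. The data and function names below are hypothetical and are not taken from the article: two tests that differ by a constant offset correlate perfectly even though they never agree, while the mean difference exposes the bias. Cohen's kappa is included as the chance-corrected agreement measure for categorical ratings.

```python
from collections import Counter


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_difference(x, y):
    """Average of (y - x): a simple estimate of systematic bias between tests."""
    return sum(b - a for a, b in zip(x, y)) / len(x)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical readings: test_b always reads 2 units higher than test_a.
test_a = [1, 2, 3, 4, 5]
test_b = [3, 4, 5, 6, 7]
r = pearson_r(test_a, test_b)           # perfect correlation
bias = mean_difference(test_a, test_b)  # yet a constant offset of 2

# Hypothetical categorical ratings from two raters.
kappa = cohen_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"])
```

    Here the correlation alone would suggest the two tests are interchangeable, which is the misleading result the abstract warns about.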

  8. Validation of a new classification system for skin tears.

    PubMed

    LeBlanc, Kimberly; Baranoski, Sharon; Holloway, Samantha; Langemo, Diane

    2013-06-01

    The aim of this study was to validate and establish reliability of the International Skin Tear classification system. A consensus panel of 12 internationally recognized key opinion leaders convened in 2011 to establish consensus statements on the prevention, prediction, assessment, and treatment of skin tears. Subsequently, a new skin tear classification system was proposed. The system was then tested for interrater and intrarater reliability between the experts before being tested more widely on a sample of 327 individuals from the United States, Canada, and Europe. The results of the study indicated a substantial level of agreement for the expert panel (Fleiss κ = 0.619; 2-month follow-up = 0.653). Intrarater reliability was high (Cohen κ = 0.877). Interrater reliability was moderate (Fleiss κ = 0.555) for healthcare professionals (n = 303) and fair for non-health professionals (Fleiss κ = 0.338; n = 24). This international study established the reliability and validity of a new classification system for skin tears.

  9. Validation of multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Siewiorek, D. P.; Segall, Z.; Kong, T.

    1982-01-01

    Experiments that can be used to validate fault free performance of multiprocessor systems in aerospace systems integrating flight controls and avionics are discussed. Engineering prototypes for two fault tolerant multiprocessors are tested.

  10. Functional gait assessment and balance evaluation system test: reliability, validity, sensitivity, and specificity for identifying individuals with Parkinson disease who fall.

    PubMed

    Leddy, Abigail L; Crowner, Beth E; Earhart, Gammon M

    2011-01-01

    Gait impairments, balance impairments, and falls are prevalent in individuals with Parkinson disease (PD). Although the Berg Balance Scale (BBS) can be considered the reference standard for the determination of fall risk, it has a noted ceiling effect. Development of ceiling-free measures that can assess balance and are good at discriminating "fallers" from "nonfallers" is needed. The purpose of this study was to compare the Functional Gait Assessment (FGA) and the Balance Evaluation Systems Test (BESTest) with the BBS among individuals with PD and evaluate the tests' reliability, validity, and discriminatory sensitivity and specificity for fallers versus nonfallers. This was an observational study of community-dwelling individuals with idiopathic PD. The BBS, FGA, and BESTest were administered to 80 individuals with PD. Interrater reliability (n=15) was assessed by 3 raters. Test-retest reliability was based on 2 tests of participants (n=24), 2 weeks apart. Intraclass correlation coefficients (2,1) were used to calculate reliability, and Spearman correlation coefficients were used to assess validity. Cutoff points, sensitivity, and specificity were based on receiver operating characteristic plots. Test-retest reliability was .80 for the BBS, .91 for the FGA, and .88 for the BESTest. Interrater reliability was greater than .93 for all 3 tests. The FGA and BESTest were correlated with the BBS (r=.78 and r=.87, respectively). Cutoff scores to identify fallers were 47/56 for the BBS, 15/30 for the FGA, and 69% for the BESTest. The overall accuracy (area under the curve) for the BBS, FGA, and BESTest was .79, .80, and .85, respectively. Fall reports were retrospective. Both the FGA and the BESTest have reliability and validity for assessing balance in individuals with PD. The BESTest is most sensitive for identifying fallers.
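
    The abstract derives cutoff scores, sensitivity, and specificity from receiver operating characteristic plots but does not say how each cutoff was chosen. The sketch below is one common approach, assumed here for illustration only: scan every candidate cutoff and keep the one that maximizes Youden's J (sensitivity + specificity - 1). The scores and labels are made up, and lower scores are treated as indicating higher fall risk.

```python
def sensitivity_specificity(scores, is_faller, cutoff):
    """Treat score <= cutoff as 'predicted faller' and tally the 2x2 table."""
    tp = fn = tn = fp = 0
    for score, faller in zip(scores, is_faller):
        predicted = score <= cutoff
        if faller and predicted:
            tp += 1
        elif faller:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, is_faller):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    Ties keep the lowest cutoff because candidates are scanned in
    ascending order and only a strictly larger J replaces the best.
    """
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, is_faller, cutoff)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Hypothetical balance scores (out of 56) and fall histories.
scores = [40, 45, 47, 50, 52, 54, 55, 56]
is_faller = [True, True, True, False, True, False, False, False]
cutoff, j, sens, spec = best_cutoff(scores, is_faller)
```

    Each candidate cutoff corresponds to one point on the ROC curve; the area under that curve is the overall accuracy figure the abstract reports.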

  11. Revision and validation of a scale to assess pregnancy stress.

    PubMed

    Chen, Chung-Hey

    2015-03-01

    Pregnancy is a potentially stressful event. Prenatal stress alters maternal endocrine and immune systems, has been implicated in the etiology of prenatal complications or postnatal psychiatric disorders, and may adversely affect fetal health. The 30-item Pregnancy Stress Rating Scale (PSRS), initially developed in 1983 by Chen and colleagues, is the only measure to date designed specifically to evaluate prenatal stress. The purpose of this study was to reconsider and revise the 30-item PSRS and validate the new PSRS. A cross-sectional design was used. Adding new items of pregnancy stress generated from clinical experience and expert recommendations resulted in a 40-item revised PSRS that was more reflective of current social conditions. Three hundred pregnant women, recruited from the antenatal clinic of a medical center in southern Taiwan, completed the revised PSRS to assess its internal consistency, test-retest reliability, construct validity, and convergent and discriminant validity. The final 36-item PSRS (PSRS36) was derived by deleting four items with relatively low item-total correlation coefficients or factor loadings. The resultant 36-item scale showed good internal consistency (α = .92) and 2-week test-retest reliability (r = .82). Factor analysis confirmed construct validity and suggested five prenatal stress dimensions, which explained 52.17% of the total variance. Convergent and discriminant validities were indicated by significant correlations among the PSRS36, Perceived Stress Scale, and Interpersonal Support Evaluation List. The PSRS36 is a psychometrically sound and practical tool for nurses and other healthcare providers to assess prenatal stress and to examine intervention protocols in Taiwanese prenatal women. More research is recommended to determine whether the PSRS36 may be used in other racial-ethnic groups.

  12. Concurrent Validity of Persian Version of Wechsler Intelligence Scale for Children - Fourth Edition and Cognitive Assessment System in Patients with Learning Disorder

    PubMed Central

    Rostami, Reza; Sadeghi, Vahid; Zarei, Jamileh; Haddadi, Parvaneh; Mohazzab-Torabi, Saman; Salamati, Payman

    2013-01-01

    Objective The aim of this study was to compare the Persian version of the Wechsler Intelligence Scale for Children - Fourth Edition (WISC-IV) and Cognitive Assessment System (CAS) tests, to determine the correlation between their scales and to evaluate the probable concurrent validity of these tests in patients with learning disorders. Methods One hundred sixty-two children with learning disorder who presented at Atieh Comprehensive Psychiatry Center were selected in a consecutive non-randomized order. All of the patients were assessed based on WISC-IV and CAS scores questionnaires. Pearson correlation coefficient was used to analyze the correlation between the data and to assess the concurrent validity of the two tests. Linear regression was used for statistical modeling. The maximum type I error rate was set at 5%. Findings There was a strong correlation between total score of WISC-IV test and total score of CAS test in the patients (r=0.75, P<0.001). The correlations among the other scales were mostly high and all of them were statistically significant (P<0.001). A linear regression model was obtained (α = 0.51, β = 0.81 and P<0.001). Conclusion There is an acceptable correlation between the WISC-IV scales and CAS test in children with learning disorders. Concurrent validity is established between the two tests and their scales. PMID:23724180

  13. Concurrent validity of Persian version of Wechsler Intelligence Scale for Children - Fourth Edition and Cognitive Assessment System in patients with learning disorder.

    PubMed

    Rostami, Reza; Sadeghi, Vahid; Zarei, Jamileh; Haddadi, Parvaneh; Mohazzab-Torabi, Saman; Salamati, Payman

    2013-04-01

    The aim of this study was to compare the Persian version of the Wechsler Intelligence Scale for Children - Fourth Edition (WISC-IV) and Cognitive Assessment System (CAS) tests, to determine the correlation between their scales and to evaluate the probable concurrent validity of these tests in patients with learning disorders. One hundred sixty-two children with learning disorder who presented at Atieh Comprehensive Psychiatry Center were selected in a consecutive non-randomized order. All of the patients were assessed based on WISC-IV and CAS scores questionnaires. Pearson correlation coefficient was used to analyze the correlation between the data and to assess the concurrent validity of the two tests. Linear regression was used for statistical modeling. The maximum type I error rate was set at 5%. There was a strong correlation between total score of WISC-IV test and total score of CAS test in the patients (r=0.75, P<0.001). The correlations among the other scales were mostly high and all of them were statistically significant (P<0.001). A linear regression model was obtained (α = 0.51, β = 0.81 and P<0.001). There is an acceptable correlation between the WISC-IV scales and CAS test in children with learning disorders. Concurrent validity is established between the two tests and their scales.

  14. Policy and Validity Prospects for Performance-Based Assessment.

    ERIC Educational Resources Information Center

    Baker, Eva L.; And Others

    1994-01-01

    This article describes performance-based assessment as expounded by its proponents, comments on these conceptions, reviews evidence regarding the technical quality of performance-based assessment, and considers its validity under various policy options. (JDD)

  15. Validity and reliability of a novel immunosuppressive adverse effects scoring system in renal transplant recipients.

    PubMed

    Meaney, Calvin J; Arabi, Ziad; Venuto, Rocco C; Consiglio, Joseph D; Wilding, Gregory E; Tornatore, Kathleen M

    2014-06-12

    After renal transplantation, many patients experience adverse effects from maintenance immunosuppressive drugs. When these adverse effects occur, patient adherence with immunosuppression may be reduced and impact allograft survival. If these adverse effects could be prospectively monitored in an objective manner and possibly prevented, adherence to immunosuppressive regimens could be optimized and allograft survival improved. Prospective, standardized clinical approaches to assess immunosuppressive adverse effects by health care providers are limited. Therefore, we developed and evaluated the application, reliability and validity of a novel adverse effects scoring system in renal transplant recipients receiving calcineurin inhibitor (cyclosporine or tacrolimus) and mycophenolic acid based immunosuppressive therapy. The scoring system included 18 non-renal adverse effects organized into gastrointestinal, central nervous system and aesthetic domains developed by a multidisciplinary physician group. Nephrologists employed this standardized adverse effect evaluation in stable renal transplant patients using physical exam, review of systems, recent laboratory results, and medication adherence assessment during a clinic visit. Stable renal transplant recipients in two clinical studies were evaluated and received immunosuppressive regimens comprised of either cyclosporine or tacrolimus with mycophenolic acid. Face, content, and construct validity were assessed to document these adverse effect evaluations. Inter-rater reliability was determined using the Kappa statistic and intra-class correlation. A total of 58 renal transplant recipients were assessed using the adverse effects scoring system confirming face validity. Nephrologists (subject matter experts) rated the 18 adverse effects as: 3.1 ± 0.75 out of 4 (maximum) regarding clinical importance to verify content validity. The adverse effects scoring system distinguished 1.75-fold increased gastrointestinal adverse

  16. Corporate Entrepreneurship Assessment Instrument (CEAI): Refinement and Validation of a Survey Measure

    DTIC Science & Technology

    2007-03-01

    CORPORATE ENTREPRENEURSHIP ASSESSMENT INSTRUMENT (CEAI): REFINEMENT AND VALIDATION OF A SURVEY MEASURE...AFIT/GIR/ENV/07-M7...Michael

  17. Cross-validation pitfalls when selecting and assessing regression and classification models.

    PubMed

    Krstajic, Damjan; Buturovic, Ljubomir J; Leahy, David E; Thomas, Simon

    2014-03-29

    We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches. We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
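
    The idea of repeating V-fold cross-validation over different random splits can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' algorithm: the model (a one-dimensional k-nearest-neighbour regressor), the function names, and the data are all hypothetical, and only the repeated grid-search step is shown. A nested version would wrap this whole procedure inside an outer set of folds to estimate prediction error.

```python
import random
from statistics import mean


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D x)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def v_fold_splits(n, v, rng):
    """Shuffle indices 0..n-1 and deal them into v folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::v] for i in range(v)]


def cv_error(data, k, folds):
    """Mean squared error of k-NN over all held-out points for one split."""
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [p for i, p in enumerate(data) if i not in held_out]
        for i in fold:
            x, y = data[i]
            errors.append((knn_predict(train, x, k) - y) ** 2)
    return mean(errors)


def repeated_grid_search_cv(data, grid, v=5, repeats=10, seed=0):
    """Average the V-fold CV error of each grid value over several random
    splits, then return the value with the lowest average error."""
    rng = random.Random(seed)
    totals = {k: 0.0 for k in grid}
    for _ in range(repeats):
        folds = v_fold_splits(len(data), v, rng)
        for k in grid:  # every grid value is scored on the same split
            totals[k] += cv_error(data, k, folds)
    mean_err = {k: total / repeats for k, total in totals.items()}
    best = min(grid, key=lambda k: mean_err[k])
    return best, mean_err


# Hypothetical noise-free data: y = 2x for x = 0..19.
data = [(x, 2 * x) for x in range(20)]
best_k, mean_err = repeated_grid_search_cv(data, grid=[1, 3, 15])
```

    Averaging over repeats reduces the dependence of the selected parameter on any single random split, which is the source of variation the abstract highlights.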

  18. Validity of portfolio assessment: which qualities determine ratings?

    PubMed

    Driessen, Erik W; Overeem, Karlijn; van Tartwijk, Jan; van der Vleuten, Cees P M; Muijtjens, Arno M M

    2006-09-01

    The portfolio is becoming increasingly accepted as a valuable tool for learning and assessment. The validity of portfolio assessment, however, may suffer from bias due to irrelevant qualities, such as layout and writing style. We examined the possible effects of such qualities in a portfolio programme aimed at stimulating Year 1 medical students to reflect on their professional and personal development. In later curricular years, this portfolio is also used to judge clinical competence. We developed an instrument, the Portfolio Analysis Scoring Inventory, to examine the impact of form and content aspects on portfolio assessment. The Inventory consists of 15 items derived from interviews with experienced mentors, the literature, and the criteria for reflective competence used in the regular portfolio assessment procedure. Forty portfolios, selected from 231 portfolios for which ratings from the regular assessment procedure were available, were rated by 2 researchers, independently, using the Inventory. Regression analysis was used to estimate the correlation between the ratings from the regular assessment and those resulting from the Inventory items. Inter-rater agreement ranged from 0.46 to 0.87. The strongest predictor of the variance in the regular ratings was 'quality of reflection' (R = 0.80; R² = 66%). No further items accounted for a significant proportion of variance. Irrelevant items, such as writing style and layout, had negligible effects. The absence of an impact of irrelevant criteria appears to support the validity of the portfolio assessment procedure. Further studies should examine the portfolio's validity for the assessment of clinical competence.

  19. A Model-Based Approach to Support Validation of Medical Cyber-Physical Systems

    PubMed Central

    Silva, Lenardo C.; Almeida, Hyggo O.; Perkusich, Angelo; Perkusich, Mirko

    2015-01-01

    Medical Cyber-Physical Systems (MCPS) are context-aware, life-critical systems with patient safety as the main concern, demanding rigorous processes for validation to guarantee user requirement compliance and specification-oriented correctness. In this article, we propose a model-based approach for early validation of MCPS, focusing on promoting reusability and productivity. It enables system developers to build MCPS formal models based on a library of patient and medical device models, and simulate the MCPS to identify undesirable behaviors at design time. Our approach has been applied to three different clinical scenarios to evaluate its reusability potential for different contexts. We have also validated our approach through an empirical evaluation with developers to assess productivity and reusability. Finally, our models have been formally verified considering functional and safety requirements and model coverage. PMID:26528982

  20. A Model-Based Approach to Support Validation of Medical Cyber-Physical Systems.

    PubMed

    Silva, Lenardo C; Almeida, Hyggo O; Perkusich, Angelo; Perkusich, Mirko

    2015-10-30

    Medical Cyber-Physical Systems (MCPS) are context-aware, life-critical systems with patient safety as the main concern, demanding rigorous processes for validation to guarantee user requirement compliance and specification-oriented correctness. In this article, we propose a model-based approach for early validation of MCPS, focusing on promoting reusability and productivity. It enables system developers to build MCPS formal models based on a library of patient and medical device models, and simulate the MCPS to identify undesirable behaviors at design time. Our approach has been applied to three different clinical scenarios to evaluate its reusability potential for different contexts. We have also validated our approach through an empirical evaluation with developers to assess productivity and reusability. Finally, our models have been formally verified considering functional and safety requirements and model coverage.

  1. Validation of a global scale to assess the quality of interprofessional teamwork in mental health settings.

    PubMed

    Tomizawa, Ryoko; Yamano, Mayumi; Osako, Mitue; Hirabayashi, Naotugu; Oshima, Nobuo; Sigeta, Masahiro; Reeves, Scott

    2017-12-01

    Few scales currently exist to assess the quality of interprofessional teamwork through team members' perceptions of working together in mental health settings. The purpose of this study was to revise and validate an interprofessional scale to assess the quality of teamwork in inpatient psychiatric units and to use it multi-nationally. A literature review was undertaken to identify evaluative teamwork tools and develop an additional 12 items to ensure a broad global focus. Focus group discussions considered adaptation to different care systems using subjective judgements from 11 participants in a pre-test of items. Data quality, construct validity, reproducibility, and internal consistency were investigated in the survey using an international comparative design. Exploratory factor analysis yielded five factors with 21 items: 'patient/community centred care', 'collaborative communication', 'interprofessional conflict', 'role clarification', and 'environment'. High overall internal consistency, reproducibility, adequate face validity, and reasonable construct validity were shown in the USA and Japan. The revised Collaborative Practice Assessment Tool (CPAT) is a valid measure to assess the quality of interprofessional teamwork in psychiatry and identifies the best strategies to improve team performance. Furthermore, the revised scale will generate more rigorous evidence for collaborative practice in psychiatry internationally.

  2. Validation of the French Version of the Edmonton Symptom Assessment System.

    PubMed

    Pautex, Sophie; Vayne-Bossert, Petra; Bernard, Mathieu; Beauverd, Michel; Cantin, Boris; Mazzocato, Claudia; Thollet, Catherine; Bollondi-Pauly, Catherine; Ducloux, Dominique; Herrmann, François; Escher, Monica

    2017-11-01

    The Edmonton Symptom Assessment System (ESAS) is a brief, widely adopted, multidimensional questionnaire to evaluate patient-reported symptoms. The objective of this study was to define a standard French version of the ESAS (F-ESAS) and to determine its psychometric properties in French-speaking patients. In a first pilot study, health professionals (n = 20) and patients (n = 33) defined the most adapted terms in French (F-ESAS). In a prospective multicentric study, palliative care patients completed the three forms of F-ESAS (F-ESAS-VI, F-ESAS-VE, and F-ESAS-NU, where VI is visual, VE, verbal, and NU, numerical) and the Hospital Anxiety and Depression Scale (HADS). All patients had a test-retest evaluation during the same half-day. Standardized distraction material was used between each scale. One hundred twenty-four patients were included (mean age [±SD]: 68.3 ± 12; 70 women; 54 men). Test-retest reliability was high for all three F-ESAS, and the correlation between these scales was nearly perfect (Spearman rs = 0.66-0.91; P < 0.05). F-ESAS-VI, F-ESAS-VE, and F-ESAS-NU performed similarly and were equally reliable, although there was a trend toward lower reliability for F-ESAS-VI. Correlations between F-ESAS depression and anxiety and HADS depression and anxiety, respectively, were positive (Spearman rs = 0.38-0.41 for depression; Spearman rs = 0.48-0.57 for anxiety, P < 0.05). Among patients, 59 (48%), 45 (36%), and 20 (16%) preferred to assess their symptoms with F-ESAS-VE, F-ESAS-NU, and F-ESAS-VI, respectively. The F-ESAS is a valid and reliable tool for measuring multidimensional symptoms in French-speaking patients with an advanced cancer. All forms of F-ESAS performed well with a trend for better psychometric performance for F-ESAS-NU, but patients preferred the F-ESAS-VE. Copyright © 2017. Published by Elsevier Inc.

  3. Developing the Polish Educational Needs Assessment Tool (Pol-ENAT) in rheumatoid arthritis and systemic sclerosis: a cross-cultural validation study using Rasch analysis.

    PubMed

    Sierakowska, Matylda; Sierakowski, Stanisław; Sierakowska, Justyna; Horton, Mike; Ndosi, Mwidimi

    2015-03-01

    To undertake cross-cultural adaptation and validation of the educational needs assessment tool (ENAT) for use with people with rheumatoid arthritis (RA) and systemic sclerosis (SSc) in Poland. The study involved two main phases: (1) cross-cultural adaptation of the ENAT from English into Polish and (2) cross-cultural validation of the Polish Educational Needs Assessment Tool (Pol-ENAT). The first phase followed an established process of cross-cultural adaptation of self-report measures. The second phase involved completion of the Pol-ENAT by patients and subjecting the data to Rasch analysis to assess the construct validity, unidimensionality, internal consistency and cross-cultural invariance. An adequate conceptual equivalence was achieved following the adaptation process. The dataset for validation comprised a total of 278 patients, 237 (85.3%) of whom were female. In each disease group (145, RA and 133, SSc), the 7 domains of the Pol-ENAT were found to fit the Rasch model, χ²(df) = 16.953 (14), p = 0.259 and 8.132 (14), p = 0.882 for RA and SSc, respectively. Internal consistency of the Pol-ENAT was high (patient separation index = 0.85 and 0.89 for SSc and RA, respectively), and unidimensionality was confirmed. Cross-cultural differential item functioning (DIF) was detected in some subscales, and DIF-adjusted conversion tables were calibrated to enable cross-cultural comparison of data between Poland and the UK. Using a standard process in cross-cultural adaptation, conceptual equivalence was achieved between the original (UK) ENAT and the adapted Pol-ENAT. Fit to the Rasch model confirmed that the construct validity, unidimensionality and internal consistency of the ENAT have been preserved.

  4. Validation of an organizational communication climate assessment toolkit.

    PubMed

    Wynia, Matthew K; Johnson, Megan; McCoy, Thomas P; Griffin, Leah Passmore; Osborn, Chandra Y

    2010-01-01

    Effective communication is critical to providing quality health care and can be affected by a number of modifiable organizational factors. The authors performed a prospective multisite validation study of an organizational communication climate assessment tool in 13 geographically and ethnically diverse health care organizations. Communication climate was measured across 9 discrete domains. Patient and staff surveys with matched items in each domain were developed using a national consensus process, which then underwent psychometric field testing and assessment of domain coherence. The authors found meaningful within-site and between-site performance score variability in all domains. In multivariable models, most communication domains were significant predictors of patient-reported quality of care and trust. The authors conclude that these assessment tools provide a valid empirical assessment of organizational communication climate in 9 domains. Assessment results may be useful to track organizational performance, to benchmark, and to inform tailored quality improvement interventions.

  5. Design and validation of an automated hydrostatic weighing system.

    PubMed

    McClenaghan, B A; Rocchio, L

    1986-08-01

    The purpose of this study was to design and evaluate the validity of an automated technique to assess body density using a computerized hydrostatic weighing system. An existing hydrostatic tank was modified and interfaced with a microcomputer equipped with an analog-to-digital converter. Software was designed to input variables, control the collection of data, calculate selected measurements, and provide a summary of the results of each session. Validity of the data obtained utilizing the automated hydrostatic weighing system was estimated by: evaluating the reliability of the transducer/computer interface to measure objects of known underwater weight; comparing the data against a criterion measure; and determining inter-session subject reliability. Values obtained from the automated system were found to be highly correlated with known underwater weights (r = 0.99, SEE = 0.0060 kg). Data concurrently obtained utilizing the automated system and a manual chart recorder were also found to be highly correlated (r = 0.99, SEE = 0.0606 kg). Inter-session subject reliability was determined utilizing data collected on subjects (N = 16) tested on two occasions approximately 24 h apart. Correlations revealed high relationships between measures of underwater weight (r = 0.99, SEE = 0.1399 kg) and body density (r = 0.98, SEE = 0.00244 g X cm-1). Results indicate that a computerized hydrostatic weighing system is a valid and reliable method for determining underwater weight.

  6. Evaluating the Diagnostic Validity of a Facet-Based Formative Assessment System

    ERIC Educational Resources Information Center

    DeBarger, Angela Haydel; DiBello, Louis; Minstrell, Jim; Feng, Mingyu; Stout, William; Pellegrino, James; Haertel, Geneva; Harris, Christopher; Ructinger, Liliana

    2011-01-01

    This paper describes methods for an alignment study and psychometric analyses of a formative assessment system, Diagnoser Tools for physics. Diagnoser Tools begin with facet clusters as the interpretive framework for designing questions and instructional activities. Thus each question in the diagnostic assessments includes distractors that…

  7. Reliability and Validity Evidence of Multiple Balance Assessments in Athletes With a Concussion

    PubMed Central

    Murray, Nicholas; Salvatore, Anthony; Powell, Douglas; Reed-Jones, Rebecca

    2014-01-01

    Context: An estimated 300 000 sport-related concussion injuries occur in the United States annually. Approximately 30% of individuals with concussions experience balance disturbances. Common methods of balance assessment include the Clinical Test of Sensory Interaction and Balance (CTSIB), the Sensory Organization Test (SOT), the Balance Error Scoring System (BESS), and the Romberg test; however, the National Collegiate Athletic Association recommended the Wii Fit as an alternative measure of balance in athletes with a concussion. A central concern regarding the implementation of the Wii Fit is whether it is reliable and valid for measuring balance disturbance in athletes with concussion. Objective: To examine the reliability and validity evidence for the CTSIB, SOT, BESS, Romberg test, and Wii Fit for detecting balance disturbance in athletes with a concussion. Data Sources: Literature considered for review included publications with reliability and validity data for the assessments of balance (CTSIB, SOT, BESS, Romberg test, and Wii Fit) from PubMed, PsycINFO, and CINAHL. Data Extraction: We identified 63 relevant articles for consideration in the review. Of the 63 articles, 28 were considered appropriate for inclusion and 35 were excluded. Data Synthesis: No current reliability or validity information supports the use of the CTSIB, SOT, Romberg test, or Wii Fit for balance assessment in athletes with a concussion. The BESS demonstrated moderate to high reliability (intraclass correlation coefficient = 0.87) and low to moderate validity (sensitivity = 34%, specificity = 87%). However, the Romberg test and Wii Fit have been shown to be reliable tools in the assessment of balance in Parkinson patients. Conclusions: The BESS can evaluate balance problems after a concussion. However, it lacks the ability to detect balance problems after the third day of recovery. Further investigation is needed to establish the use of the CTSIB, SOT, Romberg test, and Wii Fit for

  8. Translation, cultural adaptation and validation into Portuguese (Brazil) of the Systemic Sclerosis Questionnaire (SySQ).

    PubMed

    Machado, Roberta Ismael Lacerda; Souto, Lais Medeiros; Freire, Eutilia Andrade Medeiros

    2014-01-01

    Systemic sclerosis (SSc) is a multisystem autoimmune disease characterized by fibroblast dysfunction, with a significant impact on quality of life (QoL). QoL is measured by instruments or questionnaires that were usually formulated in other languages and in different cultural contexts. The aim was to translate the Systemic Sclerosis Questionnaire (SySQ) into Brazilian Portuguese, adapt it cross-culturally, and assess its reliability and validity. Translation and adaptation: translation into Portuguese and cross-cultural adaptation were performed in accordance with studies on questionnaire translation methodology into other languages. Reliability: it was analyzed using three interviews with different interviewers, two on the same day (interobserver) and the third within 14 days of the first assessment (intraobserver). Validity was assessed by correlating clinical and quality of life parameters with the SySQ domain scores. A descriptive analysis of the study sample was performed. Reproducibility was assessed using an intraclass correlation coefficient (ICC). Internal consistency was assessed using Cronbach's alpha coefficient. To assess validity we used the Spearman correlation coefficient. Five percent was the level of significance adopted for all statistical tests. In the evaluation of the questionnaires, the results were similar to the original questionnaire, with internal consistency ranging between 0.73 and 0.93 for each item. The interobserver reproducibility was very good for all domains (α = 0.786 to 0.983) and intraobserver agreement was considered very good for the general symptoms domain (ICC = 0.916), good for the musculoskeletal symptoms domain (ICC = 0.897) and the cardiopulmonary domain (ICC = 0.842), and reasonable for the gastrointestinal symptoms domain (ICC = 0.686). The Brazilian Portuguese version of SySQ proved to be reproducible and valid for our population, using a recognized methodology for translation and cultural adaptation of questionnaires, as well as to assess the reproducibility and
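
    The record above reports internal consistency as Cronbach's alpha. As a rough illustration of how that coefficient is usually computed, here is a minimal Python sketch using only the standard library. The function name `cronbach_alpha` and the item scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores[i][p] is respondent p's score on item i."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("every item needs the same number (>= 2) of respondents")
    # Total score per respondent across all items of the domain.
    totals = [sum(item[p] for item in item_scores) for p in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical domain with 4 items answered by 5 respondents (illustrative only).
domain = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 1, 5, 3],
    [3, 4, 2, 5, 5],
]
print(round(cronbach_alpha(domain), 3))
```

    Sample variances (n - 1 in the denominator) are used for both the items and the totals; using population variances throughout gives the same alpha because the factor cancels in the ratio.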

  9. Exploring a Framework for Consequential Validity for Performance-Based Assessments

    ERIC Educational Resources Information Center

    Kim, Su Jung

    2017-01-01

    This study explores a new comprehensive framework for understanding elements of validity, specifically for performance assessments that are administered within specific and dynamic contexts. The adoption of edTPA is a good empirical case for examining the concept of consequential validity because this assessment has been implemented at the state…

  10. Reliable and valid assessment of point-of-care ultrasonography.

    PubMed

    Todsen, Tobias; Tolsgaard, Martin Grønnebæk; Olsen, Beth Härstedt; Henriksen, Birthe Merete; Hillingsø, Jens Georg; Konge, Lars; Jensen, Morten Lind; Ringsted, Charlotte

    2015-02-01

    To explore the reliability and validity of the Objective Structured Assessment of Ultrasound Skills (OSAUS) scale for point-of-care ultrasonography (POC US) performance. POC US is increasingly used by clinicians and is an essential part of the management of acute surgical conditions. However, the quality of performance is highly operator-dependent. Therefore, reliable and valid assessment of trainees' ultrasonography competence is needed to ensure patient safety. Twenty-four physicians, representing novices, intermediates, and experts in POC US, scanned 4 different surgical patient cases in a controlled set-up. All ultrasound examinations were video-recorded and assessed by 2 blinded radiologists using OSAUS. Reliability was examined using generalizability theory. Construct validity was examined by comparing performance scores between the groups and by correlating physicians' OSAUS scores with diagnostic accuracy. The generalizability coefficient was high (0.81) and a D-study demonstrated that 1 assessor and 5 cases would result in similar reliability. The construct validity of the OSAUS scale was supported by a significant difference in the mean scores between the novice group (17.0; SD 8.4) and the intermediate group (30.0; SD 10.1), P = 0.007, as well as between the intermediate group and the expert group (72.9; SD 4.4), P = 0.04, and by a high correlation between OSAUS scores and diagnostic accuracy (Spearman ρ correlation coefficient = 0.76; P < 0.001). This study demonstrates high reliability as well as evidence of construct validity of the OSAUS scale for assessment of POC US competence. Hence, the OSAUS scale may be suitable for both in-training as well as end-of-training assessment.

  11. Embedded performance validity testing in neuropsychological assessment: Potential clinical tools.

    PubMed

    Rickards, Tyler A; Cranston, Christopher C; Touradji, Pegah; Bechtold, Kathleen T

    2018-01-01

    The article aims to suggest clinically-useful tools in neuropsychological assessment for efficient use of embedded measures of performance validity. To accomplish this, we integrated available validity-related and statistical research from the literature, consensus statements, and survey-based data from practicing neuropsychologists. We provide recommendations for use of 1) Cutoffs for embedded performance validity tests including Reliable Digit Span, California Verbal Learning Test (Second Edition) Forced Choice Recognition, Rey-Osterrieth Complex Figure Test Combination Score, Wisconsin Card Sorting Test Failure to Maintain Set, and the Finger Tapping Test; 2) Selecting number of performance validity measures to administer in an assessment; and 3) Hypothetical clinical decision-making models for use of performance validity testing in a neuropsychological assessment collectively considering behavior, patient reporting, and data indicating invalid or noncredible performance. Performance validity testing helps inform the clinician about an individual's general approach to tasks: response to failure, task engagement and persistence, compliance with task demands. Data-driven clinical suggestions provide a resource to clinicians and are intended to instigate conversation within the field, support more uniform, testable decisions, further the discussion, and guide future research in this area.

  12. Evaluating the Diagnostic Validity of the Facet-Based Formative Assessment System

    ERIC Educational Resources Information Center

    DeBarger, Angela H.; DiBello, Louis; Minstrell, Jim; Stout, William; Pellegrino, James; Haertel, Geneva; Feng, Mingyu

    2011-01-01

    The research design and team constitute a multidisciplinary attack on problems of educational and assessment design in physics instruction. Components of the research include: (a) an Evidence-Centered Design analysis of Diagnoser instructional materials and assessments that provides a view of the evidentiary coherence of the existing system; (b)…

  13. Towards a Framework for the Validation of Early Childhood Assessment Systems

    ERIC Educational Resources Information Center

    Goldstein, Jessica; Flake, Jessica Kay

    2016-01-01

    American early childhood education is in the midst of drastic change. In recent years, states have begun the process of overhauling early childhood education systems in response to federal grant competitions, bringing an increased focus on assessment and accountability for early learning programs. The assessment of young children is fraught with…

  14. A Design to Improve Internal Validity of Assessments of Teaching Demonstrations

    ERIC Educational Resources Information Center

    Bartsch, Robert A.; Engelhardt Bittner, Wendy M.; Moreno, Jesse E., Jr.

    2008-01-01

    Internal validity is important in assessing teaching demonstrations both for one's knowledge and for quality assessment demanded by outside sources. We describe a method to improve the internal validity of assessments of teaching demonstrations: a 1-group pretest-posttest design with alternative forms. This design is often more practical and…

  15. Validity of the OSU Post-Traumatic Stress Disorder Scale and the Behavior Assessment System for Children Self-Report of Personality with Child Tornado Survivors

    ERIC Educational Resources Information Center

    Evans, Linda Garner; Oehler-Stinnett, Judy

    2008-01-01

    Tornadoes and other natural disasters can lead to anxiety and posttraumatic stress disorder (PTSD) in children. This study provides further validity for the Oklahoma State University Post-Traumatic Stress Disorder Scale-Child Form (OSU PTSDS-CF) by comparing it to the Behavior Assessment System for Children Self-Report of Personality (BASC-SRP).…

  16. Development and Validation of the Musical Ear Training Assessment (META)

    ERIC Educational Resources Information Center

    Wolf, Anna; Kopiez, Reinhard

    2018-01-01

    In the following study, we have developed an assessment instrument for the practice-dependent skill of analytical hearing following a strict test theoretical validation, resulting in the Musical Ear Training Assessment (META). By means of three pilot studies, a developmental study, and a validation study, we verified a one-dimensional test model…

  17. Building validation tools for knowledge-based systems

    NASA Technical Reports Server (NTRS)

    Stachowitz, R. A.; Chang, C. L.; Stock, T. S.; Combs, J. B.

    1987-01-01

    The Expert Systems Validation Associate (EVA), a validation system under development at the Lockheed Artificial Intelligence Center for more than a year, provides a wide range of validation tools to check the correctness, consistency and completeness of a knowledge-based system. A declarative meta-language (higher-order language), is used to create a generic version of EVA to validate applications written in arbitrary expert system shells. The architecture and functionality of EVA are presented. The functionality includes Structure Check, Logic Check, Extended Structure Check (using semantic information), Extended Logic Check, Semantic Check, Omission Check, Rule Refinement, Control Check, Test Case Generation, Error Localization, and Behavior Verification.

  18. Validating workplace performance assessments in health sciences students: a case study from speech pathology.

    PubMed

    McAllister, Sue; Lincoln, Michelle; Ferguson, Allison; McAllister, Lindy

    2013-01-01

    Valid assessment of health science students' ability to perform in the real world of workplace practice is critical for promoting quality learning and ultimately certifying students as fit to enter the world of professional practice. Current practice in performance assessment in the health sciences field has been hampered by multiple issues regarding assessment content and process. Evidence for the validity of scores derived from assessment tools are usually evaluated against traditional validity categories with reliability evidence privileged over validity, resulting in the paradoxical effect of compromising the assessment validity and learning processes the assessments seek to promote. Furthermore, the dominant statistical approaches used to validate scores from these assessments fall under the umbrella of classical test theory approaches. This paper reports on the successful national development and validation of measures derived from an assessment of Australian speech pathology students' performance in the workplace. Validation of these measures considered each of Messick's interrelated validity evidence categories and included using evidence generated through Rasch analyses to support score interpretation and related action. This research demonstrated that it is possible to develop an assessment of real, complex, work based performance of speech pathology students, that generates valid measures without compromising the learning processes the assessment seeks to promote. The process described provides a model for other health professional education programs to trial.

  19. Validation in Support of Internationally Harmonised OECD Test Guidelines for Assessing the Safety of Chemicals.

    PubMed

    Gourmelon, Anne; Delrue, Nathalie

    Ten years have elapsed since the OECD published the Guidance document on the validation and international regulatory acceptance of test methods for hazard assessment. Much experience has been gained since then in validation centres, in countries and at the OECD on a variety of test methods that were subjected to validation studies. This chapter reviews validation principles and highlights common features that appear to be important for further regulatory acceptance across studies. Existing OECD-agreed validation principles will most likely generally remain relevant and applicable to address challenges associated with the validation of future test methods. Some adaptations may be needed to take into account the level of technique introduced in test systems, but demonstration of relevance and reliability will continue to play a central role as a prerequisite for regulatory acceptance. Demonstration of relevance will become more challenging for test methods that form part of a set of predictive tools and methods, and that do not stand alone. OECD is keen on ensuring that while these concepts evolve, countries can continue to rely on valid methods and harmonised approaches for efficient testing and assessment of chemicals.

  20. Validity and inter-observer reliability of subjective hand-arm vibration assessments.

    PubMed

    Coenen, Pieter; Formanoy, Margriet; Douwes, Marjolein; Bosch, Tim; de Kraker, Heleen

    2014-07-01

    Exposure to mechanical vibrations at work (e.g., due to handling powered tools) is a potential occupational risk as it may cause upper extremity complaints. However, reliable and valid assessment methods for vibration exposure at work are lacking. Measuring hand-arm vibration objectively is often difficult and expensive, while the often-used information provided by manufacturers lacks detail. Therefore, a subjective hand-arm vibration assessment method was tested for validity and inter-observer reliability. In an experimental protocol, sixteen tasks handling powered tools were executed by two workers. Hand-arm vibration was assessed subjectively by 16 observers according to the proposed subjective assessment method. As a gold standard reference, hand-arm vibration was measured objectively using a vibration measurement device. Weighted κ values were calculated to assess validity, and intraclass correlation coefficients (ICCs) were calculated to assess inter-observer reliability. Inter-observer reliability of the subjective assessments, depicting the agreement among observers, can be expressed by an ICC of 0.708 (0.511-0.873). The validity of the subjective assessments as compared to the gold-standard reference can be expressed by a weighted κ of 0.535 (0.285-0.785). In addition, the percentage of exact agreement of the subjective assessment compared to the objective measurement was relatively low (i.e., 52% of all tasks). This study shows that subjectively assessed hand-arm vibrations are fairly reliable among observers and moderately valid. This assessment method is a first attempt to use subjective risk assessments of hand-arm vibration. Although this assessment method can benefit from some future improvement, it can be of use in future studies and in field-based ergonomic assessments. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.
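
    The record above uses a weighted κ to compare ordinal subjective ratings with an objective reference. The following Python sketch shows one common form of Cohen's weighted kappa. The abstract does not say whether linear or quadratic weights were used, so the `power` parameter, the function name, and the example ratings are all assumptions made for illustration.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories, power=1):
    """Cohen's weighted kappa for ordinal ratings coded 0..n_categories-1.

    power=1 applies linear disagreement weights, power=2 quadratic weights.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts per (a, b) cell
    marg_a = Counter(rater_a)                  # marginal counts, rater A
    marg_b = Counter(rater_b)                  # marginal counts, rater B
    max_dist = (n_categories - 1) ** power
    obs_dis = 0.0  # weighted observed disagreement
    exp_dis = 0.0  # weighted disagreement expected by chance
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) ** power / max_dist
            obs_dis += w * observed[(i, j)] / n
            exp_dis += w * marg_a[i] * marg_b[j] / (n * n)
    if exp_dis == 0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1.0 - obs_dis / exp_dis


# Hypothetical exposure classes (0 = low, 1 = medium, 2 = high) for 8 tasks.
subjective = [0, 1, 1, 2, 2, 0, 1, 2]
objective = [0, 1, 2, 2, 1, 0, 0, 2]
print(round(weighted_kappa(subjective, objective, n_categories=3), 3))
```

    With linear weights a one-class miss is penalised half as much as a two-class miss on a three-class scale; quadratic weights penalise distant misses more heavily.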

  1. Educational Assessment Using Intelligent Systems. Research Report. ETS RR-08-68

    ERIC Educational Resources Information Center

    Shute, Valerie J.; Zapata-Rivera, Diego

    2008-01-01

    Recent advances in educational assessment, cognitive science, and artificial intelligence have made it possible to integrate valid assessment and instruction in the form of modern computer-based intelligent systems. These intelligent systems leverage assessment information that is gathered from various sources (e.g., summative and formative). This…

  2. Validating the Octave Allegro Information Systems Risk Assessment Methodology: A Case Study

    ERIC Educational Resources Information Center

    Keating, Corland G.

    2014-01-01

    An information system (IS) risk assessment is an important part of any successful security management strategy. Risk assessments help organizations to identify mission-critical IS assets and prioritize risk mitigation efforts. Many risk assessment methodologies, however, are complex and can only be completed successfully by highly qualified and…

  3. Documentation of pharmaceutical care: Validation of an intervention oriented classification system.

    PubMed

    Maes, Karen A; Studer, Helene; Berger, Jérôme; Hersberger, Kurt E; Lampert, Markus L

    2017-12-01

    During the dispensing process, pharmacists may come across technical and clinical issues requiring a pharmaceutical intervention (PI). An intervention-oriented classification system is a helpful tool to document these PIs in a structured manner. Therefore, we developed the PharmDISC classification system (Pharmacists' Documentation of Interventions in Seamless Care). The aim of this study was to evaluate the PharmDISC system in the daily practice environment (in terms of interrater reliability, appropriateness, interpretability, acceptability, feasibility, and validity); to assess its user satisfaction, the descriptive manual, and the online training; and to explore first implementation aspects. Twenty-one pharmacists from different community pharmacies each classified 30 prescriptions requiring a PI with the PharmDISC system on 5 selected days within 5 weeks. Interrater reliability was determined using model PIs and Fleiss's kappa coefficients (κ) were calculated. User satisfaction was assessed by questionnaire with a 4-point Likert scale. The main outcome measures were interrater reliability (κ); appropriateness, interpretability, validity (ratio of completely classified PIs/all PIs); feasibility, and acceptability (user satisfaction and suggestions). The PharmDISC system reached an average substantial agreement (κ = 0.66). Of 519 documented PIs, 430 (82.9%) were completely classified. Most users found the system comprehensive (median user agreement 3 [2/3.25 quartiles]) and practical (3[2.75/3]). The PharmDISC system raised the awareness regarding drug-related problems for most users (n = 16). To facilitate its implementation, an electronic version that automatically connects to the prescription together with a task manager for PIs needing follow-up was suggested. Barriers could be time expenditure and lack of understanding of the benefits. Substantial interrater reliability and acceptable user satisfaction indicate that the PharmDISC system is a valid
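
    Interrater reliability in the record above is summarised with Fleiss's kappa, which extends chance-corrected agreement to more than two raters. A minimal Python sketch of the standard formula follows. The count table and the function name are hypothetical and do not reproduce the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa from a table where counts[i][j] is the number of raters
    who assigned case i to category j. Every case needs the same rater count."""
    n_cases = len(counts)
    if n_cases == 0:
        raise ValueError("need at least one case")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total = n_cases * n_raters
    # Overall proportion of assignments falling into each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Agreement within each case: share of rater pairs that agree.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_case) / n_cases          # mean observed agreement
    p_exp = sum(p * p for p in p_cat)      # agreement expected by chance
    if p_exp == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical: 5 model cases, 3 categories, 6 raters per case.
table = [
    [6, 0, 0],
    [4, 2, 0],
    [0, 5, 1],
    [1, 1, 4],
    [0, 0, 6],
]
print(round(fleiss_kappa(table), 3))
```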

  4. Development and Validity Testing of an Arthritis Self-Management Assessment Tool.

    PubMed

    Oh, HyunSoo; Han, SunYoung; Kim, SooHyun; Seo, WhaSook

    Because of the chronic, progressive nature of arthritis and the substantial effects it has on quality of life, patients may benefit from self-management. However, no valid, reliable self-management assessment tool has been devised for patients with arthritis. This study was conducted to develop a comprehensive self-management assessment tool for patients with arthritis, that is, the Arthritis Self-Management Assessment Tool (ASMAT). To develop a list of qualified items corresponding to the conceptual definitions and attributes of arthritis self-management, a measurement model was established on the basis of theoretical and empirical foundations. Content validity testing was conducted to evaluate whether listed items were suitable for assessing arthritis self-management. Construct validity and reliability of the ASMAT were tested. Construct validity was examined using confirmatory factor analysis and nomological validity. The 32-item ASMAT was developed with a sample composed of patients in a clinic in South Korea. Content validity testing validated the 32 items, which comprised medical (10 items), behavioral (13 items), and psychoemotional (9 items) management subscales. Construct validity testing of the ASMAT showed that the 32 items properly corresponded with conceptual constructs of arthritis self-management, and were suitable for assessing self-management ability in patients with arthritis. Reliability was also well supported. The ASMAT devised in the present study may aid the evaluation of patient self-management ability and the effectiveness of self-management interventions. The authors believe the developed tool may also aid the identification of problems associated with the adoption of self-management practice, and thus improve symptom management, independence, and quality of life of patients with arthritis.

  5. Utility of pedometers for assessing physical activity: construct validity.

    PubMed

    Tudor-Locke, Catrine; Williams, Joel E; Reis, Jared P; Pluto, Delores

    2004-01-01

    Valid assessment of physical activity is necessary to fully understand this important health-related behaviour for research, surveillance, intervention and evaluation purposes. This article is the second in a companion set exploring the validity of pedometer-assessed physical activity. The previous article published in Sports Medicine dealt with convergent validity (i.e. the extent to which an instrument's output is associated with that of other instruments intended to measure the same exposure of interest). The present focus is on construct validity. Construct validity is the extent to which the measurement corresponds with other measures of theoretically-related parameters. Construct validity is typically evaluated by correlational analysis, that is, the magnitude of concordance between two measures (e.g. pedometer-determined steps/day and a theoretically-related parameter such as age, anthropometric measures and fitness). A systematic literature review produced 29 articles published since 1980 directly relevant to construct validity of pedometers in relation to age, anthropometric measures and fitness. Reported correlations were combined and a median r-value was computed. Overall, there was a weak inverse relationship (median r = -0.21) between age and pedometer-determined physical activity. A weak inverse relationship was also apparent with both body mass index and percentage overweight (median r = -0.27 and r = -0.22, respectively). Positive relationships regarding indicators of fitness ranged from weak to moderate depending on the fitness measure utilised: 6-minute walk test (median r = 0.69), timed treadmill test (median r = 0.41) and estimated maximum oxygen uptake (median r = 0.22). Studies are warranted to assess the relationship of pedometer-determined physical activity with other important health-related outcomes including blood pressure and physiological parameters such as blood glucose and lipid profiles. The aggregated evidence of convergent

  6. Content validity across methods of malnutrition assessment in patients with cancer is limited.

    PubMed

    Sealy, Martine J; Nijholt, Willemke; Stuiver, Martijn M; van der Berg, Marit M; Roodenburg, Jan L N; van der Schans, Cees P; Ottery, Faith D; Jager-Wittenaar, Harriët

    2016-08-01

    To identify malnutrition assessment methods in cancer patients and assess their content validity based on internationally accepted definitions for malnutrition. Systematic review of studies in cancer patients that operationalized malnutrition as a variable, published since 1998. Eleven key concepts, within the three domains reflected by the malnutrition definitions acknowledged by European Society for Clinical Nutrition and Metabolism (ESPEN) and the American Society for Parenteral and Enteral Nutrition (ASPEN): A: nutrient balance; B: changes in body shape, body area and body composition; and C: function, were used to classify content validity of methods to assess malnutrition. Content validity indices (M-CVI_A-C) were calculated per assessment method. Acceptable content validity was defined as M-CVI_A-C ≥ 0.80. Thirty-seven assessment methods were identified in the 160 included articles. Mini Nutritional Assessment (M-CVI_A-C = 0.72), Scored Patient-Generated Subjective Global Assessment (M-CVI_A-C = 0.61), and Subjective Global Assessment (M-CVI_A-C = 0.53) scored highest M-CVI_A-C. A large number of malnutrition assessment methods are used in cancer research. Content validity of these methods varies widely. None of these assessment methods has acceptable content validity, when compared against a construct based on ESPEN and ASPEN definitions of malnutrition. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Is Learner Self-Assessment Reliable and Valid in a Web-Based Portfolio Environment for High School Students?

    ERIC Educational Resources Information Center

    Chang, Chi-Cheng; Liang, Chaoyun; Chen, Yi-Hui

    2013-01-01

    This study explored the reliability and validity of Web-based portfolio self-assessment. Participants were 72 senior high school students enrolled in a computer application course. The students created learning portfolios, viewed peers' work, and performed self-assessment on the Web-based portfolio assessment system. The results indicated: 1)…

  8. Note on concurrent validation of the personality assessment inventory in law enforcement.

    PubMed

    Hays, J R

    1997-08-01

    This study compared the Personality Assessment Inventory and MMPI-168 profiles of 9 law enforcement applicants with published MMPI profiles to provide concurrent validation for the use of the Personality Assessment Inventory to assess personality pathology of peace officer applicants. The sample showed subclinical elevations of the Positive Impression and Treatment Rejection scales on the Personality Assessment Inventory and subclinical elevations on the MMPI validity scales of Lie and Correction and the clinical scales of Psychopathic Deviate and Hypomania. The applicants' mean MMPI profile provided concurrent validation for the use of the Personality Assessment Inventory in this decision on fitness to serve.

  9. Box-ticking and Olympic high jumping - Physicians' perceptions and acceptance of national physician validation systems.

    PubMed

    Sehlbach, Carolin; Govaerts, Marjan J B; Mitchell, Sharon; Rohde, Gernot G U; Smeenk, Frank W J M; Driessen, Erik W

    2018-05-24

    National physician validation systems aim to ensure lifelong learning through periodic appraisals of physicians' competence. Their effectiveness is determined by physicians' acceptance of and commitment to the system. This study, therefore, sought to explore physicians' perceptions and self-reported acceptance of validation across three different physician validation systems in Europe. Using a constructivist grounded-theory approach, we conducted semi-structured interviews with 32 respiratory specialists from three countries with markedly different validation systems: Germany, which has a mandatory, credit-based system oriented to continuing professional development; Denmark, with mandatory annual dialogs and ensuing, non-compulsory activities; and the UK, with a mandatory, portfolio-based revalidation system. We analyzed interview data with a view to identifying factors influencing physicians' perceptions and acceptance. Factors that influenced acceptance were the assessment's authenticity and alignment of its requirements with clinical practice, physicians' beliefs about learning, perceived autonomy, and organizational support. Users' acceptance levels determine any system's effectiveness. To support lifelong learning effectively, national physician validation systems must be carefully designed and integrated into daily practice. Involving physicians in their design may render systems more authentic and improve alignment between individual ambitions and the systems' goals, thereby promoting acceptance.

  10. Discriminant Validity Assessment: Use of Fornell & Larcker criterion versus HTMT Criterion

    NASA Astrophysics Data System (ADS)

    Hamid, M. R. Ab; Sami, W.; Mohmad Sidek, M. H.

    2017-09-01

    Assessment of discriminant validity is a must in any research that involves latent variables for the prevention of multicollinearity issues. The Fornell and Larcker criterion is the most widely used method for this purpose. However, a new method has emerged for assessing discriminant validity: the heterotrait-monotrait (HTMT) ratio of correlations. Therefore, this article presents the results of discriminant validity assessment using these methods. Data from a previous study involving 429 respondents were used for empirical validation of a value-based excellence model in higher education institutions (HEI) in Malaysia. From the analysis, convergent, divergent and discriminant validity were established and admissible using the Fornell and Larcker criterion. However, discriminant validity is an issue when employing the HTMT criterion. This shows that the latent variables under study faced the issue of multicollinearity and should be examined in further detail. This also implies that the HTMT criterion is a stringent measure that could detect a possible lack of discrimination among the latent variables. In conclusion, the instrument, which consisted of six latent variables, was still lacking in terms of discriminant validity and should be explored further.
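
    The two criteria contrasted in the record above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the item correlation matrix, the standardized loadings, and the function names are hypothetical, and the raw (signed) correlations are averaged, which presumes items are coded so that they correlate positively. The abstract does not state which HTMT cutoff was applied; 0.85 and 0.90 are the values commonly cited in the literature.

```python
from itertools import combinations
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs.

    corr is the item correlation matrix; items_a and items_b hold the row
    indices of each construct's items (at least two items per construct).
    """
    if len(items_a) < 2 or len(items_b) < 2:
        raise ValueError("each construct needs at least two items")
    hetero = _mean([corr[i][j] for i in items_a for j in items_b])
    mono_a = _mean([corr[i][j] for i, j in combinations(items_a, 2)])
    mono_b = _mean([corr[i][j] for i, j in combinations(items_b, 2)])
    return hetero / sqrt(mono_a * mono_b)


def fornell_larcker_ok(loadings_a, loadings_b, construct_corr):
    """True if sqrt(AVE) of both constructs exceeds their correlation.

    AVE is taken as the mean squared standardized loading.
    """
    ave_a = _mean([l * l for l in loadings_a])
    ave_b = _mean([l * l for l in loadings_b])
    return min(sqrt(ave_a), sqrt(ave_b)) > abs(construct_corr)


# Hypothetical correlations: items 0-1 measure construct A, items 2-3 construct B.
corr = [
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.5],
    [0.4, 0.4, 0.5, 1.0],
]
print(round(htmt(corr, [0, 1], [2, 3]), 3))
print(fornell_larcker_ok([0.8, 0.7], [0.9, 0.6], construct_corr=0.6))
```

    The two checks can disagree, as the abstract reports: Fornell and Larcker compares each construct's average variance extracted with the inter-construct correlation, while HTMT compares between-construct item correlations with within-construct ones.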

  11. Content Validation and Evaluation of an Endovascular Teamwork Assessment Tool.

    PubMed

    Hull, L; Bicknell, C; Patel, K; Vyas, R; Van Herzeele, I; Sevdalis, N; Rudarakanchana, N

    2016-07-01

    To modify, content validate, and evaluate a teamwork assessment tool for use in endovascular surgery. A multistage, multimethod study was conducted. Stage 1 included expert review and modification of the existing Observational Teamwork Assessment for Surgery (OTAS) tool. Stage 2 included identification of additional exemplar behaviours contributing to effective teamwork and enhanced patient safety in endovascular surgery (using real-time observation, focus groups, and semistructured interviews of multidisciplinary teams). Stage 3 included content validation of exemplar behaviours using expert consensus according to established psychometric recommendations and evaluation of structure, content, feasibility, and usability of the Endovascular Observational Teamwork Assessment Tool (Endo-OTAS) by an expert multidisciplinary panel. Stage 4 included final team expert review of exemplars. OTAS core team behaviours were maintained (communication, coordination, cooperation, leadership, team monitoring). Of the 114 OTAS behavioural exemplars, 19 were modified, four removed, and 39 additional endovascular-specific behaviours identified. Content validation of these 153 exemplar behaviours showed that 113/153 (73.9%) reached the predetermined Item-Content Validity Index rating for teamwork and/or patient safety. After expert team review, 140/153 (91.5%) exemplars were deemed to warrant inclusion in the tool. More than 90% of the expert panel agreed that Endo-OTAS is an appropriate teamwork assessment tool with observable behaviours. Some concerns were noted about the time required to conduct observations and provide performance feedback. Endo-OTAS is a novel teamwork assessment tool, with evidence for content validity and relevance to endovascular teams. Endo-OTAS enables systematic objective assessment of the quality of team performance during endovascular procedures. Copyright © 2016. Published by Elsevier Ltd.
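
    As a rough illustration of the Item-Content Validity Index mentioned in this record, the sketch below computes an I-CVI per item and re-derives the two percentages quoted in the abstract. The 4-point relevance scale, the rule that ratings of 3 or 4 count as relevant, and the sample ratings are assumptions for illustration only. The abstract gives neither the rating scale nor the predetermined cutoff.

```python
def item_cvi(ratings, relevant_min=3):
    """Share of experts rating an item at or above relevant_min.

    Assumes a 4-point relevance scale on which 3 and 4 count as relevant
    (a common convention, not stated in the abstract).
    """
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * part / whole, 1)


# Hypothetical ratings of a single exemplar behaviour by eight experts.
example_ratings = [4, 4, 3, 4, 2, 3, 4, 4]
print("I-CVI:", item_cvi(example_ratings))

# Counts taken from the abstract: 113 and 140 of 153 exemplars.
print("Reached I-CVI threshold:", percent(113, 153), "%")
print("Retained after expert review:", percent(140, 153), "%")
```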

  12. Recommendations for standardizing validation procedures assessing physical activity of older persons by monitoring body postures and movements.

    PubMed

    Lindemann, Ulrich; Zijlstra, Wiebren; Aminian, Kamiar; Chastin, Sebastien F M; de Bruin, Eling D; Helbostad, Jorunn L; Bussmann, Johannes B J

    2014-01-10

    Physical activity is an important determinant of health and well-being in older persons and contributes to their social participation and quality of life. Hence, assessment tools are needed to study this physical activity in free-living conditions. Wearable motion sensing technology is used to assess physical activity. However, there is a lack of harmonisation of validation protocols and applied statistics, which makes it hard to compare available and future studies. Therefore, the aim of this paper is to formulate recommendations for assessing the validity of sensor-based activity monitoring in older persons with focus on the measurement of body postures and movements. Validation studies of body-worn devices providing parameters on body postures and movements were identified and summarized, and an extensive interactive process between authors resulted in recommendations about: information on the assessed persons, the technical system, and the analysis of relevant parameters of physical activity, based on a standardized and semi-structured protocol. The recommended protocols can be regarded as a first attempt to standardize validity studies in the area of monitoring physical activity.

  13. Assessing the reliability and validity of the Chinese Sexual Assault Symptom Scale (C-SASS): scale development and validation.

    PubMed

    Wang, Chang-Hwai; Lee, Jin-Chuan; Yuan, Yu-Hsi

    2014-01-01

    The purpose of this research is to establish and verify the psychometric and structural properties of the self-report Chinese Sexual Assault Symptom Scale (C-SASS) to assess the trauma experienced by Chinese victims of sexual assault. An earlier version of the C-SASS was constructed using a modified list of the same trauma symptoms administered to an American sample and used to develop and validate the Sexual Assault Symptom Scale II (SASS II). The rationale of this study is to revise the earlier version of the C-SASS, using a larger and more representative sample and more robust statistical analysis than in earlier research, to permit a more thorough examination of the instrument and further confirm the dimensions of sexual assault trauma in Chinese victims of rape. In this study, a sample of 418 victims from northern Taiwan was collected to confirm the reliability and validity of the C-SASS. Exploratory factor analysis yielded five common factors: Safety Fears, Self-Blame, Health Fears, Anger and Emotional Lability, and Fears About the Criminal Justice System. Further tests of the validity and composite reliability of the C-SASS were provided by the structural equation modeling (SEM). The results indicated that the C-SASS was a brief, valid, and reliable instrument for assessing sexual assault trauma among Chinese victims in Taiwan. The scale can be used to evaluate victims in sexual assault treatment centers around Taiwan, as well as to capture the characteristics of sexual assault trauma among Chinese victims.

  14. Meaningful Understanding and Systems Thinking in Organic Chemistry: Validating Measurement and Exploring Relationships

    ERIC Educational Resources Information Center

    Vachliotis, Theodoros; Salta, Katerina; Tzougraki, Chryssa

    2014-01-01

    The purpose of this study was dual: First, to develop and validate assessment schemes for assessing 11th grade students' meaningful understanding of organic chemistry concepts, as well as their systems thinking skills in the domain. Second, to explore the relationship between the two constructs of interest based on students' performance…

  15. Reliability and Validity Assessment of a Linear Position Transducer

    PubMed Central

    Garnacho-Castaño, Manuel V.; López-Lastra, Silvia; Maté-Muñoz, José L.

    2015-01-01

    The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training. Key points: This study determined the validity and reliability of peak velocity, average velocity, peak power and average power measurements made using a linear position transducer. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and power. PMID:25729300
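
    Intraclass correlation coefficients like those reported in this record can be computed from a subjects-by-measurements table. The sketch below is a minimal, standard-library version of two common single-measure forms from a two-way ANOVA decomposition: absolute agreement, often written ICC(2,1), and consistency, often written ICC(3,1). The abstract does not say which ICC form the authors used, and the function name and sample velocities are hypothetical.

```python
def icc_two_way(scores):
    """Return (ICC(2,1), ICC(3,1)) for a table of n subjects x k measurements.

    scores is a list of rows, one row per subject, one column per device,
    rater or session. Both are single-measure forms.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: column (device/session) offsets count as error.
    icc_agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    # Consistency: column offsets are ignored.
    icc_consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return icc_agreement, icc_consistency


# Hypothetical peak velocities (m/s) for five lifters on two devices.
velocities = [[1.10, 1.14], [0.95, 0.99], [1.30, 1.33], [0.80, 0.86], [1.05, 1.08]]
agreement, consistency = icc_two_way(velocities)
print(f"ICC(2,1) = {agreement:.3f}, ICC(3,1) = {consistency:.3f}")
```

    The agreement form penalises a constant offset between devices and the consistency form does not. This matters here because the record reports systematic biases alongside high ICCs.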

  16. Assessing Clinical Reasoning (ASCLIRE): Instrument Development and Validation

    ERIC Educational Resources Information Center

    Kunina-Habenicht, Olga; Hautz, Wolf E.; Knigge, Michel; Spies, Claudia; Ahlers, Olaf

    2015-01-01

    Clinical reasoning is an essential competency in medical education. This study aimed at developing and validating a test to assess diagnostic accuracy, collected information, and diagnostic decision time in clinical reasoning. A norm-referenced computer-based test for the assessment of clinical reasoning (ASCLIRE) was developed, integrating the…

  17. Validity of an ultra-wideband local positioning system to measure locomotion in indoor sports.

    PubMed

    Serpiello, F R; Hopkins, W G; Barnes, S; Tavrou, J; Duthie, G M; Aughey, R J; Ball, K

    2018-08-01

    The validity of an Ultra-wideband (UWB) positioning system was investigated during linear and change-of-direction (COD) running drills. Six recreationally-active men performed ten repetitions of four activities (walking, jogging, maximal acceleration, and 45° COD) on an indoor court. Activities were repeated twice, in the centre of the court and on the side. Participants wore a receiver tag (Clearsky T6, Catapult Sports) and two reflective markers placed on the tag to allow for comparisons with the criterion system (Vicon). Distance, mean and peak velocity, acceleration, and deceleration were assessed. Validity was assessed via percentage least-square means difference (Clearsky-Vicon) with 90% confidence interval and magnitude-based inference; typical error was expressed as within-subject standard deviation. The mean differences for distance, mean/peak speed, and mean/peak accelerations in the linear drills were in the range of 0.2-12%, with typical errors between 1.2 and 9.3%. Mean and peak deceleration had larger differences and errors between systems. In the COD drill, moderate-to-large differences were detected for the activity performed in the centre of the court, increasing to large/very large on the side. When filtered and smoothed following a similar process, the UWB-based positioning system had acceptable validity, compared to Vicon, to assess movements representative of indoor sports.

  18. [Assessment of an Evaluation System for Psychiatry Learning].

    PubMed

    Campo-Cabal, Gerardo

    2012-01-01

    Through the analysis of a teaching evaluation system for a Psychiatry course aimed at medical students, the author reviews the basic elements taken into account in a teaching assessment process. The assessment methods used were analyzed, as well as the grades obtained by the students in the four groups into which they were divided. The selected assessment methods are appropriate to evaluate educational objectives; the contents are selected by means of a specification matrix; there is a high correlation coefficient between the grades obtained in previous academic periods and the ones obtained in the course, thus demonstrating the validity of the results (both considering the whole exam or just a part of it). Most of the students are on the right side of the grading curve, which means that the majority of them acquire the knowledge expected. The assessment system used in the Psychopathology course is fair, valid and reliable, specifically concerning the objective methods used, but the conceptual evaluation should be improved or, preferably, eliminated as a constituent part of the evaluation system. Copyright © 2012 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.

  19. Reliability and validity evidence of the Assessment of Language Use in Social Contexts for Adults (ALUSCA).

    PubMed

    Valente, Ana Rita S; Hall, Andreia; Alvelos, Helena; Leahy, Margaret; Jesus, Luis M T

    2018-04-12

    The appropriate use of language in context depends on the speaker's pragmatic language competencies. A coding system was used to develop a specific and adult-focused self-administered questionnaire for adults who stutter and adults who do not stutter, The Assessment of Language Use in Social Contexts for Adults, with three categories: precursors, basic exchanges, and extended literal/non-literal discourse. This paper presents the content validity, item analysis, reliability coefficients and evidence of construct validity of the instrument. Content validity analysis was based on a two-stage process: first, 11 pragmatic questionnaires were assessed to identify items that probe each pragmatic competency and to create the first version of the instrument; second, items were assessed qualitatively by an expert panel composed of adults who stutter and controls, and quantitatively and qualitatively by an expert panel composed of clinicians. A pilot study was conducted with five adults who stutter and five controls to analyse items and calculate reliability. Evidence of construct validity was obtained using the hypothesized relationships method and factor analysis with 28 adults who stutter and 28 controls. Concerning content validity, the questionnaires assessed up to 13 pragmatic competencies. Qualitative and quantitative analysis revealed ambiguities in item construction. Disagreement between experts was resolved through item modification. The pilot study showed that the instrument presented internal consistency and temporal stability. Significant differences between adults who stutter and controls and different response profiles revealed the instrument's underlying construct. The instrument is reliable and presented evidence of construct validity.

  20. Procedure-specific assessment tool for flexible pharyngo-laryngoscopy: gathering validity evidence and setting pass-fail standards.

    PubMed

    Melchiors, Jacob; Petersen, K; Todsen, T; Bohr, A; Konge, Lars; von Buchwald, Christian

    2018-06-01

    The attainment of specific identifiable competencies is the primary measure of progress in the modern medical education system. The system, therefore, requires a method for accurately assessing competence to be feasible. Evidence of validity needs to be gathered before an assessment tool can be implemented in the training and assessment of physicians. This evidence of validity must according to the contemporary theory on validity be gathered from specific sources in a structured and rigorous manner. The flexible pharyngo-laryngoscopy (FPL) is central to the otorhinolaryngologist. We aim to evaluate the flexible pharyngo-laryngoscopy assessment tool (FLEXPAT) created in a previous study and to establish a pass-fail level for proficiency. Eighteen physicians with different levels of experience (novices, intermediates, and experienced) were recruited to the study. Each performed an FPL on two patients. These procedures were video recorded, blinded, and assessed by two specialists. The score was expressed as the percentage of a possible max score. Cronbach's α was used to analyze internal consistency of the data, and a generalizability analysis was performed. The scores of the three different groups were explored, and a pass-fail level was determined using the contrasting groups' standard setting method. Internal consistency was strong with a Cronbach's α of 0.86. We found a generalizability coefficient of 0.72 sufficient for moderate stakes assessment. We found a significant difference between the novice and experienced groups (p < 0.001) and strong correlation between experience and score (Pearson's r = 0.75). The pass/fail level was established at 72% of the maximum score. Applying this pass-fail level in the test population resulted in half of the intermediary group receiving a failing score. We gathered validity evidence for the FLEXPAT according to the contemporary framework as described by Messick. Our results support a claim of validity and are

  1. Validity of Dietary Assessment in Athletes: A Systematic Review

    PubMed Central

    Beck, Kathryn L.; Gifford, Janelle A.; Slater, Gary; Flood, Victoria M.; O’Connor, Helen

    2017-01-01

    Dietary assessment methods that are recognized as appropriate for the general population are usually applied in a similar manner to athletes, despite the knowledge that sport-specific factors can complicate assessment and impact accuracy in unique ways. As dietary assessment methods are used extensively within the field of sports nutrition, there is concern that the validity of methodologies has not undergone more rigorous evaluation in this unique population sub-group. The purpose of this systematic review was to compare two or more methods of dietary assessment, including dietary intake measured against biomarkers or reference measures of energy expenditure, in athletes. Six electronic databases were searched for English-language, full-text articles published from January 1980 until June 2016. The search strategy combined the following keywords: diet, nutrition assessment, athlete, and validity; where the following outcomes are reported but not limited to: energy intake, macro and/or micronutrient intake, food intake, nutritional adequacy, diet quality, or nutritional status. Meta-analysis was performed on studies with sufficient methodological similarity, with between-group standardized mean differences (or effect size) and 95% confidence intervals (CI) being calculated. Of the 1624 studies identified, 18 were eligible for inclusion. Studies comparing self-reported energy intake (EI) to energy expenditure assessed via doubly labelled water were grouped for comparison (n = 11) and demonstrated mean EI was under-estimated by 19% (−2793 ± 1134 kJ/day). Meta-analysis revealed a large pooled effect size of −1.006 (95% CI: −1.3 to −0.7; p < 0.001). The remaining studies (n = 7) compared a new dietary tool or instrument to a reference method(s) (e.g., food record, 24-h dietary recall, biomarker) as part of a validation study. This systematic review revealed there are limited robust studies evaluating dietary assessment methods in athletes. Existing literature

  2. System for assessing Aviation's Global Emissions (SAGE). Version 1.5 : validation assessment, model assumptions and uncertainties

    DOT National Transportation Integrated Search

    2005-09-01

    The United States (US) Federal Aviation Administration (FAA) Office of Environment and Energy (AEE) has developed the System for assessing Aviation's Global Emissions (SAGE) with support from the Volpe National Transportation Systems Center (Vo...

  3. Validity and reliability of Internet-based physiotherapy assessment for musculoskeletal disorders: a systematic review.

    PubMed

    Mani, Suresh; Sharma, Shobha; Omar, Baharudin; Paungmali, Aatit; Joseph, Leonard

    2017-04-01

    Purpose The purpose of this review is to systematically explore and summarise the validity and reliability of telerehabilitation (TR)-based physiotherapy assessment for musculoskeletal disorders. Method A comprehensive systematic literature review was conducted using a number of electronic databases (PubMed, EMBASE, PsycINFO, Cochrane Library and CINAHL) for articles published between January 2000 and May 2015. Studies that examined the validity and the inter- and intra-rater reliabilities of TR-based physiotherapy assessment for musculoskeletal conditions were included. Two independent reviewers used the Quality Appraisal Tool for studies of diagnostic Reliability (QAREL) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool to assess the methodological quality of reliability and validity studies respectively. Results A total of 898 hits were achieved, of which 11 articles based on inclusion criteria were reviewed. Nine studies explored the concurrent validity, inter- and intra-rater reliabilities, while two studies examined only the concurrent validity. Reviewed studies were moderate to good in methodological quality. The physiotherapy assessments such as pain, swelling, range of motion, muscle strength, balance, gait and functional assessment demonstrated good concurrent validity. However, the reported concurrent validity of lumbar spine posture, special orthopaedic tests, neurodynamic tests and scar assessments ranged from low to moderate. Conclusion TR-based physiotherapy assessment was technically feasible with overall good concurrent validity and excellent reliability, except for lumbar spine posture, orthopaedic special tests, neurodynamic tests and scar assessment.

  4. Development and validation of a scale for mouth handicap in systemic sclerosis: the Mouth Handicap in Systemic Sclerosis scale

    PubMed Central

    Mouthon, L; Rannou, F; Bérezné, A; Pagnoux, C; Arène, J‐P; Foïs, E; Cabane, J; Guillevin, L; Revel, M; Fermanian, J; Poiraudeau, S

    2007-01-01

    Objective To develop and assess the reliability and construct validity of a scale assessing disability involving the mouth in systemic sclerosis (SSc). Methods We generated a 34‐item provisional scale from mailed responses of patients (n = 74), expert consensus (n = 10) and literature analysis. A total of 71 other SSc patients were recruited. The test–retest reliability was assessed using the intraclass correlation coefficient and divergent validity using the Spearman correlation coefficient. Factor analysis followed by varimax rotation was performed to assess the factorial structure of the scale. Results The item reduction process retained 12 items with 5 levels of answers (total score range 0–48). The mean total score of the scale was 20.3 (SD 9.7). The test–retest reliability was 0.96. Divergent validity was confirmed for global disability (Health Assessment Questionnaire (HAQ), r = 0.33), hand function (Cochin Hand Function Scale, r = 0.37), inter‐incisor distance (r = −0.34), handicap (McMaster‐Toronto Arthritis questionnaire (MACTAR), r = 0.24), depression (Hospital Anxiety and Depression (HAD); HADd, r = 0.26) and anxiety (HADa, r = 0.17). Factor analysis extracted 3 factors with eigenvalues of 4.26, 1.76 and 1.47, explaining 63% of the variance. These 3 factors could be clinically characterised. The first factor (5 items) represents handicap induced by the reduction in mouth opening, the second (5 items) handicap induced by sicca syndrome and the third (2 items) aesthetic concerns. Conclusion We propose a new scale, the Mouth Handicap in Systemic Sclerosis (MHISS) scale, which has excellent reliability and good construct validity, and assesses specifically disability involving the mouth in patients with SSc. PMID:17502364

  5. Reliability and Validity of the Footprint Assessment Method Using Photoshop CS5 Software.

    PubMed

    Gutiérrez-Vilahú, Lourdes; Massó-Ortigosa, Núria; Costa-Tutusaus, Lluís; Guerra-Balic, Myriam

    2015-05-01

    Several sophisticated methods of footprint analysis currently exist. However, it is sometimes useful to apply standard measurement methods of recognized evidence with an easy and quick application. We sought to assess the reliability and validity of a new method of footprint assessment in a healthy population using Photoshop CS5 software (Adobe Systems Inc, San Jose, California). Forty-two footprints, corresponding to 21 healthy individuals (11 men with a mean ± SD age of 20.45 ± 2.16 years and 10 women with a mean ± SD age of 20.00 ± 1.70 years) were analyzed. Footprints were recorded in static bipedal standing position using optical podography and digital photography. Three trials for each participant were performed. The Hernández-Corvo, Chippaux-Smirak, and Staheli indices and the Clarke angle were calculated by manual method and by computerized method using Photoshop CS5 software. Test-retest was used to determine reliability. Validity was obtained by intraclass correlation coefficient (ICC). The reliability test for all of the indices showed high values (ICC, 0.98-0.99). Moreover, the validity test clearly showed no difference between techniques (ICC, 0.99-1). The reliability and validity of a method to measure, assess, and record the podometric indices using Photoshop CS5 software has been demonstrated. This provides a quick and accurate tool useful for the digital recording of morphostatic foot study parameters and their control.

  6. Interaction of Theory and Practice to Assess External Validity.

    PubMed

    Leviton, Laura C; Trujillo, Mathew D

    2016-01-18

    Variations in local context bedevil the assessment of external validity: the ability to generalize about effects of treatments. For evaluation, the challenges of assessing external validity are intimately tied to the translation and spread of evidence-based interventions. This makes external validity a question for decision makers, who need to determine whether to endorse, fund, or adopt interventions that were found to be effective and how to ensure high quality once they spread. To present the rationale for using theory to assess external validity and the value of more systematic interaction of theory and practice. We review advances in external validity, program theory, practitioner expertise, and local adaptation. Examples are provided for program theory, its adaptation to diverse contexts, and generalizing to contexts that have not yet been studied. The often critical role of practitioner experience is illustrated in these examples. Work is described that the Robert Wood Johnson Foundation is supporting to study treatment variation and context more systematically. Researchers and developers generally see a limited range of contexts in which the intervention is implemented. Individual practitioners see a different and often a wider range of contexts, albeit not a systematic sample. Organized and taken together, however, practitioner experiences can inform external validity by challenging the developers and researchers to consider a wider range of contexts. Researchers have developed a variety of ways to adapt interventions in light of such challenges. In systematic programs of inquiry, as opposed to individual studies, the problems of context can be better addressed. Evaluators have advocated an interaction of theory and practice for many years, but the process can be made more systematic and useful. Systematic interaction can set priorities for assessment of external validity by examining the prevalence and importance of context features and treatment

  7. Development and preliminary validation of an interactive remote physical therapy system.

    PubMed

    Mishra, Anup K; Skubic, Marjorie; Abbott, Carmen

    2015-01-01

    In this paper, we present an interactive physical therapy system (IPTS) for remote quantitative assessment of clients in the home. The system consists of two different interactive interfaces connected through a network, for a real-time low latency video conference using audio, video, skeletal, and depth data streams from a Microsoft Kinect. To test the potential of IPTS, experiments were conducted with 5 independent living senior subjects in Kansas City, MO. Also, experiments were conducted in the lab to validate the real-time biomechanical measures calculated using the skeletal data from the Microsoft Xbox 360 Kinect and Microsoft Xbox One Kinect, with ground truth data from a Vicon motion capture system. Good agreements were found in the validation tests. The results show potential capabilities of the IPTS system to provide remote physical therapy to clients, especially older adults, who may find it difficult to visit the clinic.

  8. Reliability and validity analysis of the transfer assessment instrument.

    PubMed

    McClure, Laura A; Boninger, Michael L; Ozawa, Haishin; Koontz, Alicia

    2011-03-01

    To describe the development and evaluate the reliability and validity of a newly created outcome measure, the Transfer Assessment Instrument (TAI), to assess the quality of transfers performed by full-time wheelchair users. Repeated measures. 2009 National Veterans Wheelchair Games in Spokane, WA. A convenience sample of full-time wheelchair users (N=40) who perform sitting pivot or standing pivot transfers. Not applicable. Intraclass correlation coefficients (ICCs) for reliability and Spearman correlation coefficients for concurrent validity between the TAI and a global assessment scale (0-100 visual analog scale [VAS]). No adverse events occurred during testing. Intrarater ICCs for 3 raters ranged between .35 and .89, and the interrater ICC was .642. Correlations between the TAI and a global assessment VAS ranged between .19 (P=.285) and .69 (P>.000). Item analyses of the tool found a wide range of results, from weak to good reliability. Evaluators found the TAI to be safe and able to be completed in a short time. The TAI is a safe, quick outcome measure that uses equipment typically found in a clinical setting and does not ask participants to perform new skills. Reliability and validity testing found the TAI to have acceptable interrater and a wide range of intrarater reliability. Future work indicates the need for continued refinement including removal or modification of items found to have low reliability, improved education for clinicians, and further reliability and validity analysis with a more diverse subject population. The TAI has the potential to fill a void in assessment of transfers. Copyright © 2011 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  9. Gait assessment using the Microsoft Xbox One Kinect: Concurrent validity and inter-day reliability of spatiotemporal and kinematic variables.

    PubMed

    Mentiplay, Benjamin F; Perraton, Luke G; Bower, Kelly J; Pua, Yong-Hao; McGaw, Rebekah; Heywood, Sophie; Clark, Ross A

    2015-07-16

    The revised Xbox One Kinect, also known as the Microsoft Kinect V2 for Windows, includes enhanced hardware which may improve its utility as a gait assessment tool. This study examined the concurrent validity and inter-day reliability of spatiotemporal and kinematic gait parameters estimated using the Kinect V2 automated body tracking system and a criterion reference three-dimensional motion analysis (3DMA) marker-based camera system. Thirty healthy adults performed two testing sessions consisting of comfortable and fast paced walking trials. Spatiotemporal outcome measures related to gait speed, speed variability, step length, width and time, foot swing velocity and medial-lateral and vertical pelvis displacement were examined. Kinematic outcome measures including ankle flexion, knee flexion and adduction and hip flexion were examined. To assess the agreement between Kinect and 3DMA systems, Bland-Altman plots, relative agreement (Pearson's correlation) and overall agreement (concordance correlation coefficients) were determined. Reliability was assessed using intraclass correlation coefficients, Cronbach's alpha and standard error of measurement. The spatiotemporal measurements had consistently excellent (r≥0.75) concurrent validity, with the exception of modest validity for medial-lateral pelvis sway (r=0.45-0.46) and fast paced gait speed variability (r=0.73). In contrast kinematic validity was consistently poor to modest, with all associations between the systems weak (r<0.50). In those measures with acceptable validity, the inter-day reliability was similar between systems. In conclusion, while the Kinect V2 body tracking may not accurately obtain lower body kinematic data, it shows great potential as a tool for measuring spatiotemporal aspects of gait. Copyright © 2015 Elsevier Ltd. All rights reserved.
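
    The agreement statistics named in this record (Bland-Altman analysis and the concordance correlation coefficient) can be sketched in a few lines of standard-library Python. The function names and the sample gait speeds are hypothetical. The sketch assumes the conventional 1.96 SD limits of agreement and Lin's definition of the concordance correlation coefficient, which uses population (divide-by-n) moments.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement for paired data."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # conventional 95% limits of agreement
    return bias, bias - spread, bias + spread


def concordance_correlation(a, b):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(a)
    mean_a, mean_b = mean(a), mean(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    # Penalises both scatter and any shift between the two methods.
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)


# Hypothetical gait speeds (m/s) from a depth sensor and a marker-based system.
sensor = [1.21, 1.35, 1.10, 1.48, 1.29, 1.40]
reference = [1.25, 1.33, 1.15, 1.50, 1.27, 1.45]
bias, lower, upper = bland_altman(sensor, reference)
ccc = concordance_correlation(sensor, reference)
print(f"bias = {bias:.3f}, limits = ({lower:.3f}, {upper:.3f}), CCC = {ccc:.3f}")
```

    Pearson's r measures only relative agreement, whereas the concordance coefficient also drops when one system reads consistently higher than the other. This is why the abstract separates the two.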

  10. Reliability and concurrent validity of the Microsoft Xbox One Kinect for assessment of standing balance and postural control.

    PubMed

    Clark, Ross A; Pua, Yong-Hao; Oliveira, Cristino C; Bower, Kelly J; Thilarajah, Shamala; McGaw, Rebekah; Hasanki, Ksaniel; Mentiplay, Benjamin F

    2015-07-01

    The Microsoft Kinect V2 for Windows, also known as the Xbox One Kinect, includes new and potentially far improved depth and image sensors which may increase its accuracy for assessing postural control and balance. The aim of this study was to assess the concurrent validity and reliability of kinematic data recorded using a marker-based three dimensional motion analysis (3DMA) system and the Kinect V2 during a variety of static and dynamic balance assessments. Thirty healthy adults performed two sessions, separated by one week, consisting of static standing balance tests under different visual (eyes open vs. closed) and supportive (single limb vs. double limb) conditions, and dynamic balance tests consisting of forward and lateral reach and an assessment of limits of stability. Marker coordinate and joint angle data were concurrently recorded using the Kinect V2 skeletal tracking algorithm and the 3DMA system. Task-specific outcome measures from each system on Day 1 and 2 were compared. Concurrent validity of trunk angle data during the dynamic tasks and anterior-posterior range and path length in the static balance tasks was excellent (Pearson's r>0.75). In contrast, concurrent validity for medial-lateral range and path length was poor to modest for all trials except single leg eyes closed balance. Within device test-retest reliability was variable; however, the results were generally comparable between devices. In conclusion, the Kinect V2 has the potential to be used as a reliable and valid tool for the assessment of some aspects of balance performance. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Reliability and Validity of the Greek Migraine Disability Assessment (MIDAS) Questionnaire.

    PubMed

    Oikonomidi, Theodora; Vikelis, Michail; Artemiadis, Artemios; Chrousos, George P; Darviri, Christina

    2018-03-01

    The Migraine Disability Assessment (MIDAS) Questionnaire is a reliable and valid instrument for migraine-related disability. Such a tool is needed to quantify migraine-related disability in the Greek population. This validation study aims to assess the test-retest reliability, internal consistency, item discriminant and convergent validity of the Greek translation of the MIDAS. Adults diagnosed with migraine completed the MIDAS Questionnaire on two occasions 3 weeks apart to assess reliability, and completed the RAND-36 to assess validity. Participants (n = 152) had a median MIDAS score of 24 and mostly severe disability (58% were grade IV). The test-retest reliability analysis (N = 59) revealed excellent reliability for the total score. Internal consistency was α = 0.71 for initial and α = 0.82 for retest completion. For item discriminant validity, the correlations between each question and the total score were significant, with high correlations for questions 2-5 (range 0.67 ≤ r ≤ 0.79; p < 0.01). For convergent validity, there was significant negative correlation between the total score and all RAND-36 subscales except for 'emotional wellbeing'. The negative correlation indicates that patients with a lower degree of disability according to their MIDAS score tended to have better wellbeing. Psychometric properties are comparable with those of other published validation studies of the MIDAS and the original. Findings on question 1 show that missing work/school days may be closely related with increased affect issues. The Greek version of the MIDAS Questionnaire has good reliability and validity. This study allowed for cross-cultural comparability of research findings.
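
    For readers unfamiliar with the internal consistency figures quoted above, Cronbach's α can be sketched as follows. This is only an illustration of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores); the function name and the response values are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, aligned by respondent."""
    k = len(items)
    # Total score per respondent, summing across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers: 3 items x 5 respondents (not data from the study).
responses = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]
print(round(cronbach_alpha(responses), 2))
```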

  12. Validity of Sensory Systems as Distinct Constructs

    PubMed Central

    Su, Chia-Ting

    2014-01-01

    This study investigated the validity of sensory systems as distinct measurable constructs as part of a larger project examining Ayres’s theory of sensory integration. Confirmatory factor analysis (CFA) was conducted to test whether sensory questionnaire items represent distinct sensory system constructs. Data were obtained from clinical records of two age groups, 2- to 5-yr-olds (n = 231) and 6- to 10-yr-olds (n = 223). With each group, we tested several CFA models for goodness of fit with the data. The accepted model was identical for each group and indicated that tactile, vestibular–proprioceptive, visual, and auditory systems form distinct, valid factors that are not age dependent. In contrast, alternative models that grouped items according to sensory processing problems (e.g., over- or underresponsiveness within or across sensory systems) did not yield valid factors. Results indicate that distinct sensory system constructs can be measured validly using questionnaire data. PMID:25184467

  13. Modelling the pre-assessment learning effects of assessment: evidence in the validity chain.

    PubMed

    Cilliers, Francois J; Schuwirth, Lambert W T; van der Vleuten, Cees P M

    2012-11-01

    We previously developed a model of the pre-assessment learning effects of consequential assessment and started to validate it. The model comprises assessment factors, mechanism factors and learning effects. The purpose of this study was to continue the validation process. For stringency, we focused on a subset of assessment factor-learning effect associations that featured least commonly in a baseline qualitative study. Our aim was to determine whether these uncommon associations were operational in a broader but similar population to that in which the model was initially derived. A cross-sectional survey of 361 senior medical students at one medical school was undertaken using a purpose-made questionnaire based on a grounded theory and comprising pairs of written situational tests. In each pair, the manifestation of an assessment factor was varied. The frequencies at which learning effects were selected were compared for each item pair, using an adjusted alpha to assign significance. The frequencies at which mechanism factors were selected were calculated. There were significant differences in the learning effect selected between the two scenarios of an item pair for 13 of this subset of 21 uncommon associations, even when a p-value of < 0.00625 was considered to indicate significance. Three mechanism factors were operational in most scenarios: agency, response efficacy, and response value. For a subset of uncommon associations in the model, the role of most assessment factor-learning effect associations and the mechanism factors involved were supported in a broader but similar population to that in which the model was derived. Although model validation is an ongoing process, these results move the model one step closer to the stage of usefully informing interventions. Results illustrate how factors not typically included in studies of the learning effects of assessment could confound the results of interventions aimed at using assessment to influence learning.

  14. The CMEMS-Med-MFC-Biogeochemistry operational system: implementation of NRT and Multi-Year validation tools

    NASA Astrophysics Data System (ADS)

    Salon, Stefano; Cossarini, Gianpiero; Bolzon, Giorgio; Teruzzi, Anna

    2017-04-01

    The Mediterranean Monitoring and Forecasting Centre (Med-MFC) is one of the regional production centres of the EU Copernicus Marine Environment Monitoring Service (CMEMS). Med-MFC manages a suite of numerical model systems for the operational delivery of the CMEMS products, providing continuous monitoring and forecasting of the Mediterranean marine environment. The CMEMS products of fundamental biogeochemical variables (chlorophyll, nitrate, phosphate, oxygen, phytoplankton biomass, primary productivity, pH, pCO2) are organised as gridded datasets and are available at the marine.copernicus.eu web portal. Quantitative estimates of CMEMS product accuracy are prerequisites for releasing reliable information to intermediate users, end users and other downstream services. In particular, validation activities aim to deliver accuracy information on the model products and to serve as long-term monitoring of the performance of the modelling systems. The quality assessment of model output is implemented using a multi-stage approach inspired by the classic "GODAE 4 Classes" metrics and criteria (consistency, quality, performance and benefit). Firstly, pre-operational runs qualify the operational model system against historical data, also providing a verification of the improvements of the new model system release with respect to the previous version. Then, the near real time (NRT) validation aims at delivering a sustained on-line skill assessment of the model analysis and forecast, relying on the relevant observations available in NRT (e.g. in situ, Bio Argo and satellite observations). NRT validation results are produced on a weekly basis and published on the MEDEAF web portal (www.medeaf.inogs.it). On a quarterly basis, the integration of the NRT validation activities delivers a comprehensive view of the accuracy of model forecast through the official CMEMS validation webpage. Multi-Year production (e.g. reanalysis runs) follows a similar procedure, and the

  15. An Extended Validity Argument for Assessing Feedback Culture.

    PubMed

    Rougas, Steven; Clyne, Brian; Cianciolo, Anna T; Chan, Teresa M; Sherbino, Jonathan; Yarris, Lalena M

    2015-01-01

    NEGEA 2015 CONFERENCE ABSTRACT (EDITED): Measuring an Organization's Culture of Feedback: Can It Be Done? Steven Rougas and Brian Clyne. CONSTRUCT: This study sought to develop a construct for measuring formative feedback culture in an academic emergency medicine department. Four archetypes (Market, Adhocracy, Clan, Hierarchy) reflecting an organization's values with respect to focus (internal vs. external) and process (flexibility vs. stability and control) were used to characterize one department's receptiveness to formative feedback. The prevalence of residents' identification with certain archetypes served as an indicator of the department's organizational feedback culture. New regulations have forced academic institutions to implement wide-ranging changes to accommodate competency-based milestones and their assessment. These changes challenge residencies that use formative feedback from faculty as a major source of data for determining training advancement. Though various approaches have been taken to improve formative feedback to residents, there currently exists no tool to objectively measure the organizational culture that surrounds this process. Assessing organizational culture, commonly used in the business sector to represent organizational health, may help residency directors gauge their program's success in fostering formative feedback. The Organizational Culture Assessment Instrument (OCAI) is widely used, extensively validated, applicable to survey research, and theoretically based and may be modifiable to assess formative feedback culture in the emergency department. Using a modified Delphi technique and several iterations of focus groups amongst educators at one institution, four of the original six OCAI domains (which each contain 4 possible responses) were modified to create a 16-item Formative Feedback Culture Tool (FFCT) that was administered to 26 residents (response rate = 55%) at a single academic emergency medicine department. The mean

  16. The Validity of Individual Rorschach Variables: Systematic Reviews and Meta-Analyses of the Comprehensive System

    ERIC Educational Resources Information Center

    Mihura, Joni L.; Meyer, Gregory J.; Dumitrascu, Nicolae; Bombel, George

    2013-01-01

    We systematically evaluated the peer-reviewed Rorschach validity literature for the 65 main variables in the popular Comprehensive System (CS). Across 53 meta-analyses examining variables against externally assessed criteria (e.g., observer ratings, psychiatric diagnosis), the mean validity was r = 0.27 (k = 770) as compared to r = 0.08 (k = 386)…

  17. Validation of biomarkers of food intake-critical assessment of candidate biomarkers.

    PubMed

    Dragsted, L O; Gao, Q; Scalbert, A; Vergères, G; Kolehmainen, M; Manach, C; Brennan, L; Afman, L A; Wishart, D S; Andres Lacueva, C; Garcia-Aloy, M; Verhagen, H; Feskens, E J M; Praticò, G

    2018-01-01

    Biomarkers of food intake (BFIs) are a promising tool for limiting misclassification in nutrition research where more subjective dietary assessment instruments are used. They may also be used to assess compliance to dietary guidelines or to a dietary intervention. Biomarkers therefore hold promise for direct and objective measurement of food intake. However, the number of comprehensively validated biomarkers of food intake is limited to just a few. Many new candidate biomarkers emerge from metabolic profiling studies and from advances in food chemistry. Furthermore, candidate food intake biomarkers may also be identified based on extensive literature reviews such as described in the guidelines for Biomarker of Food Intake Reviews (BFIRev). To systematically and critically assess the validity of candidate biomarkers of food intake, it is necessary to outline and streamline an optimal and reproducible validation process. A consensus-based procedure was used to provide and evaluate a set of the most important criteria for systematic validation of BFIs. As a result, a validation procedure was developed including eight criteria, plausibility, dose-response, time-response, robustness, reliability, stability, analytical performance, and inter-laboratory reproducibility. The validation has a dual purpose: (1) to estimate the current level of validation of candidate biomarkers of food intake based on an objective and systematic approach and (2) to pinpoint which additional studies are needed to provide full validation of each candidate biomarker of food intake. This position paper on biomarker of food intake validation outlines the second step of the BFIRev procedure but may also be used as such for validation of new candidate biomarkers identified, e.g., in food metabolomic studies.

  18. The Stroop test as a measure of performance validity in adults clinically referred for neuropsychological assessment.

    PubMed

    Erdodi, Laszlo A; Sagar, Sanya; Seke, Kristian; Zuccato, Brandon G; Schwartz, Eben S; Roth, Robert M

    2018-06-01

    This study was designed to develop performance validity indicators embedded within the Delis-Kaplan Executive Function System (D-KEFS) version of the Stroop task. Archival data from a mixed clinical sample of 132 patients (50% male; M Age = 43.4; M Education = 14.1) clinically referred for neuropsychological assessment were analyzed. Criterion measures included the Warrington Recognition Memory Test-Words and 2 composites based on several independent validity indicators. An age-corrected scaled score ≤6 on any of the 4 trials reliably differentiated psychometrically defined credible and noncredible response sets with high specificity (.87-.94) and variable sensitivity (.34-.71). An inverted Stroop effect was less sensitive (.14-.29), but comparably specific (.85-.90) to invalid performance. Aggregating the newly developed D-KEFS Stroop validity indicators further improved classification accuracy. Failing the validity cutoffs was unrelated to self-reported depression or anxiety. However, it was associated with elevated somatic symptom report. In addition to processing speed and executive function, the D-KEFS version of the Stroop task can function as a measure of performance validity. A multivariate approach to performance validity assessment is generally superior to univariate models. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
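
    The cutoff logic described above (a scaled score ≤6 on any of the 4 trials) and the resulting sensitivity and specificity can be sketched as follows. This is an illustrative sketch only: the function names, the sample profiles and their criterion labels are hypothetical and do not reproduce the study's data or scoring software.

```python
def sensitivity_specificity(flagged, noncredible):
    """Both arguments are equal-length lists of booleans, one entry per patient."""
    tp = sum(f and n for f, n in zip(flagged, noncredible))
    fn = sum((not f) and n for f, n in zip(flagged, noncredible))
    tn = sum((not f) and (not n) for f, n in zip(flagged, noncredible))
    fp = sum(f and (not n) for f, n in zip(flagged, noncredible))
    return tp / (tp + fn), tn / (tn + fp)


def fails_cutoff(scaled_scores, cutoff=6):
    """Flag a profile when any trial's scaled score is at or below the cutoff."""
    return any(score <= cutoff for score in scaled_scores)


# Hypothetical scaled scores on four trials, each with a hypothetical
# criterion label (True = noncredible according to the criterion measure).
profiles = [([5, 8, 9, 7], True), ([10, 11, 9, 12], False),
            ([8, 9, 10, 9], True), ([7, 6, 9, 10], False),
            ([12, 10, 11, 9], False)]
flagged = [fails_cutoff(scores) for scores, _ in profiles]
noncredible = [label for _, label in profiles]
sens, spec = sensitivity_specificity(flagged, noncredible)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```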

  19. Content validity of the Geriatric Health Assessment Instrument

    PubMed Central

    Pedreira, Rhaine Borges Santos; Rocha, Saulo Vasconcelos; dos Santos, Clarice Alves; Vasconcelos, Lélia Renata Carneiro; Reis, Martha Cerqueira

    2016-01-01

    Objective: To assess the content validity of the Elderly Health Assessment Tool for older adults with low education. Methods: The data collection instrument/questionnaire was prepared and submitted to an expert panel comprising four healthcare professionals experienced in research on epidemiology of aging. The experts were allowed to suggest item inclusion/exclusion and were asked to rate the ability of individual items in questionnaire blocks to encompass target dimensions as “not valid”, “somewhat valid” or “valid”, using an interval scale. Percent agreement and the Content Validity Index were used as measurements of inter-rater agreement; the minimum acceptable inter-rater agreement was set at 80%. Results: The mean instrument percent agreement rate was 86%, ranging from 63 to 99%, and from 50 to 100% between and within blocks respectively. The mean Content Validity Index score was 93.47%, ranging from 50 to 100% between individual items. Conclusion: The instrument showed acceptable psychometric properties for application in geriatric populations with low levels of education. It enabled identifying diseases and assisted in choice of strategies related to health of the elderly. PMID:27462889
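
    The two agreement measures named in this abstract can be sketched roughly as below. The abstract does not give the exact computation, so this rests on assumptions: the item-level Content Validity Index is taken as the share of experts who rate an item "valid", and percent agreement as the share of experts choosing the most common rating. The function names and the panel ratings are hypothetical; only the 80% threshold comes from the abstract.

```python
from collections import Counter


def item_cvi(ratings, relevant=("valid",)):
    """Share of experts whose rating counts as relevant for one item."""
    return sum(r in relevant for r in ratings) / len(ratings)


def percent_agreement(ratings):
    """Share of experts who chose the most common rating for one item."""
    most_common_count = Counter(ratings).most_common(1)[0][1]
    return most_common_count / len(ratings)


# Hypothetical ratings from four experts for three items (not study data).
panel = {
    "item_1": ["valid", "valid", "valid", "valid"],
    "item_2": ["valid", "valid", "somewhat valid", "valid"],
    "item_3": ["not valid", "valid", "somewhat valid", "valid"],
}
for name, ratings in panel.items():
    cvi = item_cvi(ratings)
    agreement = percent_agreement(ratings)
    # 0.80 mirrors the minimum acceptable agreement stated in the abstract.
    status = "ok" if agreement >= 0.80 else "review"
    print(f"{name}: CVI={cvi:.2f}, agreement={agreement:.2f}, {status}")
```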

  20. Preschool Temperament Assessment: A Quantitative Assessment of the Validity of Behavioral Style Questionnaire Data

    ERIC Educational Resources Information Center

    Huelsman, Timothy J.; Gagnon, Sandra Glover; Kidder-Ashley, Pamela; Griggs, Marissa Swaim

    2014-01-01

    Research Findings: Child temperament is an important construct, but its measurement has been marked by a number of weaknesses that have diminished the frequency with which it is assessed in practice. We address this problem by presenting the results of a quantitative construct validation study. We calculated validity indices by hypothesizing the…

  1. Incremental Validity of Test Session and Classroom Observations in a Multimethod Assessment of Attention Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    McConaughy, Stephanie H.; Harder, Valerie S.; Antshel, Kevin M.; Gordon, Michael; Eiraldi, Ricardo; Dumenci, Levent

    2010-01-01

    This study tested the incremental validity of behavioral observations, over and above parent and teacher reports, for assessing symptoms of Attention Deficit/Hyperactivity Disorder (ADHD) in children ages 6 to 12, using the Test Observation Form (TOF) and Direct Observation Form (DOF) from the Achenbach System of Empirically Based Assessment. The…

  2. Multi-platform operational validation of the Western Mediterranean SOCIB forecasting system

    NASA Astrophysics Data System (ADS)

    Juza, Mélanie; Mourre, Baptiste; Renault, Lionel; Tintoré, Joaquin

    2014-05-01

    The development of science-based ocean forecasting systems at global, regional, and local scales can support better management of the marine environment (maritime security, environmental and resources protection, maritime and commercial operations, tourism, ...). In this context, SOCIB (the Balearic Islands Coastal Observing and Forecasting System, www.socib.es) has developed an operational ocean forecasting system in the Western Mediterranean Sea (WMOP). WMOP uses a regional configuration of the Regional Ocean Modelling System (ROMS, Shchepetkin and McWilliams, 2005) nested in the larger scale Mediterranean Forecasting System (MFS) with a spatial resolution of 1.5-2 km. WMOP aims at reproducing both the basin-scale ocean circulation and the mesoscale variability, which is known to play a crucial role due to its strong interaction with the large scale circulation in this region. An operational validation system has been developed to systematically assess the model outputs at daily, monthly and seasonal time scales. Multi-platform observations are used for this validation, including satellite products (Sea Surface Temperature, Sea Level Anomaly), in situ measurements (from gliders, Argo floats, drifters and fixed moorings) and High-Frequency radar data. The validation procedures make it possible to monitor and certify the general realism of the daily production of the ocean forecasting system before its distribution to users. Additionally, different indicators (Sea Surface Temperature and Salinity, Eddy Kinetic Energy, Mixed Layer Depth, Heat Content, transports in key sections) are computed every day both at the basin-scale and in several sub-regions (Alboran Sea, Balearic Sea, Gulf of Lion). The daily forecasts, validation diagnostics and indicators from the operational model over the last months are available at www.socib.es.

  3. Construct Validity of Three Clerkship Performance Assessments

    ERIC Educational Resources Information Center

    Lee, Ming; Wimmers, Paul F.

    2010-01-01

    This study examined construct validity of three commonly used clerkship performance assessments: preceptors' evaluations, OSCE-type clinical performance measures, and the NBME [National Board of Medical Examiners] medicine subject examination. Six hundred and eighty-six students taking the inpatient medicine clerkship from 2003 to 2007…

  4. When Significant Others Suffer: German Validation of the Burden Assessment Scale (BAS)

    PubMed Central

    Hunger, Christina; Krause, Lena; Hilzinger, Rebecca; Ditzen, Beate; Schweitzer, Jochen

    2016-01-01

    There is a need for an economical, reliable, and valid instrument in the German-speaking countries to measure the burden of relatives who care for mentally ill persons. We translated the Burden Assessment Scale (BAS) and conducted a study investigating factor structure, psychometric quality and predictive validity. We used confirmatory factor analyses (CFA, maximum-likelihood method) to examine the dimensionality of the German BAS in a sample of 215 relatives (72% women; M = 32 years, SD = 14, range: 18 to 77; 39% employed) of mentally ill persons (50% (ex-)partner or (best) friend; M = 32 years, SD = 13, range 8 to 64; main complaints were depression and/or anxiety). Cronbach’s α determined the internal consistency. We examined predictive validity using regression analyses including the BAS and validated scales of social systems functioning (Experience In Social Systems Questionnaire, EXIS.pers, EXIS.org) and psychopathology (Brief Symptom Inventory, BSI). Variables that might have influenced the dependent variables (e.g. age, gender, education, employment and civil status) were controlled by their introduction in the first step, and the BAS in the second step of the regression analyses. A model with four correlated factors (Disrupted Activities, Personal Distress, Time Perspective, Guilt) showed the best fit. With respect to the number of items included, the internal consistency was very good. The modified German BAS predicted relatives’ social systems functioning and psychopathology. The economical design makes the 19-item BAS promising for practice-oriented research, and for studies under time constraints. Strengths, limitations and future directions are discussed. PMID:27764109

  5. A knowledge-driven approach to cluster validity assessment.

    PubMed

    Bolshakova, Nadia; Azuaje, Francisco; Cunningham, Pádraig

    2005-05-15

    This paper presents an approach to assessing cluster validity based on similarity knowledge extracted from the Gene Ontology. The program is freely available for non-profit use on request from the authors.

  6. Validity: Applying Current Concepts and Standards to Gynecologic Surgery Performance Assessments

    ERIC Educational Resources Information Center

    LeClaire, Edgar L.; Nihira, Mikio A.; Hardré, Patricia L.

    2015-01-01

    Validity is critical for meaningful assessment of surgical competency. According to the Standards for Educational and Psychological Testing, validation involves the integration of data from well-defined classifications of evidence. In the authoritative framework, data from all classifications support construct validity claims. The two aims of this…

  7. Validity, Reliability, and Inertia of Four Different Temperature Capsule Systems.

    PubMed

    Bongers, Coen C W G; Daanen, Hein A M; Bogerd, Cornelis P; Hopman, Maria T E; Eijsvogels, Thijs M H

    2018-01-01

    Telemetric temperature capsule systems are wireless, relatively noninvasive, and easily applicable in field conditions and have therefore great advantages for monitoring core body temperature. However, the accuracy and responsiveness of available capsule systems have not been compared previously. Therefore, the aim of this study was to examine the validity, reliability, and inertia characteristics of four ingestible temperature capsule systems (i.e., CorTemp, e-Celsius, myTemp, and VitalSense). Ten temperature capsules were examined for each system in a temperature-controlled water bath during three trials. The water bath temperature gradually increased from 33°C to 44°C in trials 1 and 2 to assess the validity and reliability, and from 36°C to 42°C in trial 3 to assess the inertia characteristics of the temperature capsules. A systematic difference between capsule and water bath temperature was found for CorTemp (0.077°C ± 0.040°C), e-Celsius (-0.081°C ± 0.055°C), myTemp (-0.003°C ± 0.006°C), and VitalSense (-0.017°C ± 0.023°C; P < 0.010), with the lowest bias for the myTemp system (P < 0.001). A systematic difference was found between trial 1 and trial 2 for CorTemp (0.017°C ± 0.083°C; P = 0.030) and e-Celsius (-0.007°C ± 0.033°C; P = 0.019), whereas temperature values of myTemp (0.001°C ± 0.008°C) and VitalSense (0.002°C ± 0.014°C) did not differ (P > 0.05). Comparable inertia characteristics were found for CorTemp (25 ± 4 s), e-Celsius (21 ± 13 s), and myTemp (19 ± 2 s), whereas the VitalSense system responded more slowly (39 ± 6 s) to changes in water bath temperature (P < 0.001). Although differences in temperature and inertia were observed between capsule systems, an excellent validity, test-retest reliability, and inertia was found for each system between 36°C and 44°C after removal of outliers.

  8. The Communication Function Classification System: cultural adaptation, validity, and reliability of the Farsi version for patients with cerebral palsy.

    PubMed

    Soleymani, Zahra; Joveini, Ghodsiye; Baghestani, Ahmad Reza

    2015-03-01

    This study developed a Farsi-language Communication Function Classification System and then tested its reliability and validity. The Communication Function Classification System is designed to classify the communication functions of individuals with cerebral palsy. Up until now, there has been no instrument for assessment of this communication function in Iran. The English Communication Function Classification System was translated into Farsi and cross-culturally modified by a panel of experts. Professionals and parents then assessed the content validity of the modified version. A backtranslation of the Farsi version was confirmed by the developer of the English Communication Function Classification System. Face validity was assessed by therapists and parents of 10 patients. The Farsi Communication Function Classification System was administered to 152 individuals with cerebral palsy (age, 2 to 18 years; median age, 10 years; mean age, 9.9 years; standard deviation, 4.3 years). Inter-rater reliability was analyzed between parents, occupational therapists, and speech and language pathologists. The test-retest reliability was assessed for 75 patients with a 14-day interval between tests. The inter-rater reliability of the Communication Function Classification System was 0.81 between speech and language pathologists and occupational therapists, 0.74 between parents and occupational therapists, and 0.88 between parents and speech and language pathologists. The test-retest reliability was 0.96 for occupational therapists, 0.98 for speech and language pathologists, and 0.94 for parents. The findings suggest that the Farsi version of the Communication Function Classification System is a reliable and valid measure that can be used in clinical settings to assess communication function in patients with cerebral palsy. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. Pain assessment in children: theoretical and empirical validity.

    PubMed

    Villarruel, A M; Denyes, M J

    1991-12-01

    Valid assessment of pain in children is foundational for both the nursing practice and research domains, yet few validated methods of pain measurement are currently available for young children. This article describes an innovative research approach used in the development of photographic instruments to measure pain intensity in young African-American and Hispanic children. The instruments were designed to enable children to participate actively in their own care and to do so in ways that are congruent with their developmental and cultural heritage. Conceptualization of the instruments, methodological development, and validation processes grounded in Orem's Self-Care Deficit Theory of Nursing are described. The authors discuss the ways in which the gaps between nursing theory, research, and practice are narrowed when development of instruments to measure clinical nursing phenomena are grounded in nursing theory, validated through research and utilized in practice settings.

  10. Validation of a novel venous duplex ultrasound objective structured assessment of technical skills for the assessment of venous reflux.

    PubMed

    Jaffer, Usman; Normahani, Pasha; Lackenby, Kimberly; Aslam, Mohammed; Standfield, Nigel J

    2015-01-01

    Duplex ultrasound measurement of reflux time is central to the diagnosis of venous incompetence. We have developed an assessment tool for Duplex measurement of venous reflux for both simulator and patient-based training. A novel assessment tool, Venous Duplex Ultrasound Assessment of Technical Skills (V-DUOSATS), was developed. A modified DUOSATS was used for simulator training. Participants of varying skill level were invited to view an instructional video and were allowed ample time to familiarize themselves with the Duplex equipment. Attempts made by the participants were recorded and independently assessed by 3 expert assessors and 5 novice assessors using the modified V-DUOSATS. "Global" assessment was also done by expert assessors on a 4-point Likert scale. Content, construct, and concurrent validities as well as reliability were evaluated. Content and construct validity as well as reliability were demonstrated. Receiver operating characteristic analysis established that cut points of 19/22 and 21/30 were most appropriate for simulator and patient-based assessment, respectively. We have validated a novel assessment tool for Duplex venous reflux measurement. Further work is required to establish transference validity of simulator training to improve skill in scanning patients. We have developed and validated V-DUOSATS for simulator training. Copyright © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.

  11. Development and validation of a Clinical Assessment Tool for Nursing Education (CAT-NE).

    PubMed

    Skúladóttir, Hafdís; Svavarsdóttir, Margrét Hrönn

    2016-09-01

    The aim of this study was to develop a valid assessment tool to guide clinical education and evaluate students' performance in clinical nursing education. The development of the Clinical Assessment Tool for Nursing Education (CAT-NE) was based on the theory of nursing as professional caring and the Bologna learning outcomes. Benson and Clark's four steps of instrument development and validation guided the development and assessment of the tool. A mixed-methods approach with individual structured cognitive interviewing and quantitative assessments was used to validate the tool. Supervisory teachers, a pedagogical consultant, clinical expert teachers, clinical teachers, and nursing students at the University of Akureyri in Iceland participated in the process. This assessment tool is valid to assess the clinical performance of nursing students; it consists of rubrics that list the criteria for the students' expected performance. According to the students and their clinical teachers, the assessment tool clarified learning objectives, enhanced the focus of the assessment process, and made evaluation more objective. Training clinical teachers on how to assess students' performances in clinical studies and use the tool enhanced the quality of clinical assessment in nursing education. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Validity of the Microcomputer Evaluation Screening and Assessment Aptitude Scores.

    ERIC Educational Resources Information Center

    Janikowski, Timothy P.; And Others

    1991-01-01

    Examined validity of Microcomputer Evaluation Screening and Assessment (MESA) aptitude scores relative to General Aptitude Test Battery (GATB) using multitrait-multimethod correlational analyses. Findings from 54 rehabilitation clients and 29 displaced workers revealed no evidence to support the construct validity of the MESA. (Author/NB)

  13. Health Information Technology Usability Evaluation Scale (Health-ITUES) for Usability Assessment of Mobile Health Technology: Validation Study

    PubMed Central

    Cho, Hwayoung; Liu, Jianfang

    2018-01-01

    Background Mobile technology has become a ubiquitous technology and can be particularly useful in the delivery of health interventions. This technology can allow us to deliver interventions to scale, cover broad geographic areas, and deliver technologies in highly tailored ways based on the preferences or characteristics of users. The broad use of mobile technologies supports the need for usability assessments of these tools. Although there have been a number of usability assessment instruments developed, none have been validated for use with mobile technologies. Objective The goal of this work was to validate the Health Information Technology Usability Evaluation Scale (Health-ITUES), a customizable usability assessment instrument in a sample of community-dwelling adults who were testing the use of a new mobile health (mHealth) technology. Methods A sample of 92 community-dwelling adults living with HIV used a new mobile app for symptom self-management and completed the Health-ITUES to assess the usability of the app. They also completed the Post-Study System Usability Questionnaire (PSSUQ), a widely used and well-validated usability assessment tool. Correlations between these scales and each of the subscales were assessed. Results The subscales of the Health-ITUES showed high internal consistency reliability (Cronbach alpha=.85-.92). Each of the Health-ITUES subscales and the overall scale was moderately to strongly correlated with the PSSUQ scales (r=.46-.70), demonstrating the criterion validity of the Health-ITUES. Conclusions The Health-ITUES has demonstrated reliability and validity for use in assessing the usability of mHealth technologies in community-dwelling adults living with a chronic illness. PMID:29305343

  14. Health Information Technology Usability Evaluation Scale (Health-ITUES) for Usability Assessment of Mobile Health Technology: Validation Study.

    PubMed

    Schnall, Rebecca; Cho, Hwayoung; Liu, Jianfang

    2018-01-05

    Mobile technology has become a ubiquitous technology and can be particularly useful in the delivery of health interventions. This technology can allow us to deliver interventions to scale, cover broad geographic areas, and deliver technologies in highly tailored ways based on the preferences or characteristics of users. The broad use of mobile technologies supports the need for usability assessments of these tools. Although there have been a number of usability assessment instruments developed, none have been validated for use with mobile technologies. The goal of this work was to validate the Health Information Technology Usability Evaluation Scale (Health-ITUES), a customizable usability assessment instrument in a sample of community-dwelling adults who were testing the use of a new mobile health (mHealth) technology. A sample of 92 community-dwelling adults living with HIV used a new mobile app for symptom self-management and completed the Health-ITUES to assess the usability of the app. They also completed the Post-Study System Usability Questionnaire (PSSUQ), a widely used and well-validated usability assessment tool. Correlations between these scales and each of the subscales were assessed. The subscales of the Health-ITUES showed high internal consistency reliability (Cronbach alpha=.85-.92). Each of the Health-ITUES subscales and the overall scale was moderately to strongly correlated with the PSSUQ scales (r=.46-.70), demonstrating the criterion validity of the Health-ITUES. The Health-ITUES has demonstrated reliability and validity for use in assessing the usability of mHealth technologies in community-dwelling adults living with a chronic illness. ©Rebecca Schnall, Hwayoung Cho, Jianfang Liu. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 05.01.2018.
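    The internal consistency figures reported above are Cronbach alpha values. As a rough illustration only, the statistic can be computed from item-level responses as sketched below. The function name `cronbach_alpha` and the response data are invented for this example and are not taken from the Health-ITUES study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items (not data from the study).
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

    Population variance is used for both the item and total variances; using sample variance throughout would give the same alpha, because the correction factor cancels in the ratio.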

  15. Validity as a social imperative for assessment in health professions education: a concept analysis.

    PubMed

    Marceau, Mélanie; Gallagher, Frances; Young, Meredith; St-Onge, Christina

    2018-06-01

    Assessment can have far-reaching consequences for future health care professionals and for society. Thus, it is essential to establish the quality of assessment. Few modern approaches to validity are well situated to ensure the quality of complex assessment approaches, such as authentic and programmatic assessments. Here, we explore and delineate the concept of validity as a social imperative in the context of assessment in health professions education (HPE) as a potential framework for examining the quality of complex and programmatic assessment approaches. We conducted a concept analysis using Rodgers' evolutionary method to describe the concept of validity as a social imperative in the context of assessment in HPE. Supported by an academic librarian, we developed and executed a search strategy across several databases for literature published between 1995 and 2016. From a total of 321 citations, we identified 67 articles that met our inclusion criteria. Two team members analysed the texts using a specified approach to qualitative data analysis. Consensus was achieved through full team discussions. Attributes that characterise the concept were: (i) demonstration of the use of evidence considered credible by society to document the quality of assessment; (ii) validation embedded through the assessment process and score interpretation; (iii) documented validity evidence supporting the interpretation of the combination of assessment findings, and (iv) demonstration of a justified use of a variety of evidence (quantitative and qualitative) to document the quality of assessment strategies. The emerging concept of validity as a social imperative highlights some areas of focus in traditional validation frameworks, whereas some characteristics appear unique to HPE and move beyond traditional frameworks. The study reflects the importance of embedding consideration for society and societal concerns throughout the assessment and validation process, and may represent a

  16. Validated Smartphone-Based Apps for Ear and Hearing Assessments: A Review

    PubMed Central

    Pallawela, Danuk

    2016-01-01

    Background An estimated 360 million people have a disabling hearing impairment globally, the vast majority of whom live in low- and middle-income countries (LMICs). Early identification through screening is important to negate the negative effects of untreated hearing impairment. Substantial barriers exist in screening for hearing impairment in LMICs, such as the requirement for skilled hearing health care professionals and prohibitively expensive specialist equipment to measure hearing. These challenges may be overcome through utilization of increasingly available smartphone app technologies for ear and hearing assessments that are easy to use by unskilled professionals. Objective Our objective was to identify and compare available apps for ear and hearing assessments and consider the incorporation of such apps into hearing screening programs. Methods In July 2015, the commercial app stores Google Play and Apple App Store were searched to identify apps for ear and hearing assessments. Thereafter, six databases (EMBASE, MEDLINE, Global Health, Web of Science, CINAHL, and mHealth Evidence) were searched to assess which of the apps identified in the commercial review had been validated against gold standard measures. A comparison was made between validated apps. Results App store search queries returned 30 apps that could be used for ear and hearing assessments, the majority of which are for performing audiometry. The literature search identified 11 eligible validity studies that examined 6 different apps. uHear, an app for self-administered audiometry, was validated in the highest number of peer reviewed studies against gold standard pure tone audiometry (n=5). However, the accuracy of uHear varied across these studies. Conclusions Very few of the available apps have been validated in peer-reviewed studies. Of the apps that have been validated, further independent research is required to fully understand their accuracy at detecting ear and hearing conditions. PMID

  17. Validation of the process criteria for assessment of a hospital nursing service.

    PubMed

    Feldman, Liliane Bauer; Cunha, Isabel Cristina Kowal Olm; D'Innocenzo, Maria

    2013-01-01

    To validate an instrument containing process criteria for assessment of a hospital nursing service based on the National Accreditation Organization program. A descriptive, quantitative methodological study performed in stages. An instrument constructed with 69 process criteria was assessed by 49 nurses from accredited hospitals in 2009, according to a Likert scale, and validated by 16 judges through Delphi rounds in 2010. The original instrument assessed by nurses with 69 process criteria was judged by the degree of importance, and changed to 39 criteria. In the first Delphi round, the 39 criteria reached consensus among the 19 judges, with a medium reliability by Cronbach's alpha. In the second round, 40 converging criteria were validated by 16 judges, with high reliability. The criteria addressed management, costs, teaching, education, indicators, protocols, human resources, communication, among others. The 40 process criteria formed a validated instrument to assess the hospital nursing service which, when measured, can better direct interventions by nurses in reaching and strengthening outcomes.

  18. Development and Validation of the Foundational Healthcare Leadership Self-assessment.

    PubMed

    Van Hala, Sonja; Cochella, Susan; Jaggi, Rachel; Frost, Caren J; Kiraly, Bernadette; Pohl, Susan; Gren, Lisa

    2018-04-01

    We sought to develop and validate a self-assessment of foundational leadership skills for early-career physicians. We developed a leadership self-assessment from a compilation of materials on health care leadership skills. A sequential exploratory study was conducted using qualitative and quantitative analysis for face, content, and construct validity of the self-assessment. First, two focus groups were conducted with leaders in medicine and family medicine residents, to refine the pilot self-assessment. The self-assessment pilot was then tested with family medicine residents across the country, and the results were quantitatively evaluated with principal component analysis. This data was used to reduce and group the statements into leadership domains for the final self-assessment. Twenty-two invited family medicine residency programs agreed to distribute the survey. A total of 163 family medicine residents completed the survey, representing 16 to 20 residency programs from 12 states (response rate 28.9% to 34.8%). Analysis showed important differences by residency year, with more advanced residents scoring higher. The analysis reduced the number of items from 33 on the pilot assessment to 21 on the final assessment, which the authors titled the Foundational Healthcare Leadership Self-assessment (FHLS). The 21 items were grouped into five leadership domains: accountability, collaboration, communication, team management, and self-management. The FHLS is a validated 21-item self-assessment of foundational leadership skills for early career physicians. It takes less than 5 minutes to complete, and quantifies skill within five domains of foundational leadership. The FHLS is a first step in developing educational and evaluative assessments for training medical residents as clinician leaders.

  19. Modelling the pre-assessment learning effects of assessment: evidence in the validity chain

    PubMed Central

    Cilliers, Francois J; Schuwirth, Lambert W T; van der Vleuten, Cees P M

    2012-01-01

    OBJECTIVES We previously developed a model of the pre-assessment learning effects of consequential assessment and started to validate it. The model comprises assessment factors, mechanism factors and learning effects. The purpose of this study was to continue the validation process. For stringency, we focused on a subset of assessment factor–learning effect associations that featured least commonly in a baseline qualitative study. Our aims were to determine whether these uncommon associations were operational in a broader but similar population to that in which the model was initially derived. METHODS A cross-sectional survey of 361 senior medical students at one medical school was undertaken using a purpose-made questionnaire based on a grounded theory and comprising pairs of written situational tests. In each pair, the manifestation of an assessment factor was varied. The frequencies at which learning effects were selected were compared for each item pair, using an adjusted alpha to assign significance. The frequencies at which mechanism factors were selected were calculated. RESULTS There were significant differences in the learning effect selected between the two scenarios of an item pair for 13 of this subset of 21 uncommon associations, even when a p-value of < 0.00625 was considered to indicate significance. Three mechanism factors were operational in most scenarios: agency; response efficacy, and response value. CONCLUSIONS For a subset of uncommon associations in the model, the role of most assessment factor–learning effect associations and the mechanism factors involved were supported in a broader but similar population to that in which the model was derived. Although model validation is an ongoing process, these results move the model one step closer to the stage of usefully informing interventions. Results illustrate how factors not typically included in studies of the learning effects of assessment could confound the results of interventions aimed

  20. Design and validation of a portable, inexpensive and multi-beam timing light system using the Nintendo Wii hand controllers.

    PubMed

    Clark, Ross A; Paterson, Kade; Ritchie, Callan; Blundell, Simon; Bryant, Adam L

    2011-03-01

    Commercial timing light systems (CTLS) provide precise measurement of athletes' running velocity; however, they are often expensive and difficult to transport. In this study an inexpensive, wireless and portable timing light system was created using the infrared camera in Nintendo Wii hand controllers (NWHC). System creation with gold-standard validation. A Windows-based software program using NWHC to replicate a dual-beam timing gate was created. Firstly, data collected during 2m walking and running trials were validated against a 3D kinematic system. Secondly, data recorded during 5m running trials at various intensities from standing or flying starts were compared to a single beam CTLS and the independent and average scores of three handheld stopwatch (HS) operators. Intraclass correlation coefficient and Bland-Altman plots were used to assess validity. Absolute error quartiles and percentage of trials in absolute error threshold ranges were used to determine accuracy. The NWHC system was valid when compared against the 3D kinematic system (ICC=0.99, median absolute error (MAR)=2.95%). For the flying 5m trials the NWHC system possessed excellent validity and precision (ICC=0.97, MAR<3%) when compared with the CTLS. In contrast, the NWHC system and the HS values during standing start trials possessed only modest validity (ICC<0.75) and accuracy (MAR>8%). A NWHC timing light system is inexpensive, portable and valid for assessing running velocity. Errors in the 5m standing start trials may have been due to erroneous event detection by either the commercial or NWHC-based timing light systems. Copyright © 2010 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
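    The abstract above mentions Bland-Altman analysis for comparing two timing methods. A minimal sketch of the underlying calculation (mean difference and 95% limits of agreement) might look like the following. The function name `bland_altman` and the timing values are hypothetical and are not data from the study; the 1.96 multiplier assumes the differences are approximately normally distributed.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    # Per-trial difference between the two methods.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical sprint times in seconds from two timing systems.
system_a = [0.92, 1.05, 0.88, 1.10, 0.97]
system_b = [0.90, 1.07, 0.89, 1.08, 0.99]
bias, lower, upper = bland_altman(system_a, system_b)
print(f"bias = {bias:+.3f} s, limits of agreement = [{lower:+.3f}, {upper:+.3f}] s")
```

    A bias near zero with narrow limits of agreement suggests the two methods can be used interchangeably for the intended purpose; how narrow is narrow enough is a judgment about the application, not something the calculation decides.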

  1. Design and validation of a questionnaire to assess organizational culture in French hospital wards.

    PubMed

    Saillour-Glénisson, F; Domecq, S; Kret, M; Sibe, M; Dumond, J P; Michel, P

    2016-09-17

    , specific to healthcare context, for a unit level assessment showing robust psychometric properties (validity and reliability). This tool is suited for research purposes, especially for assessing organizational context in research analysing the effectiveness of hospital quality improvement strategies. Our tool is also suited for an overall assessment of ward culture and could be a powerful trigger to improve management and clinical performance. Its psychometric properties in other health systems need to be tested.

  2. Valid and Reliable Science Content Assessments for Science Teachers

    ERIC Educational Resources Information Center

    Tretter, Thomas R.; Brown, Sherri L.; Bush, William S.; Saderholm, Jon C.; Holmes, Vicki-Lynn

    2013-01-01

    Science teachers' content knowledge is an important influence on student learning, highlighting an ongoing need for programs, and assessments of those programs, designed to support teacher learning of science. Valid and reliable assessments of teacher science knowledge are needed for direct measurement of this crucial variable. This paper…

  3. Development of Internal System of Education Quality Assessment at a University

    ERIC Educational Resources Information Center

    Kalimullin, Aydar M.; Khodyreva, Elena ?.; Koinova-Zoellner, Julia

    2016-01-01

    The urgency of the research is determined by the need to ensure the quality of higher education, an essential factor of which is the development of the internal assessment system for educational activities at universities. The aim of the article is validation of the model of development of the internal assessment system for educational activities at…

  4. Validation of the use of synthetic imagery for camouflage effectiveness assessment

    NASA Astrophysics Data System (ADS)

    Newman, Sarah; Gilmore, Marilyn A.; Moorhead, Ian R.; Filbee, David R.

    2002-08-01

    CAMEO-SIM was developed as a laboratory method to assess the effectiveness of aircraft camouflage schemes. It is a physically accurate synthetic image generator, rendering in any waveband between 0.4 and 14 microns. Camouflage schemes are assessed by displaying imagery to observers under controlled laboratory conditions or by analyzing the digital image and calculating the contrast statistics between the target and background. Code verification has taken place during development. However, validation of CAMEO-SIM is essential to ensure that the imagery produced is suitable to be used for camouflage effectiveness assessment. Real world characteristics are inherently variable, so exact pixel to pixel correlation is unnecessary. For camouflage effectiveness assessment it is more important to be confident that the comparative effects of different schemes are correct, but prediction of detection ranges is also desirable. Several different tests have been undertaken to validate CAMEO-SIM for the purpose of assessing camouflage effectiveness. Simple scenes have been modeled and measured. Thermal and visual properties of the synthetic and real scenes have been compared. This paper describes the validation tests and discusses the suitability of CAMEO-SIM for camouflage assessment.

  5. 49 CFR Appendix F to Part 236 - Minimum Requirements of FRA Directed Independent Third-Party Assessment of PTC System Safety...

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... Third-Party Assessment of PTC System Safety Verification and Validation F Appendix F to Part 236... Safety Verification and Validation (a) This appendix provides minimum requirements for mandatory independent third-party assessment of PTC system safety verification and validation pursuant to subpart H or I...

  6. 49 CFR Appendix F to Part 236 - Minimum Requirements of FRA Directed Independent Third-Party Assessment of PTC System Safety...

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... Third-Party Assessment of PTC System Safety Verification and Validation F Appendix F to Part 236... Safety Verification and Validation (a) This appendix provides minimum requirements for mandatory independent third-party assessment of PTC system safety verification and validation pursuant to subpart H or I...

  7. 49 CFR Appendix F to Part 236 - Minimum Requirements of FRA Directed Independent Third-Party Assessment of PTC System Safety...

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... Third-Party Assessment of PTC System Safety Verification and Validation F Appendix F to Part 236... Safety Verification and Validation (a) This appendix provides minimum requirements for mandatory independent third-party assessment of PTC system safety verification and validation pursuant to subpart H or I...

  8. 49 CFR Appendix F to Part 236 - Minimum Requirements of FRA Directed Independent Third-Party Assessment of PTC System Safety...

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... Third-Party Assessment of PTC System Safety Verification and Validation F Appendix F to Part 236... Safety Verification and Validation (a) This appendix provides minimum requirements for mandatory independent third-party assessment of PTC system safety verification and validation pursuant to subpart H or I...

  9. Assessment of postoperative outcomes of hypospadias repair with validated questionnaires.

    PubMed

    Liu, Mona M Y; Holland, Andrew J A; Cass, Danny T

    2015-12-01

    A standardized assessment for the optimal repair of hypospadias remains elusive. This study utilized validated questionnaires to assess the postoperative functional, cosmetic, and psychosocial outcomes of hypospadias repair. 172 patients who underwent hypospadias repair under the care of a single surgeon were identified. 25 agreed for follow-up using the validated questionnaires of Hypospadias Objective Scoring Evaluation (HOSE), Pediatric Penile Perception Scale (PPPS), and Pediatric Quality of Life Inventory (PedsQL™4.0). Mean follow-up was 59 months postoperatively (range 7-113 months). Techniques used included tubularized incised plate urethroplasty, meatal advancement and glanuloplasty, and a 2-stage repair. 23 of 25 patients achieved a HOSE score of 14 or more (maximum of 16). The PPPS scores correlated with severity of the hypospadias. Those with glanular hypospadias (mean score=10) scored higher than those with coronal (mean score=9) and penile/penoscrotal hypospadias (mean score=7). There was no correlation between PedsQL™4.0 scores and the severity of hypospadias or procedure used. Validated questionnaires revealed generally good functional, cosmetic, and early psychosocial outcomes after hypospadias repair. The use of validated questionnaires in routine follow-up sessions may facilitate objective assessment of both functional outcomes and patient satisfaction. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Validity of a novel computerized screening test system for mild cognitive impairment.

    PubMed

    Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk

    2018-06-20

    Background: The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to detect elderly with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K and test-retest reliability indicated high correlation. As a result of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.
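    The screening measures named above (sensitivity, specificity, positive and negative predictive value) all follow from a 2x2 table of test result against reference diagnosis at a chosen cut-off. The sketch below shows the standard definitions; the function name `screening_metrics` and the counts are invented for illustration and do not reproduce the study's results.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening measures from 2x2 confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # impaired cases correctly flagged
        "specificity": tn / (tn + fp),  # healthy cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged cases that are truly impaired
        "npv": tn / (tn + fn),          # cleared cases that are truly healthy
    }


# Hypothetical counts at one cut-off score (not data from the study).
metrics = screening_metrics(tp=40, fp=20, fn=10, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Unlike sensitivity and specificity, PPV and NPV depend on how common the condition is in the sample, so values from one study population may not carry over to a setting with a different prevalence.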

  11. Validation and adaptation of the hospital consumer assessment of healthcare providers and systems in Arabic context: Evidence from Saudi Arabia.

    PubMed

    Alanazi, Mohammed R; Alamry, Ahmed; Al-Surimi, Khaled

    One of the main purposes of healthcare organizations is to serve patients by providing safe and high-quality patient-centered care. Patients are considered the most appropriate source to assess the quality level of healthcare services. The objectives of this paper were to describe the translation and adaptation process of the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey for Arabic speaking populations, examine the degree of equivalence between the original English version and the Arabic translated version, and estimate and report the validity and reliability of the translated Arabic HCAHPS version. The translation process had four main steps: (1) qualified bilingual translators translated the HCAHPS from English to Arabic; (2) the Arabic version was translated back to English and reviewed by experts to ensure content accuracy (content equivalence); (3) both Arabic and English versions were verified for accuracy and validity of the translation, checking for the similarities and differences (semantic equivalence); (4) finally, two independent bilinguals reviewed and made the final revision of both the Arabic and English versions separately and agreed on one final version that is similar and equivalent to the original English version in terms of content and meaning. The study findings showed that the overall Cronbach's α for the Arabic HCAHPS version was 0.90, showing good internal consistency across the 9 separate domains, which ranged from 0.70 to 0.97 Cronbach's α. The correlation coefficient between each statement for each separate domain revealed a highly positive significant correlation ranging from 0.72 to 0.89. The results of the study show empirical evidence of validity and reliability of HCAHPS in its Arabic version. Moreover, the Arabic version of HCAHPS in our study presented good internal consistency and it is highly recommended to be replicated and applied in the context of other Arab countries. Copyright © 2017

  12. Translating and validating a Training Needs Assessment tool into Greek

    PubMed Central

    Markaki, Adelais; Antonakis, Nikos; Hicks, Carolyn M; Lionis, Christos

    2007-01-01

    Background The translation and cultural adaptation of widely accepted, psychometrically tested tools is regarded as an essential component of effective human resource management in the primary care arena. The Training Needs Assessment (TNA) is a widely used, valid instrument, designed to measure professional development needs of health care professionals, especially in primary health care. This study aims to describe the translation, adaptation and validation of the TNA questionnaire into the Greek language and discuss possibilities of its use in primary care settings. Methods A modified version of the English self-administered questionnaire consisting of 30 items was used. Internationally recommended methodology, mandating forward translation, backward translation, reconciliation and pretesting steps, was followed. Tool validation included assessing item internal consistency, using the alpha coefficient of Cronbach. Reproducibility (test – retest reliability) was measured by the kappa correlation coefficient. Criterion validity was calculated for selected parts of the questionnaire by correlating respondents' research experience with relevant research item scores. An exploratory factor analysis highlighted how the items group together, using a Varimax (oblique) rotation and subsequent Cronbach's alpha assessment. Results The psychometric properties of the Greek version of the TNA questionnaire for nursing staff employed in primary care were good. Internal consistency of the instrument was very good, Cronbach's alpha was found to be 0.985 (p < 0.001) and Kappa coefficient for reproducibility was found to be 0.928 (p < 0.0001). Significant positive correlations were found between respondents' current performance levels on each of the research items and amount of research involvement, indicating good criterion validity in the areas tested. Factor analysis revealed seven factors with eigenvalues of > 1.0, KMO (Kaiser-Meyer-Olkin) measure of sampling adequacy = 0.680 and

  13. The Multiple-Use of Accountability Assessments: Implications for the Process of Validation

    ERIC Educational Resources Information Center

    Koch, Martha J.

    2014-01-01

    Implications of the multiple-use of accountability assessments for the process of validation are examined. Multiple-use refers to the simultaneous use of results from a single administration of an assessment for its intended use and for one or more additional uses. A theoretical discussion of the issues for validation which emerge from…

  14. Assessing the Validity and Reliability of the Peristomal Skin Lesion Assessment Instrument Adapted for Use in Turkey.

    PubMed

    Ay, Ali; Bulut, Hulya

    2015-08-01

    Many ostomy patients experience peristomal skin lesions. A descriptive study was conducted to assess the validity, usability, and reliability of the Peristomal Skin Lesions Assessment instrument (SACS instrument) adapted to Turkish from English. The SACS Instrument consists of 2 main assessments: lesion type (utilizing definitions and photographs) and lesion area by location around the ostomy. The study was performed in 2 stages: 1) the SACS language was changed and its content validity established; and 2) the instrument's content validity and inter-observer agreement (consistency) were determined among pairs of nurses who used the tool to assess peristomal skin lesions. Patients (included if they were >18 years old and receiving treatment/observation at 1 of the 4 participating stomatherapy units) and 8 stomatherapy nurses also completed appropriate sociodemographic questionnaires. Of the 393 patients screened during the 7-month study, 100 (average age 56.74 ± 14.03 years, 55 men) participated; most (79) had a planned operation. A little more than half (59) of the patients had colorectal cancer and 28 had their stoma site marked preoperatively by a stomatherapy nurse. The most common peristomal skin lesion risk factors were having an ileostomy and unplanned surgery. The content validity index of the entire Turkish SACS instrument was 1, and the inter-observer agreement Kappa statistic was very good (K = 0.90, 95% CI 0.80-0.99). Individual SACS item K values ranged from K = 0.84 (95% CI 0.63–1) to K = 1 (95% CI 1). Most (62.5%) nurses found the terms and pictures used in the SACS classification adequate and suitable, and 50% believed the Turkish version of the SACS instrument was a valid and suitable assessment tool for use by Turkish stomatherapy nurses. Validity and reliability studies involving larger and more diverse patient and nurse samples are warranted.
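    Inter-observer agreement of the kind reported above is commonly summarized with a kappa statistic. The abstract does not say which variant was used; the sketch below assumes Cohen's kappa for two raters, and the function name `cohen_kappa` and the category labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must label the same cases")
    n = len(rater_a)
    # Proportion of cases where the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Undefined (division by zero) if both raters use one identical label throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical lesion categories assigned by two nurses to 8 patients.
nurse_1 = ["A", "A", "B", "B", "C", "C", "A", "B"]
nurse_2 = ["A", "A", "B", "C", "C", "C", "A", "B"]
kappa = cohen_kappa(nurse_1, nurse_2)
print(f"kappa = {kappa:.3f}")
```

    Kappa corrects raw percentage agreement for the agreement two raters would reach by chance alone, which is why it is usually lower than the simple proportion of matching ratings.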

  15. Establishment of a VISAR Measurement System for Material Model Validation in DSTO

    DTIC Science & Technology

    2013-02-01

    advancements published in the works by L.M. Baker, E.R. Hollenbach and W.F. Hemsing [1-3] and results in the user-friendly interface and configuration of the...VISAR system [4] used in the current work. VISAR tests are among the mandatory instrumentation techniques when validating material models and...The present work reports on preliminary tests using the recently commissioned DSTO VISAR system, providing an assessment of the experimental set-up

  16. Development and validation of the Vietnamese primary care assessment tool

    PubMed Central

    2018-01-01

    Objective: To adapt the consumer version of the Primary Care Assessment Tool (PCAT) for Vietnam and determine its internal consistency and validity. Design: A quantitative cross sectional study. Setting: 56 communes in 3 representative provinces of central Vietnam. Participants: Total of 3289 people who used health care services at health facility at least once over the past two years. Results: The Vietnamese adult expanded consumer version of the PCAT (VN PCAT-AE) is an instrument for evaluation of primary care in Vietnam with 70 items comprising six scales representing four core primary care domains, and three additional scales representing three derivative domains. Sixteen other items from the original tool were not included in the final instrument, due to problems with missing values, floor or ceiling effects, and item-total correlations. All the retained scales have a Cronbach's alpha above 0.70 except for the subscale of Family Centeredness. Conclusions: The VN PCAT-AE demonstrates adequate internal consistency and validity to be used as an effective tool for measuring the quality of primary care in Vietnam from the consumer perspective. Additional work in the future to optimize valid measurement in all domains consistent with the original version of the tool may be helpful as the primary care system in Vietnam further develops. PMID:29324851

  17. Development and validation of the Vietnamese primary care assessment tool.

    PubMed

    Hoa, Nguyen Thi; Tam, Nguyen Minh; Peersman, Wim; Derese, Anselme; Markuns, Jeffrey F

    2018-01-01

    To adapt the consumer version of the Primary Care Assessment Tool (PCAT) for Vietnam and determine its internal consistency and validity. A quantitative cross sectional study. 56 communes in 3 representative provinces of central Vietnam. Total of 3289 people who used health care services at health facility at least once over the past two years. The Vietnamese adult expanded consumer version of the PCAT (VN PCAT-AE) is an instrument for evaluation of primary care in Vietnam with 70 items comprising six scales representing four core primary care domains, and three additional scales representing three derivative domains. Sixteen other items from the original tool were not included in the final instrument, due to problems with missing values, floor or ceiling effects, and item-total correlations. All the retained scales have a Cronbach's alpha above 0.70 except for the subscale of Family Centeredness. The VN PCAT-AE demonstrates adequate internal consistency and validity to be used as an effective tool for measuring the quality of primary care in Vietnam from the consumer perspective. Additional work in the future to optimize valid measurement in all domains consistent with the original version of the tool may be helpful as the primary care system in Vietnam further develops.
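    The Cronbach's alpha threshold of 0.70 cited in the two records above can be illustrated with a minimal sketch. The function name and the toy score matrix are assumptions made for illustration only; they are not data or code from the VN PCAT-AE study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    items = list(zip(*scores))                      # one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical 5 respondents x 3 items on a 1-5 scale (illustrative only).
toy_scores = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1], [3, 3, 3]]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 3), "acceptable" if alpha > 0.70 else "below 0.70")
```

    Sample variance is used for both the items and the totals; using population variance throughout would give the same ratio.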

  18. The Adult Attachment Projective Picture System: integrating attachment into clinical assessment.

    PubMed

    George, Carol; West, Malcolm

    2011-01-01

    This article summarizes the development and validation of the Adult Attachment Projective Picture System (AAP), a measure we developed from the Bowlby-Ainsworth developmental tradition to assess adult attachment status. The AAP has demonstrated excellent concurrent validity with the Adult Attachment Interview (George, Kaplan, & Main, 1984/1985/1996; Main & Goldwyn, 1985-1994; Main, Goldwyn, & Hesse, 2003), interjudge reliability, and test-retest reliability, with no effects of verbal intelligence or social desirability. The AAP coding and classification system and application in clinical and community samples are summarized. Finally, we introduce the 3 other articles that are part of this Special Section and discuss the use of the AAP in therapeutic assessment and treatment.

  19. ARM Radiosondes for National Polar-Orbiting Operational Environmental Satellite System Preparatory Project Validation Field Campaign Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Borg, Lori; Tobin, David; Reale, Anthony

    This IOP has been a coordinated effort involving the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) Climate Research Facility, the University of Wisconsin (UW)-Madison, and the JPSS project to validate SNPP NOAA Unique Combined Atmospheric Processing System (NUCAPS) temperature and moisture sounding products from the Cross-track Infrared Sounder (CrIS) and the Advanced Technology Microwave Sounder (ATMS). In this arrangement, funding for radiosondes was provided by the JPSS project to ARM. These radiosondes were launched coincident with the SNPP satellite overpasses (OP) at four of the ARM field sites beginning in July 2012 and running through September 2017. Combined with other ARM data, an assessment of the radiosonde data quality was performed and post-processing corrections applied producing an ARM site Best Estimate (BE) product. The SNPP targeted radiosondes were integrated into the NOAA Products Validation System (NPROVS+) system, which collocated the radiosondes with satellite products (NOAA, National Aeronautics and Space Administration [NASA], European Organisation for the Exploitation of Meteorological Satellites [EUMETSAT], Geostationary Operational Environmental Satellite [GOES], Constellation Observing System for Meteorology, Ionosphere, and Climate [COSMIC]) and Numerical Weather Prediction (NWP) forecasts for use in product assessment and algorithm development. This work was a fundamental, integral, and cost-effective part of the SNPP validation effort and provided critical accuracy assessments of the SNPP temperature and water vapor soundings.

  20. Evaluating the Validity of Portfolio Assessments for Licensure Decisions

    ERIC Educational Resources Information Center

    Wilson, Mark; Hallam, P. J.; Pecheone, Raymond; Moss, Pamela A.

    2014-01-01

    This study examines one part of a validity argument for portfolio assessments of teaching practice used as an indicator of teaching quality to inform a licensure decision. We investigate the relationship among portfolio assessment scores, a test of teacher knowledge (ETS's Praxis I and II), and changes in student achievement (on…

  1. Examining the Construct Validity of the Elemental Psychopathy Assessment

    ERIC Educational Resources Information Center

    Miller, Joshua D.; Gaughan, Eric T.; Maples, Jessica; Gentile, Brittany; Lynam, Donald R.; Widiger, Thomas A.

    2011-01-01

    Lynam and colleagues recently developed a new self-report inventory for the assessment of psychopathy, the Elemental Psychopathy Assessment (EPA). Using a sample of undergraduates (N = 227), the authors examined the construct validity of the EPA by examining its correlations with self and stranger ratings on the Five-Factor Model, as well as…

  2. Assessing cross-cultural validity of scales: a methodological review and illustrative example.

    PubMed

    Beckstead, Jason W; Yang, Chiu-Yueh; Lengacher, Cecile A

    2008-01-01

    In this article, we assessed the cross-cultural validity of the Women's Role Strain Inventory (WRSI), a multi-item instrument that assesses the degree of strain experienced by women who juggle the roles of working professional, student, wife and mother. Cross-cultural validity is evinced by demonstrating the measurement invariance of the WRSI. Measurement invariance is the extent to which items of multi-item scales function in the same way across different samples of respondents. We assessed measurement invariance by comparing a sample of working women in Taiwan with a similar sample from the United States. Structural equation models (SEMs) were employed to determine the invariance of the WRSI and to estimate the unique validity variance of its items. This article also provides nurse-researchers with the necessary underlying measurement theory and illustrates how SEMs may be applied to assess cross-cultural validity of instruments used in nursing research. Overall performance of the WRSI was acceptable but our analysis showed that some items did not display invariance properties across samples. Item analysis is presented and recommendations for improving the instrument are discussed.

  3. Infant polysomnography: reliability and validity of infant arousal assessment.

    PubMed

    Crowell, David H; Kulp, Thomas D; Kapuniai, Linda E; Hunt, Carl E; Brooks, Lee J; Weese-Mayer, Debra E; Silvestri, Jean; Ward, Sally Davidson; Corwin, Michael; Tinsley, Larry; Peucker, Mark

    2002-10-01

    Infant arousal scoring based on the Atlas Task Force definition of transient EEG arousal was evaluated to determine (1) whether transient arousals can be identified and assessed reliably in infants and (2) whether arousal and no-arousal epochs scored previously by trained raters can be validated reliably by independent sleep experts. Phase I for inter- and intrarater reliability scoring was based on two datasets of sleep epochs selected randomly from nocturnal polysomnograms of healthy full-term, preterm, idiopathic apparent life-threatening event cases, and siblings of Sudden Infant Death Syndrome infants of 35 to 64 weeks postconceptional age. After training, test set 1 reliability was assessed and discrepancies identified. After retraining, test set 2 was scored by the same raters to determine interrater reliability. Later, three raters from the trained group rescored test set 2 to assess inter- and intrarater reliabilities. Interrater and intrarater reliability kappas, with 95% confidence intervals, ranged from substantial to almost perfect levels of agreement. Interrater reliabilities for spontaneous arousals were initially moderate and then substantial. During the validation phase, 315 previously scored epochs were presented to four sleep experts to rate as containing arousal or no-arousal events. Interrater expert agreements were diverse and considered as noninterpretable. Concordance in sleep experts' agreements, based on identification of the previously sampled arousal and no-arousal epochs, was used as a secondary evaluative technique. Results showed agreement by two or more experts on 86% of the Collaborative Home Infant Monitoring Evaluation Study arousal scored events. Conversely, only 1% of the Collaborative Home Infant Monitoring Evaluation Study-scored no-arousal epochs were rated as an arousal. In summary, this study presents an empirically tested model with procedures and criteria for attaining improved reliability in transient EEG arousal

  4. Microbiological Validation of the IVGEN System

    NASA Technical Reports Server (NTRS)

    Porter, David A.

    2013-01-01

    The principal purpose of this report is to describe a validation process that can be performed in part on the ground prior to launch, and in space for the IVGEN system. The general approach taken is derived from standard pharmaceutical industry validation schemes modified to fit the special requirements of in-space usage.

  5. Evaluation of the Validity and Reliability of the Waterlow Pressure Ulcer Risk Assessment Scale

    PubMed Central

    Charalambous, Charalambos; Koulori, Agoritsa; Vasilopoulos, Aristidis; Roupa, Zoe

    2018-01-01

    Introduction: Prevention is the ideal strategy to tackle the problem of pressure ulcers. Pressure ulcer risk assessment scales are one of the most pivotal measures applied to tackle the problem, but much criticism has been raised regarding the validity and reliability of these scales. Objective: To investigate the validity and reliability of the Waterlow pressure ulcer risk assessment scale. Method: The methodology used is a narrative literature review; the bibliography was reviewed through CINAHL, PubMed, EBSCO, Medline and Google Scholar, and 26 scientific articles were identified. The articles were chosen due to their direct correlation with the objective under study and their scientific relevance. Results: The construct and face validity of the Waterlow appear adequate, but with regard to content validity, changes in the categories of age and gender could be beneficial. The concurrent validity cannot be assessed. The predictive validity of the Waterlow is characterized by high specificity and low sensitivity. The inter-rater reliability has been demonstrated to be inadequate; this may be due to a lack of clear definitions within the categories and differing levels of knowledge between the users. Conclusion: Due to the limitations presented regarding the validity and reliability of the Waterlow pressure ulcer risk assessment scale, the scale should be used in conjunction with clinical assessment to provide optimum results. PMID:29736104

  6. Evaluation of the Validity and Reliability of the Waterlow Pressure Ulcer Risk Assessment Scale.

    PubMed

    Charalambous, Charalambos; Koulori, Agoritsa; Vasilopoulos, Aristidis; Roupa, Zoe

    2018-04-01

    Prevention is the ideal strategy to tackle the problem of pressure ulcers. Pressure ulcer risk assessment scales are one of the most pivotal measures applied to tackle the problem, but much criticism has been raised regarding the validity and reliability of these scales. To investigate the validity and reliability of the Waterlow pressure ulcer risk assessment scale. The methodology used is a narrative literature review; the bibliography was reviewed through CINAHL, PubMed, EBSCO, Medline and Google Scholar, and 26 scientific articles were identified. The articles were chosen due to their direct correlation with the objective under study and their scientific relevance. The construct and face validity of the Waterlow appear adequate, but with regard to content validity, changes in the categories of age and gender could be beneficial. The concurrent validity cannot be assessed. The predictive validity of the Waterlow is characterized by high specificity and low sensitivity. The inter-rater reliability has been demonstrated to be inadequate; this may be due to a lack of clear definitions within the categories and differing levels of knowledge between the users. Due to the limitations presented regarding the validity and reliability of the Waterlow pressure ulcer risk assessment scale, the scale should be used in conjunction with clinical assessment to provide optimum results.

  7. The Social Validity Assessment of Social Competence Intervention Behavior Goals

    ERIC Educational Resources Information Center

    Hurley, Jennifer J.; Wehby, Joseph H.; Feurer, Irene D.

    2010-01-01

    Social validation is the value judgment from society on the importance of a study. The social validity of behavior goals used in the social competence intervention literature was assessed using the Q-sort technique. The stimulus items were 80 different social competence behavior goals taken from 78 classroom-based social competence intervention…

  8. Development and Validation of the Narrative Quality Assessment Tool.

    PubMed

    Kim, Wonsun Sunny; Shin, Cha-Nam; Kathryn Larkey, Linda; Roe, Denise J

    2017-04-01

    The use of storytelling in health promotion has grown over the past 2 decades, showing promise for moving people to initiate healthy behavior change. Given the increasingly prevalent role of storytelling in health promotion research and the need to more clearly identify what storytelling elements and mediators may better predict behavior change, there is a need to develop measures to specifically assess these factors in a cultural community context. The purpose of this study is to develop and preliminarily validate a narrative quality assessment tool for measuring elements of storytelling that are predicted to affect attitude and behavior change (i.e., narrative characteristics, identification, and transportation) within a cultural community setting using a culture-centric model. Reliability and validity of these scales were assessed with repeated administrations among 74 Latino men and women with a mean age of 39.6 years (SD = 11.47 years). The confirmatory factor analysis in addition to internal consistency tests revealed preliminary evidence for reliability and validity of the narrative characteristics, identification, and transportation scales. Cronbach's alpha ranged from .92 to .94. Items revealed adequate factor loadings (.85-.98) and good model fit. The new scales provide the first step in moving the assessment of narrative quality into a culturally relevant context for evaluation of story use in health promotion. The results present valuable information for nurse researchers to guide the development and testing of culturally grounded storytelling interventions' potential to predict attitude and behavior change for patients.

  9. ALHAT System Validation

    NASA Technical Reports Server (NTRS)

    Brady, Tye; Bailey, Erik; Crain, Timothy; Paschall, Stephen

    2011-01-01

    NASA has embarked on a multiyear technology development effort to develop a safe and precise lunar landing capability. The Autonomous Landing and Hazard Avoidance Technology (ALHAT) Project is investigating a range of landing hazard detection methods while developing a hazard avoidance capability to best field test the proper set of relevant autonomous GNC technologies. Ultimately, the advancement of these technologies through the ALHAT Project will provide an ALHAT System capable of enabling next generation lunar lander vehicles to globally land precisely and safely regardless of lighting condition. This paper provides an overview of the ALHAT System and describes recent validation experiments that have advanced the highly capable GNC architecture.

  10. Comprehensive Assessment of Emotional Disturbance: A Cross-Validation Approach

    ERIC Educational Resources Information Center

    Fisher, Emily S.; Doyon, Katie E.; Saldana, Enrique; Allen, Megan Redding

    2007-01-01

    Assessing a student for emotional disturbance is a serious and complex task given the stigma of the label and the ambiguities of the federal definition. One way that school psychologists can be more confident in their assessment results is to cross validate data from different sources using the RIOT approach (Review, Interview, Observe, Test).…

  11. Development and validation of a physics problem-solving assessment rubric

    NASA Astrophysics Data System (ADS)

    Docktor, Jennifer Lynn

    Problem solving is a complex process that is important for everyday life and crucial for learning physics. Although there is a great deal of effort to improve student problem solving throughout the educational system, there is no standard way to evaluate written problem solving that is valid, reliable, and easy to use. Most tests of problem solving performance given in the classroom focus on the correctness of the end result or partial results rather than the quality of the procedures and reasoning leading to the result, which gives an inadequate description of a student's skills. A more detailed and meaningful measure is necessary if different curricular materials or pedagogies are to be compared. This measurement tool could also allow instructors to diagnose student difficulties and focus their coaching. It is important that the instrument be applicable to any problem solving format used by a student and to a range of problem types and topics typically used by instructors. Typically complex processes such as problem solving are assessed by using a rubric, which divides a skill into multiple quasi-independent categories and defines criteria to attain a score in each. This dissertation describes the development of a problem solving rubric for the purpose of assessing written solutions to physics problems and presents evidence for the validity, reliability, and utility of score interpretations on the instrument.

  12. Validity of instruments to assess students' travel and pedestrian safety.

    PubMed

    Mendoza, Jason A; Watson, Kathy; Baranowski, Tom; Nicklas, Theresa A; Uscanga, Doris K; Hanfling, Marcus J

    2010-05-18

    Safe Routes to School (SRTS) programs are designed to make walking and bicycling to school safe and accessible for children. Despite their growing popularity, few validated measures exist for assessing important outcomes such as type of student transport or pedestrian safety behaviors. This research validated the SRTS school travel survey and a pedestrian safety behavior checklist. Fourth grade students completed a brief written survey on how they got to school that day with set responses. Test-retest reliability was obtained 3-4 hours apart. Convergent validity of the SRTS travel survey was assessed by comparison to parents' report. For the measure of pedestrian safety behavior, 10 research assistants observed 29 students at a school intersection for completion of 8 selected pedestrian safety behaviors. Reliability was determined in two ways: correlations between the research assistants' ratings to that of the Principal Investigator (PI) and intraclass correlations (ICC) across research assistant ratings. The SRTS travel survey had high test-retest reliability (kappa = 0.97, n = 96, p < 0.001) and convergent validity (kappa = 0.87, n = 81, p < 0.001). The pedestrian safety behavior checklist had moderate reliability across research assistants' ratings (ICC = 0.48) and moderate correlation with the PI (r = 0.55, p = < 0.01). When two raters simultaneously used the instrument, the ICC increased to 0.65. Overall percent agreement (91%), sensitivity (85%) and specificity (83%) were acceptable. These validated instruments can be used to assess SRTS programs. The pedestrian safety behavior checklist may benefit from further formative work.
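    The kappa statistics reported in the record above measure agreement between two sets of categorical ratings after correcting for chance. A minimal sketch of Cohen's kappa follows; the function name and the travel-mode labels are hypothetical and are not taken from the SRTS study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    n = len(ratings_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical example: student report vs. parent report of travel mode.
student = ["walk", "walk", "car", "bus", "car", "walk", "bus", "car"]
parent = ["walk", "walk", "car", "bus", "car", "car", "bus", "car"]
kappa = cohen_kappa(student, parent)
print(round(kappa, 3))
```

    The same calculation applies to test-retest data by treating the two administrations as the two raters.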

  13. Validity of instruments to assess students' travel and pedestrian safety

    PubMed Central

    2010-01-01

    Background: Safe Routes to School (SRTS) programs are designed to make walking and bicycling to school safe and accessible for children. Despite their growing popularity, few validated measures exist for assessing important outcomes such as type of student transport or pedestrian safety behaviors. This research validated the SRTS school travel survey and a pedestrian safety behavior checklist. Methods: Fourth grade students completed a brief written survey on how they got to school that day with set responses. Test-retest reliability was obtained 3-4 hours apart. Convergent validity of the SRTS travel survey was assessed by comparison to parents' report. For the measure of pedestrian safety behavior, 10 research assistants observed 29 students at a school intersection for completion of 8 selected pedestrian safety behaviors. Reliability was determined in two ways: correlations between the research assistants' ratings to that of the Principal Investigator (PI) and intraclass correlations (ICC) across research assistant ratings. Results: The SRTS travel survey had high test-retest reliability (κ = 0.97, n = 96, p < 0.001) and convergent validity (κ = 0.87, n = 81, p < 0.001). The pedestrian safety behavior checklist had moderate reliability across research assistants' ratings (ICC = 0.48) and moderate correlation with the PI (r = 0.55, p =< 0.01). When two raters simultaneously used the instrument, the ICC increased to 0.65. Overall percent agreement (91%), sensitivity (85%) and specificity (83%) were acceptable. Conclusions: These validated instruments can be used to assess SRTS programs. The pedestrian safety behavior checklist may benefit from further formative work. PMID:20482778

  14. Visual Impairment Screening Assessment (VISA) tool: pilot validation.

    PubMed

    Rowe, Fiona J; Hepworth, Lauren R; Hanna, Kerry L; Howard, Claire

    2018-03-06

    To report and evaluate a new Vision Impairment Screening Assessment (VISA) tool intended for use by the stroke team to improve identification of visual impairment in stroke survivors. Prospective case cohort comparative study. Stroke units at two secondary care hospitals and one tertiary centre. 116 stroke survivors were screened, 62 by naïve and 54 by non-naïve screeners. Both the VISA screening tool and the comprehensive specialist vision assessment measured case history, visual acuity, eye alignment, eye movements, visual field and visual inattention. Full completion of VISA tool and specialist vision assessment was achieved for 89 stroke survivors. Missing data for one or more sections typically related to patient's inability to complete the assessment. Sensitivity and specificity of the VISA screening tool were 90.24% and 85.29%, respectively; the positive and negative predictive values were 93.67% and 78.36%, respectively. Overall agreement was significant; k=0.736. Lowest agreement was found for screening of eye movement and visual inattention deficits. This early validation of the VISA screening tool shows promise in improving detection accuracy for clinicians involved in stroke care who are not specialists in vision problems and lack formal eye training, with potential to lead to more prompt referral with fewer false positives and negatives. Pilot validation indicates acceptability of the VISA tool for screening of visual impairment in stroke survivors. Sensitivity and specificity were high indicating the potential accuracy of the VISA tool for screening purposes. Results of this study have guided the revision of the VISA screening tool ahead of full clinical validation. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
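    Sensitivity, specificity, and the positive and negative predictive values reported in the record above are all derived from a 2x2 table comparing a screening result with a reference assessment. The sketch below shows the arithmetic; the function name and the counts are hypothetical and do not reproduce the VISA study's table, which the abstract does not give.

```python
def screening_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy measures from 2x2 counts.

    tp/fn: reference-positive cases flagged / missed by the screen.
    fp/tn: reference-negative cases flagged / correctly cleared.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases cleared
        "ppv": tp / (tp + fp),           # chance a positive screen is a true case
        "npv": tn / (tn + fn),           # chance a negative screen is a true non-case
    }


# Hypothetical counts for illustration only.
metrics = screening_metrics(tp=45, fp=10, fn=5, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

    Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the screened sample.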

  15. Generalizability and Validity of a Mathematics Performance Assessment.

    ERIC Educational Resources Information Center

    Lane, Suzanne; And Others

    1996-01-01

    Evidence from test results of 3,604 sixth and seventh graders is provided for the generalizability and validity of the Quantitative Understanding: Amplifying Student Achievement and Reasoning (QUASAR) Cognitive Assessment Instrument, which is designed to measure program outcomes and growth in mathematics. (SLD)

  16. The key-features approach to assess clinical decisions: validity evidence to date.

    PubMed

    Bordage, G; Page, G

    2018-05-17

    The key-features (KFs) approach to assessment was initially proposed during the First Cambridge Conference on Medical Education in 1984 as a more efficient and effective means of assessing clinical decision-making skills. Over three decades later, we conducted a comprehensive, systematic review of the validity evidence gathered since then. The evidence was compiled according to the Standards for Educational and Psychological Testing's five sources of validity evidence, namely, Content, Response process, Internal structure, Relations to other variables, and Consequences, to which we added two other types related to Cost-feasibility and Acceptability. Of the 457 publications that referred to the KFs approach between 1984 and October 2017, 164 are cited here; the remaining 293 were either redundant or the authors simply mentioned the KFs concept in relation to their work. While one set of articles reported meeting the validity standards, another set examined KFs test development choices and score interpretation. The accumulated validity evidence for the KFs approach since its inception supports the decision-making construct measured and its use to assess clinical decision-making skills at all levels of training and practice and with various types of exam formats. Recognizing that gathering validity evidence is an ongoing process, areas with limited evidence, such as item factor analyses or consequences of testing, are identified as well as new topics needing further clarification, such as the use of the KFs approach for formative assessment and its place within a program of assessment.

  17. Validation of GC and HPLC systems for residue studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, M.

    1995-12-01

    For residue studies, GC and HPLC system performance must be validated prior to and during use. One excellent measure of system performance is the standard curve and associated chromatograms used to construct that curve. The standard curve is a model of system response to an analyte over a specific time period, and is prima facie evidence of system performance beginning at the auto sampler and proceeding through the injector, column, detector, electronics, data-capture device, and printer/plotter. This tool measures the performance of the entire chromatographic system; its power negates most of the benefits associated with costly and time-consuming validation of individual system components. Other measures of instrument and method validation will be discussed, including quality control charts and experimental designs for method validation.

  18. Reliability and concurrent validity of a Smartphone, bubble inclinometer and motion analysis system for measurement of hip joint range of motion.

    PubMed

    Charlton, Paula C; Mentiplay, Benjamin F; Pua, Yong-Hao; Clark, Ross A

    2015-05-01

    Traditional methods of assessing joint range of motion (ROM) involve specialized tools that may not be widely available to clinicians. This study assesses the reliability and validity of a custom Smartphone application for assessing hip joint range of motion. Intra-tester reliability with concurrent validity. Passive hip joint range of motion was recorded for seven different movements in 20 males on two separate occasions. Data from a Smartphone, bubble inclinometer and a three dimensional motion analysis (3DMA) system were collected simultaneously. Intraclass correlation coefficients (ICCs), coefficients of variation (CV) and standard error of measurement (SEM) were used to assess reliability. To assess validity of the Smartphone application and the bubble inclinometer against the three dimensional motion analysis system, intraclass correlation coefficients and fixed and proportional biases were used. The Smartphone demonstrated good to excellent reliability (ICCs>0.75) for four out of the seven movements, and moderate to good reliability for the remaining three movements (ICC=0.63-0.68). Additionally, the Smartphone application displayed comparable reliability to the bubble inclinometer. The Smartphone application displayed excellent validity when compared to the three dimensional motion analysis system for all movements (ICCs>0.88) except one, which displayed moderate to good validity (ICC=0.71). Smartphones are portable and widely available tools that are mostly reliable and valid for assessing passive hip range of motion, with potential for large-scale use when a bubble inclinometer is not available. However, caution must be taken in its implementation as some movement axes demonstrated only moderate reliability. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  19. Measuring Teacher Effectiveness with the Pennsylvania Value-Added Assessment System

    ERIC Educational Resources Information Center

    Bowen, Naomi

    2017-01-01

    The purpose of this research was to determine if the Pennsylvania Value-Added Assessment System Average Growth Index (PVAAS AGI) scores, derived from standardized tests and calculated for Pennsylvania schools, provide a valid and reliable assessment of teacher effectiveness, as these scores are currently used to derive 15% of the annual…

  20. Statistically validated network of portfolio overlaps and systemic risk.

    PubMed

    Gualdi, Stanislao; Cimini, Giulio; Primicerio, Kevin; Di Clemente, Riccardo; Challet, Damien

    2016-12-21

    Common asset holding by financial institutions (portfolio overlap) is nowadays regarded as an important channel for financial contagion with the potential to trigger fire sales and severe losses at the systemic level. We propose a method to assess the statistical significance of the overlap between heterogeneously diversified portfolios, which we use to build a validated network of financial institutions where links indicate potential contagion channels. The method is implemented on a historical database of institutional holdings ranging from 1999 to the end of 2013, but can be applied to any bipartite network. We find that the proportion of validated links (i.e. of significant overlaps) increased steadily before the 2007-2008 financial crisis and reached a maximum when the crisis occurred. We argue that the nature of this measure implies that systemic risk from fire sales liquidation was maximal at that time. After a sharp drop in 2008, systemic risk resumed its growth in 2009, with a notable acceleration in 2013. We finally show that market trends tend to be amplified in the portfolios identified by the algorithm, such that it is possible to have an informative signal about institutions that are about to suffer (enjoy) the most significant losses (gains).
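    The abstract does not state which null model is used to test overlap significance. As an illustration only, the sketch below uses a plain hypergeometric null: two institutions hold d_i and d_j securities drawn at random from N, and the p-value is the probability of sharing at least the observed number. This simple null ignores the heterogeneity in diversification that the authors emphasize, and the function and parameter names are hypothetical.

```python
from math import comb


def overlap_p_value(n_securities, d_i, d_j, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(n_securities, d_i, d_j).

    Assumes portfolio j picks d_j securities uniformly at random, of which
    d_i are "marked" as held by portfolio i.
    """
    if not (0 <= d_i <= n_securities and 0 <= d_j <= n_securities):
        raise ValueError("portfolio sizes must lie in [0, n_securities]")
    start = max(overlap, 0)
    upper = min(d_i, d_j)
    total = comb(n_securities, d_j)
    tail = sum(
        comb(d_i, k) * comb(n_securities - d_i, d_j - k)
        for k in range(start, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 securities, two portfolios of 3, all 3 shared.
print(overlap_p_value(10, 3, 3, 3))
```

    A link between two institutions would be kept only if its p-value falls below a threshold corrected for the number of pairs tested; the choice of correction is left open here.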

  1. Statistically validated network of portfolio overlaps and systemic risk

    PubMed Central

    Gualdi, Stanislao; Cimini, Giulio; Primicerio, Kevin; Di Clemente, Riccardo; Challet, Damien

    2016-01-01

    Common asset holding by financial institutions (portfolio overlap) is nowadays regarded as an important channel for financial contagion with the potential to trigger fire sales and severe losses at the systemic level. We propose a method to assess the statistical significance of the overlap between heterogeneously diversified portfolios, which we use to build a validated network of financial institutions where links indicate potential contagion channels. The method is implemented on a historical database of institutional holdings ranging from 1999 to the end of 2013, but can be applied to any bipartite network. We find that the proportion of validated links (i.e. of significant overlaps) increased steadily before the 2007–2008 financial crisis and reached a maximum when the crisis occurred. We argue that the nature of this measure implies that systemic risk from fire sales liquidation was maximal at that time. After a sharp drop in 2008, systemic risk resumed its growth in 2009, with a notable acceleration in 2013. We finally show that market trends tend to be amplified in the portfolios identified by the algorithm, such that it is possible to have an informative signal about institutions that are about to suffer (enjoy) the most significant losses (gains). PMID:28000764

  2. Sensor Selection and Data Validation for Reliable Integrated System Health Management

    NASA Technical Reports Server (NTRS)

    Garg, Sanjay; Melcher, Kevin J.

    2008-01-01

    For new access to space systems with challenging mission requirements, effective implementation of integrated system health management (ISHM) must be available early in the program to support the design of systems that are safe, reliable, and highly autonomous. Early ISHM availability is also needed to promote design for affordable operations; increased knowledge of functional health provided by ISHM supports construction of more efficient operations infrastructure. Lack of early ISHM inclusion in the system design process could result in retrofitting health management systems to augment and expand operational and safety requirements, thereby increasing program cost and risk due to increased instrumentation and computational complexity. Having the right sensors generating the required data to perform condition assessment, such as fault detection and isolation, with a high degree of confidence is critical to reliable operation of ISHM. Also, the data being generated by the sensors needs to be qualified to ensure that the assessments made by the ISHM are not based on faulty data. NASA Glenn Research Center has been developing technologies for sensor selection and data validation as part of the FDDR (Fault Detection, Diagnosis, and Response) element of the Upper Stage project of the Ares 1 launch vehicle development. This presentation will provide an overview of the GRC approach to sensor selection and data quality validation and will present recent results from applications that are representative of the complexity of propulsion systems for access to space vehicles. A brief overview of the sensor selection and data quality validation approaches is provided below. The NASA GRC developed Systematic Sensor Selection Strategy (S4) is a model-based procedure for systematically and quantitatively selecting an optimal sensor suite to provide overall health assessment of a host system. S4 can be logically partitioned into three major subdivisions: the knowledge base, the down

  3. Validity evidence as a key marker of quality of technical skill assessment in OTL-HNS.

    PubMed

    Labbé, Mathilde; Young, Meredith; Nguyen, Lily H P

    2018-01-13

    Quality monitoring of assessment practices should be a priority in all residency programs. Validity evidence is one of the main hallmarks of assessment quality and should be collected to support the interpretation and use of assessment data. Our objective was to identify, synthesize, and present the validity evidence reported supporting different technical skill assessment tools in otolaryngology-head and neck surgery (OTL-HNS). We performed a secondary analysis of data generated through a systematic review of all published tools for assessing technical skills in OTL-HNS (n = 16). For each tool, we coded validity evidence according to the five types of evidence described by the American Educational Research Association's interpretation of Messick's validity framework. Descriptive statistical analyses were conducted. All 16 tools included in our analysis were supported by internal structure and relationship to variables validity evidence. Eleven articles presented evidence supporting content. Response process was discussed only in one article, and no study reported on evidence exploring consequences. We present the validity evidence reported for 16 rater-based tools that could be used for work-based assessment of OTL-HNS residents in the operating room. The articles included in our review were consistently deficient in evidence for response process and consequences. Rater-based assessment tools that support high-stakes decisions that impact the learner and programs should include several sources of validity evidence. Thus, use of any assessment should be done with careful consideration of the context-specific validity evidence supporting score interpretation, and we encourage deliberate continual assessment quality-monitoring. NA. Laryngoscope, 2018. © 2018 The American Laryngological, Rhinological and Otological Society, Inc.

  4. Development and Initial Validation of the Self-Assessed Lupus Damage Index Questionnaire (LDIQ)

    PubMed Central

    Costenbader, Karen H.; Khamashta, Munther; Ruiz-Garcia, Silvia; Perez-Rodriguez, Maria Teresa; Petri, Michelle; Elliott, Jennifer; Manzi, Susan; Karlson, Elizabeth W.; Turner-Stokes, Tabitha; Bermas, Bonnie; Coblyn, Jonathan; Massarotti, Elena; Schur, Peter; Fraser, Patricia; Navarro, Iris; Hanly, John G.; Shaver, Timothy S.; Katz, Robert S.; Chakravarty, Eliza; Fortin, Paul R.; Sanchez, Martha L.; Liu, Jigna; Michaud, Kaleb; Alarcón, Graciela S.; Wolfe, Frederick

    2010-01-01

    Purpose The SLICC Damage Index (SDI) is a validated instrument for assessing organ damage in systemic lupus erythematosus (SLE). Trained physicians must complete it, limiting utility where this is impossible. Methods We developed and pilot-tested a self-assessed organ damage instrument, the Lupus Damage Index Questionnaire (LDIQ), in 37 SLE subjects and 7 physicians. After refinement, 569 English-speaking SLE subjects and 14 rheumatologists from 11 international SLE clinics participated in validation. Subjects and physicians completed instruments separately. We calculated sensitivity, specificity, Spearman correlations and agreement, using the SDI as gold standard. 605 SLE participants in the community-based National Data Bank for Rheumatic Diseases (NDB) study completed the LDIQ and we assessed correlations with outcome and disability measures. Results Mean LDIQ score was 3.3 (0-16) and mean SDI score was 1.5 (0-9). LDIQ had a moderately high correlation with SDI (Spearman r=0.50, p<0.001). Specificities of individual LDIQ items were >80%, except for neuropathy. Sensitivities were variable and lowest for damage with <1% prevalence. Agreement between SDI and LDIQ was > 85% for all but neuropathy, reduced renal function, deforming arthritis and alopecia. In the NDB, LDIQ correlated well with comorbidity index (r=0.45), SF-36 physical component scale (0.43), Medical Research Council dyspnea scale (0.40), disability (0.37) and SLE Activity Questionnaire score (0.37). Conclusions The LDIQ’s metric properties are good compared to the SDI. It has construct validity and correlations with health assessments similar to the SDI. The LDIQ should allow expansion of SLE research. Its ultimate value will be determined in longitudinal studies. PMID:20391512

  5. Soil Moisture Active Passive Mission L4_SM Data Product Assessment (Version 2 Validated Release)

    NASA Technical Reports Server (NTRS)

    Reichle, Rolf Helmut; De Lannoy, Gabrielle J. M.; Liu, Qing; Ardizzone, Joseph V.; Chen, Fan; Colliander, Andreas; Conaty, Austin; Crow, Wade; Jackson, Thomas; Kimball, John

    2016-01-01

    During the post-launch SMAP calibration and validation (Cal/Val) phase there are two objectives for each science data product team: 1) calibrate, verify, and improve the performance of the science algorithm, and 2) validate the accuracy of the science data product as specified in the science requirements and according to the Cal/Val schedule. This report provides an assessment of the SMAP Level 4 Surface and Root Zone Soil Moisture Passive (L4_SM) product specifically for the product's public Version 2 validated release scheduled for 29 April 2016. The assessment of the Version 2 L4_SM data product includes comparisons of SMAP L4_SM soil moisture estimates with in situ soil moisture observations from core validation sites and sparse networks. The assessment further includes a global evaluation of the internal diagnostics from the ensemble-based data assimilation system that is used to generate the L4_SM product. This evaluation focuses on the statistics of the observation-minus-forecast (O-F) residuals and the analysis increments. Together, the core validation site comparisons and the statistics of the assimilation diagnostics are considered primary validation methodologies for the L4_SM product. Comparisons against in situ measurements from regional-scale sparse networks are considered a secondary validation methodology because such in situ measurements are subject to up-scaling errors from the point-scale to the grid cell scale of the data product. Based on the limited set of core validation sites, the wide geographic range of the sparse network sites, and the global assessment of the assimilation diagnostics, the assessment presented here meets the criteria established by the Committee on Earth Observing Satellites for Stage 2 validation and supports the validated release of the data. An analysis of the time average surface and root zone soil moisture shows that the global pattern of arid and humid regions is captured by the L4_SM estimates. Results from the

  6. Applying Kane's Validity Framework to a Simulation Based Assessment of Clinical Competence

    ERIC Educational Resources Information Center

    Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud

    2018-01-01

    Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…

  7. The reliability and validity of the Complex Task Performance Assessment: A performance-based assessment of executive function.

    PubMed

    Wolf, Timothy J; Dahl, Abigail; Auen, Colleen; Doherty, Meghan

    2017-07-01

    The objective of this study was to evaluate the inter-rater reliability, test-retest reliability, concurrent validity, and discriminant validity of the Complex Task Performance Assessment (CTPA): an ecologically valid performance-based assessment of executive function. Community control participants (n = 20) and individuals with mild stroke (n = 14) participated in this study. All participants completed the CTPA and a battery of cognitive assessments at initial testing. The control participants completed the CTPA at two different times one week apart. The intra-class correlation coefficient (ICC) for inter-rater reliability for the total score on the CTPA was .991. The ICCs for all of the sub-scores of the CTPA were also high (.889-.977). The CTPA total score was significantly correlated to Condition 4 of the DKEFS Color-Word Interference Test (ρ = -.425), and the Wechsler Test of Adult Reading (ρ = -.493). Finally, there were significant differences between control subjects and individuals with mild stroke on the total score of the CTPA (p = .007) and all sub-scores except interpretation failures and total items incorrect. These results are also consistent with other current executive function performance-based assessments and indicate that the CTPA is a reliable and valid performance-based measure of executive function.

  8. Improving the Validity of Activity of Daily Living Dependency Risk Assessment

    PubMed Central

    Clark, Daniel O.; Stump, Timothy E.; Tu, Wanzhu; Miller, Douglas K.

    2015-01-01

    Objectives Efforts to prevent activity of daily living (ADL) dependency may be improved through models that assess older adults’ dependency risk. We evaluated whether cognition and gait speed measures improve the predictive validity of interview-based models. Method Participants were 8,095 self-respondents in the 2006 Health and Retirement Survey who were aged 65 years or over and independent in five ADLs. Incident ADL dependency was determined from the 2008 interview. Models were developed using random 2/3rd cohorts and validated in the remaining 1/3rd. Results Compared to a c-statistic of 0.79 in the best interview model, the model including cognitive measures had c-statistics of 0.82 and 0.80 while the best fitting gait speed model had c-statistics of 0.83 and 0.79 in the development and validation cohorts, respectively. Conclusion Two relatively brief models, one that requires an in-person assessment and one that does not, had excellent validity for predicting incident ADL dependency but did not significantly improve the predictive validity of the best fitting interview-based models. PMID:24652867

  9. The Hyper-X Flight Systems Validation Program

    NASA Technical Reports Server (NTRS)

    Redifer, Matthew; Lin, Yohan; Bessent, Courtney Amos; Barklow, Carole

    2007-01-01

    For the Hyper-X/X-43A program, the development of a comprehensive validation test plan played an integral part in the success of the mission. The goal was to demonstrate hypersonic propulsion technologies by flight testing an airframe-integrated scramjet engine. Preparation for flight involved both verification and validation testing. By definition, verification is the process of assuring that the product meets design requirements; whereas validation is the process of assuring that the design meets mission requirements for the intended environment. This report presents an overview of the program with emphasis on the validation efforts. It includes topics such as hardware-in-the-loop, failure modes and effects, aircraft-in-the-loop, plugs-out, power characterization, antenna pattern, integration, combined systems, captive carry, and flight testing. Where applicable, test results are also discussed. The report provides a brief description of the flight systems onboard the X-43A research vehicle and an introduction to the ground support equipment required to execute the validation plan. The intent is to provide validation concepts that are applicable to current, follow-on, and next generation vehicles that share the hybrid spacecraft and aircraft characteristics of the Hyper-X vehicle.

  10. Automatic control system generation for robot design validation

    NASA Technical Reports Server (NTRS)

    Bacon, James A. (Inventor); English, James D. (Inventor)

    2012-01-01

    The specification and drawings present a new method, system, software product, and apparatus for generating a robotic validation system for a robot design. The robotic validation system for the robot design of a robotic system is automatically generated by converting a robot design into a generic robotic description using a predetermined format, then generating a control system from the generic robotic description and finally updating robot design parameters of the robotic system with an analysis tool using both the generic robot description and the control system.

  11. A new dataset validation system for the Planetary Science Archive

    NASA Astrophysics Data System (ADS)

    Manaud, N.; Zender, J.; Heather, D.; Martinez, S.

    2007-08-01

    The Planetary Science Archive is the official archive for the Mars Express mission. It received its first data by the end of 2004. These data are delivered by the PI teams to the PSA team as datasets, which are formatted in conformance with the Planetary Data System (PDS). The PI teams are responsible for analyzing and calibrating the instrument data as well as for producing reduced and calibrated data. They are also responsible for the scientific validation of these data. ESA is responsible for the long-term data archiving and distribution to the scientific community and must ensure, in this regard, that all archived products meet quality standards. To do so, an archive peer-review is used to control the quality of the Mars Express science data archiving process. However, a full validation of its content is missing. An independent review board recently recommended that the completeness of the archive as well as the consistency of the delivered data should be validated following well-defined procedures. A new validation software tool is being developed to complete the overall data quality control system functionality. This new tool aims to improve the quality of data and services provided to the scientific community through the PSA, and shall make it possible to track anomalies in datasets and to control their completeness. It shall ensure that the PSA end-users: (1) can rely on the result of their queries, (2) will get data products that are suitable for scientific analysis, (3) can find all science data acquired during a mission. We defined dataset validation as the verification and assessment process to check the dataset content against pre-defined top-level criteria, which represent the general characteristics of good quality datasets. The dataset content that is checked includes the data and all types of information that are essential in the process of deriving scientific results and those interfacing with the PSA database. The validation software tool is a multi-mission tool that

  12. Translation and validation of the Canadian diabetes risk assessment questionnaire in China.

    PubMed

    Guo, Jia; Shi, Zhengkun; Chen, Jyu-Lin; Dixon, Jane K; Wiley, James; Parry, Monica

    2018-01-01

    To adapt the Canadian Diabetes Risk Assessment Questionnaire for the Chinese population and to evaluate its psychometric properties. A cross-sectional study was conducted with a convenience sample of 194 individuals aged 35-74 years from October 2014 to April 2015. The Canadian Diabetes Risk Assessment Questionnaire was adapted and translated for the Chinese population. Test-retest reliability was conducted to measure stability. Criterion and convergent validity of the adapted questionnaire were assessed using 2-hr 75 g oral glucose tolerance tests and the Finnish Diabetes Risk Scores, respectively. Sensitivity and specificity were evaluated to establish its predictive validity. The test-retest reliability was 0.988. Adequate validity of the adapted questionnaire was demonstrated by positive correlations found between the scores and 2-hr 75 g oral glucose tolerance tests (r = .343, p < .001) and with the Finnish Diabetes Risk Scores (r = .738, p < .001). The area under receiver operating characteristic curve was 0.705 (95% CI .632, .778), demonstrating moderate diagnostic value at a cutoff score of 30. The sensitivity was 73%, with a positive predictive value of 57% and negative predictive value of 78%. Our results provided evidence supporting the translation consistency, content validity, convergent validity, criterion validity, sensitivity, and specificity of the translated Canadian Diabetes Risk Assessment Questionnaire with minor modifications. This paper provides clinical, practical, and methodological information on how to adapt a diabetes risk calculator between cultures for public health nurses. © 2017 Wiley Periodicals, Inc.
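    The diagnostic measures quoted above (sensitivity, specificity, positive and negative predictive value) all derive from a 2×2 table of questionnaire result against reference test. A minimal sketch of those definitions follows; the counts are hypothetical and do not reproduce the study's data, and the function name is an assumption made for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp/fn: reference-positive cases flagged / missed by the questionnaire.
    fp/tn: reference-negative cases flagged / correctly cleared.
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("each row and column of the table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts at a single cutoff score (illustrative only).
for name, value in diagnostic_metrics(tp=30, fp=20, fn=10, tn=40).items():
    print(f"{name}: {value:.2f}")
```

    Sensitivity and specificity depend only on the cutoff, whereas the predictive values also shift with the prevalence of the condition in the sample.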

  13. VALUE: Valid Assessment of Learning in Undergraduate Education

    ERIC Educational Resources Information Center

    Rhodes, Terrel L.

    2008-01-01

    This chapter discusses the Association of American Colleges and Universities' (AAC&U's) Valid Assessment of Learning in Undergraduate Education (VALUE) project, which aims to demonstrate that faculty across the country share fundamental expectations about student learning on all of the essential learning outcomes deemed critical for student…

  14. Validity of the Child Facial Coding System for the Assessment of Acute Pain in Children With Cerebral Palsy.

    PubMed

    Hadden, Kellie L; LeFort, Sandra; O'Brien, Michelle; Coyte, Peter C; Guerriere, Denise N

    2016-04-01

    The purpose of the current study was to examine the concurrent and discriminant validity of the Child Facial Coding System for children with cerebral palsy. Eighty-five children (mean = 8.35 years, SD = 4.72 years) were videotaped during a passive joint stretch with their physiotherapist and during 3 time segments: baseline, passive joint stretch, and recovery. Children's pain responses were rated from videotape using the Numerical Rating Scale and Child Facial Coding System. Results indicated that Child Facial Coding System scores during the passive joint stretch significantly correlated with Numerical Rating Scale scores (r = .72, P < .01). Child Facial Coding System scores were also significantly higher during the passive joint stretch than the baseline and recovery segments (P < .001). Facial activity was not significantly correlated with the developmental measures. These findings suggest that the Child Facial Coding System is a valid method of identifying pain in children with cerebral palsy. © The Author(s) 2015.

  15. Developing and Validating a New Classroom Climate Observation Assessment Tool

    PubMed Central

    Leff, Stephen S.; Thomas, Duane E.; Shapiro, Edward S.; Paskewich, Brooke; Wilson, Kim; Necowitz-Hoffman, Beth; Jawad, Abbas F.

    2011-01-01

    The climate of school classrooms, shaped by a combination of teacher practices and peer processes, is an important determinant for children’s psychosocial functioning and is a primary factor affecting bullying and victimization. Given that there are relatively few theoretically-grounded and validated assessment tools designed to measure the social climate of classrooms, our research team developed an observation tool through participatory action research (PAR). This article details how the assessment tool was designed and preliminarily validated in 18 third-, fourth-, and fifth-grade classrooms in a large urban public school district. The goals of this study are to illustrate the feasibility of a PAR paradigm in measurement development, ascertain the psychometric properties of the assessment tool, and determine associations with different indices of classroom levels of relational and physical aggression. PMID:21643447

  16. Developing and Validating a New Classroom Climate Observation Assessment Tool.

    PubMed

    Leff, Stephen S; Thomas, Duane E; Shapiro, Edward S; Paskewich, Brooke; Wilson, Kim; Necowitz-Hoffman, Beth; Jawad, Abbas F

    2011-01-01

    The climate of school classrooms, shaped by a combination of teacher practices and peer processes, is an important determinant for children's psychosocial functioning and is a primary factor affecting bullying and victimization. Given that there are relatively few theoretically-grounded and validated assessment tools designed to measure the social climate of classrooms, our research team developed an observation tool through participatory action research (PAR). This article details how the assessment tool was designed and preliminarily validated in 18 third-, fourth-, and fifth-grade classrooms in a large urban public school district. The goals of this study are to illustrate the feasibility of a PAR paradigm in measurement development, ascertain the psychometric properties of the assessment tool, and determine associations with different indices of classroom levels of relational and physical aggression.

  17. Assessing the validity of discourse analysis: transdisciplinary convergence

    NASA Astrophysics Data System (ADS)

    Jaipal-Jamani, Kamini

    2014-12-01

    Research studies using discourse analysis approaches make claims about phenomena or issues based on interpretation of written or spoken text, which includes images and gestures. How are findings/interpretations from discourse analysis validated? This paper proposes transdisciplinary convergence as a way to validate discourse analysis approaches to research. The argument is made that discourse analysis explicitly grounded in semiotics, systemic functional linguistics, and critical theory, offers a credible research methodology. The underlying assumptions, constructs, and techniques of analysis of these three theoretical disciplines can be drawn on to show convergence of data at multiple levels, validating interpretations from text analysis.

  18. Validation of Patient-Reported Outcomes Measurement Information System Short Forms for Use in Childhood-Onset Systemic Lupus Erythematosus.

    PubMed

    Jones, Jordan T; Carle, Adam C; Wootton, Janet; Liberio, Brianna; Lee, Jiha; Schanberg, Laura E; Ying, Jun; Morgan DeWitt, Esi; Brunner, Hermine I

    2017-01-01

    To validate the pediatric Patient-Reported Outcomes Measurement Information System short forms (PROMIS-SFs) in childhood-onset systemic lupus erythematosus (SLE) in a clinical setting. At 3 study visits, childhood-onset SLE patients completed the PROMIS-SFs (anger, anxiety, depressive symptoms, fatigue, physical function-mobility, physical function-upper extremity, pain interference, and peer relationships) using the PROMIS assessment center, and health-related quality of life (HRQoL) legacy measures (Pediatric Quality of Life Inventory, Childhood Health Assessment Questionnaire, Simple Measure of Impact of Lupus Erythematosus in Youngsters [SMILEY], and visual analog scales [VAS] of pain and well-being). Physicians rated childhood-onset SLE activity on a VAS and completed the Systemic Lupus Erythematosus Disease Activity Index 2000. Using a global rating scale of change (GRC) between study visits, physicians rated change of childhood-onset SLE activity (GRC-MD1: better/same/worse) and change of patient overall health (GRC-MD2: better/same/worse). Questionnaire scores were compared in support of validity and responsiveness to change (external standards: GRC-MD1, GRC-MD2). In this population-based cohort (n = 100) with a mean age of 15.8 years (range 10-20 years), the PROMIS-SFs were completed in less than 5 minutes in a clinical setting. The PROMIS-SF scores correlated at least moderately (Pearson's r ≥ 0.5) with those of legacy HRQoL measures, except for the SMILEY. Measures of childhood-onset SLE activity did not correlate with the PROMIS-SFs. Responsiveness to change of the PROMIS-SFs was supported by path, mixed-model, and correlation analyses. To assess HRQoL in childhood-onset SLE, the PROMIS-SFs demonstrated feasibility, internal consistency, construct validity, and responsiveness to change in a clinical setting. © 2016, American College of Rheumatology.

  19. Understanding Interrater Reliability and Validity of Risk Assessment Tools Used to Predict Adverse Clinical Events.

    PubMed

    Siedlecki, Sandra L; Albert, Nancy M

    This article will describe how to assess interrater reliability and validity of risk assessment tools, using easy-to-follow formulas, and to provide calculations that demonstrate principles discussed. Clinical nurse specialists should be able to identify risk assessment tools that provide high-quality interrater reliability and the highest validity for predicting true events of importance to clinical settings. Making best practice recommendations for assessment tool use is critical to high-quality patient care and safe practices that impact patient outcomes and nursing resources. Optimal risk assessment tool selection requires knowledge about interrater reliability and tool validity. The clinical nurse specialist will understand the reliability and validity issues associated with risk assessment tools, and be able to evaluate tools using basic calculations. Risk assessment tools are developed to objectively predict quality and safety events and ultimately reduce the risk of event occurrence through preventive interventions. To ensure high-quality tool use, clinical nurse specialists must critically assess tool properties. The better the tool's ability to predict adverse events, the more likely that event risk is mediated. Interrater reliability and validity assessment is a relatively easy skill to master and will result in better decisions when selecting or making recommendations for risk assessment tool use.
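    The abstract does not say which interrater statistic the article uses. As one common choice, assumed here purely for illustration, Cohen's kappa corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). The ratings below are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical high-risk (1) / low-risk (0) ratings from two nurses.
nurse_a = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
nurse_b = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
print(round(cohens_kappa(nurse_a, nurse_b), 2))
```

    Raw percent agreement for these ratings is 80%, but half of that would be expected by chance, which is why the chance-corrected value is lower.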

  20. Fault-tolerant clock synchronization validation methodology. [in computer systems]

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Palumbo, Daniel L.; Johnson, Sally C.

    1987-01-01

    A validation method for the synchronization subsystem of a fault-tolerant computer system is presented. The high reliability requirement of flight-crucial systems precludes the use of most traditional validation methods. The method presented utilizes formal design proof to uncover design and coding errors and experimentation to validate the assumptions of the design proof. The experimental method is described and illustrated by validating the clock synchronization system of the Software Implemented Fault Tolerance computer. The design proof of the algorithm includes a theorem that defines the maximum skew between any two nonfaulty clocks in the system in terms of specific system parameters. Most of these parameters are deterministic. One crucial parameter is the upper bound on the clock read error, which is stochastic. The probability that this upper bound is exceeded is calculated from data obtained by the measurement of system parameters. This probability is then included in a detailed reliability analysis of the system.

  1. Predicting the ungauged basin: model validation and realism assessment

    NASA Astrophysics Data System (ADS)

    van Emmerik, Tim; Mulder, Gert; Eilander, Dirk; Piet, Marijn; Savenije, Hubert

    2016-04-01

    The hydrological decade on Predictions in Ungauged Basins (PUB) [1] led to many new insights in model development, calibration strategies, data acquisition and uncertainty analysis. Due to a limited amount of published studies on genuinely ungauged basins, model validation and realism assessment of model outcome has not been discussed to a great extent. With this study [2] we aim to contribute to the discussion on how one can determine the value and validity of a hydrological model developed for an ungauged basin. As in many cases no local, or even regional, data are available, alternative methods should be applied. Using a PUB case study in a genuinely ungauged basin in southern Cambodia, we give several examples of how one can use different types of soft data to improve model design, calibrate and validate the model, and assess the realism of the model output. A rainfall-runoff model was coupled to an irrigation reservoir, allowing the use of additional and unconventional data. The model was mainly forced with remote sensing data, and local knowledge was used to constrain the parameters. Model realism assessment was done using data from surveys. This resulted in a successful reconstruction of the reservoir dynamics, and revealed the different hydrological characteristics of the two topographical classes. We do not present a generic approach that can be transferred to other ungauged catchments, but we aim to show how clever model design and alternative data acquisition can result in a valuable hydrological model for ungauged catchments. [1] Sivapalan, M., Takeuchi, K., Franks, S., Gupta, V., Karambiri, H., Lakshmi, V., et al. (2003). IAHS decade on predictions in ungauged basins (PUB), 2003-2012: shaping an exciting future for the hydrological sciences. Hydrol. Sci. J. 48, 857-880. doi: 10.1623/hysj.48.6.857.51421 [2] van Emmerik, T., Mulder, G., Eilander, D., Piet, M. and Savenije, H. (2015). Predicting the ungauged basin: model validation and realism assessment

  2. A noise assessment and prediction system

    NASA Technical Reports Server (NTRS)

    Olsen, Robert O.; Noble, John M.

    1990-01-01

    A system has been designed to provide an assessment of noise levels that result from testing activities at Aberdeen Proving Ground, Md. The system receives meteorological data from surface stations and an upper air sounding system. The data from these systems are sent to a meteorological model, which provides forecasting conditions for up to three hours from the test time. The meteorological data are then used as input into an acoustic ray trace model which projects sound level contours onto a two-dimensional display of the surrounding area. This information is sent to the meteorological office for verification, as well as the range control office, and the environmental office. To evaluate the noise level predictions, a series of microphones are located off the reservation to receive the sound and transmit this information back to the central display unit. The computer models are modular allowing for a variety of models to be utilized and tested to achieve the best agreement with data. This technique of prediction and model validation will be used to improve the noise assessment system.

  3. The Predictive Validity of Dynamic Assessment: A Review

    ERIC Educational Resources Information Center

    Caffrey, Erin; Fuchs, Douglas; Fuchs, Lynn S.

    2008-01-01

    The authors report on a mixed-methods review of 24 studies that explores the predictive validity of dynamic assessment (DA). For 15 of the studies, they conducted quantitative analyses using Pearson's correlation coefficients. They descriptively examined the remaining studies to determine if their results were consistent with findings from the…

  4. Use of an automated learning management system to validate nursing competencies.

    PubMed

    Dumpe, Michelle L; Kanyok, Nancy; Hill, Kristin

    2007-01-01

    Maintaining nurse competencies in a dynamic environment is not an easy task and requires the use of resources already strained. An online learning management system was created, and 24 annual competencies were redesigned for online validation. As a result of this initiative, competencies have been standardized across many disciplines and are completed in a more timely manner, nurses and managers are more satisfied with this method of annual assessments, and cost savings have been realized.

  5. Development, Validation, and Verification of a Self-Assessment Tool to Estimate Agnibala (Digestive Strength).

    PubMed

    Singh, Aparna; Singh, Girish; Patwardhan, Kishor; Gehlot, Sangeeta

    2017-01-01

    According to Ayurveda, the traditional system of healthcare of Indian origin, Agni is the factor responsible for digestion and metabolism. Four functional states (Agnibala) of Agni have been recognized: regular, irregular, intense, and weak. The objective of the present study was to develop and validate a self-assessment tool to estimate Agnibala. The developed tool was evaluated for its reliability and validity by administering it to 300 healthy volunteers of either gender belonging to 18 to 40-year age group. Besides confirming the statistical validity and reliability, the practical utility of the newly developed tool was also evaluated by recording serum lipid parameters of all the volunteers. The results show that the lipid parameters vary significantly according to the status of Agni. The tool, therefore, may be used to screen normal population to look for possible susceptibility to certain health conditions. © The Author(s) 2016.

  6. Assessment of the Validity of the Research Diagnostic Criteria for Temporomandibular Disorders: Overview and Methodology

    PubMed Central

    Schiffman, Eric L.; Truelove, Edmond L.; Ohrbach, Richard; Anderson, Gary C.; John, Mike T.; List, Thomas; Look, John O.

    2011-01-01

    AIMS The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. An overview is presented, including Axis I and II methodology and descriptive statistics for the study participant sample. This paper details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. Validity testing for the Axis II biobehavioral instruments was based on previously validated reference standards. METHODS The Axis I reference standards were based on the consensus of 2 criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion exam reliability was also assessed within study sites. RESULTS Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion exam agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). CONCLUSION The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods. PMID:20213028

  7. Validity of the Miller forensic assessment of symptoms test in psychiatric inpatients.

    PubMed

    Veazey, Connie H; Wagner, Alisha L; Hays, J Ray; Miller, Holly A

    2005-06-01

    This study investigated the validity of the Miller Forensic Assessment of Symptoms Test (M-FAST), a brief measure of malingering, in an inpatient psychiatric sample of 70. Among those patients who also completed the Personality Assessment Inventory (N=44), Total M-FAST score was related in the expected directions to the Personality Assessment Inventory validity scales and indexes, providing evidence for concurrent validity of the M-FAST. With the PAI malingering index used as a criterion, we examined the diagnostic efficiency of the M-FAST and found a cut score of 8 represented the best balance of sensitivity, specificity, positive predictive power, and negative predictive power. Based on this cut-score of 8, 16% of the population was classified as malingering. The M-FAST appears to be an excellent rapid screen for symptom exaggeration in this population and setting.
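
    The four diagnostic-efficiency indices named in this abstract can all be read off a 2x2 table of screen result against criterion. The sketch below is illustrative only: the scores and criterion labels are invented, and treating a score at or above the cut score as a positive screen is an assumption, since the abstract does not state the direction of the rule.

```python
def diagnostic_efficiency(scores, criterion_positive, cut_score):
    """Sensitivity, specificity, PPV and NPV at one cut score (illustrative sketch).

    Assumption: a score >= cut_score counts as a positive screen.
    """
    tp = fp = tn = fn = 0
    for score, positive in zip(scores, criterion_positive):
        flagged = score >= cut_score
        if flagged and positive:
            tp += 1
        elif flagged and not positive:
            fp += 1
        elif not flagged and positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty row or column) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # criterion positives that were flagged
        "specificity": ratio(tn, tn + fp),  # criterion negatives that were not flagged
        "ppv": ratio(tp, tp + fp),          # flagged cases that are criterion positive
        "npv": ratio(tn, tn + fn),          # unflagged cases that are criterion negative
    }


# Hypothetical screening scores and criterion classifications (not study data).
scores = [2, 9, 11, 4, 8, 7, 3, 12, 6, 10]
criterion = [False, True, True, False, False, True, False, True, False, True]
result = diagnostic_efficiency(scores, criterion, cut_score=8)
print(result)
```

    Repeating the call over a range of cut scores and comparing the four values is one simple way to look for the kind of balance the abstract describes.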

  8. Validity Evidence for Games as Assessment Environments. CRESST Report 773

    ERIC Educational Resources Information Center

    Delacruz, Girlie C.; Chung, Gregory K. W. K.; Baker, Eva L.

    2010-01-01

    This study provides empirical evidence of a highly specific use of games in education--the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also…

  9. A knowledge-based patient assessment system: conceptual and technical design.

    PubMed Central

    Reilly, C. A.; Zielstorff, R. D.; Fox, R. L.; O'Connell, E. M.; Carroll, D. L.; Conley, K. A.; Fitzgerald, P.; Eng, T. K.; Martin, A.; Zidik, C. M.; Segal, M.

    2000-01-01

    This paper describes the design of an inpatient patient assessment application that captures nursing assessment data using a wireless laptop computer. The primary aim of this system is to capture structured information for facilitating decision support and quality monitoring. The system also aims to improve efficiency of recording patient assessments, reduce costs, and improve discharge planning and early identification of patient learning needs. Object-oriented methods were used to elicit functional requirements and to model the proposed system. A tools-based development approach is being used to facilitate rapid development and easy modification of assessment items and rules for decision support. Criteria for evaluation include perceived utility by clinician users, validity of decision support rules, time spent recording assessments, and perceived utility of aggregate reports for quality monitoring. PMID:11079970

  10. A knowledge-based patient assessment system: conceptual and technical design.

    PubMed

    Reilly, C A; Zielstorff, R D; Fox, R L; O'Connell, E M; Carroll, D L; Conley, K A; Fitzgerald, P; Eng, T K; Martin, A; Zidik, C M; Segal, M

    2000-01-01

    This paper describes the design of an inpatient patient assessment application that captures nursing assessment data using a wireless laptop computer. The primary aim of this system is to capture structured information for facilitating decision support and quality monitoring. The system also aims to improve efficiency of recording patient assessments, reduce costs, and improve discharge planning and early identification of patient learning needs. Object-oriented methods were used to elicit functional requirements and to model the proposed system. A tools-based development approach is being used to facilitate rapid development and easy modification of assessment items and rules for decision support. Criteria for evaluation include perceived utility by clinician users, validity of decision support rules, time spent recording assessments, and perceived utility of aggregate reports for quality monitoring.

  11. Assessing behavioural changes in ALS: cross-validation of ALS-specific measures.

    PubMed

    Pinto-Grau, Marta; Costello, Emmet; O'Connor, Sarah; Elamin, Marwa; Burke, Tom; Heverin, Mark; Pender, Niall; Hardiman, Orla

    2017-07-01

    The Beaumont Behavioural Inventory (BBI) is a behavioural proxy report for the assessment of behavioural changes in ALS. This tool has been validated against the FrSBe, a non-ALS-specific behavioural assessment, and further comparison of the BBI against a disease-specific tool was considered. This study cross-validates the BBI against the ALS-FTD-Q. Sixty ALS patients, 8% also meeting criteria for FTD, were recruited. All patients were evaluated using the BBI and the ALS-FTD-Q, completed by a carer. Correlational analysis was performed to assess construct validity. Precision, sensitivity, specificity, and overall accuracy of the BBI when compared to the ALS-FTD-Q, were obtained. The mean score of the whole sample on the BBI was 11.45 ± 13.06. ALS-FTD patients scored significantly higher than non-demented ALS patients (31.6 ± 14.64, 9.62 ± 11.38; p < 0.0001). A significant large positive correlation between the BBI and the ALS-FTD-Q was observed (r = 0.807, p < 0.0001), and no significant correlations between the BBI and other clinical/demographic characteristics indicate good convergent and discriminant validity, respectively. 72% of overall concordance was observed. Precision, sensitivity, and specificity for the classification of severely impaired patients were adequate. However, lower concordance in the classification of mild behavioural changes was observed, with higher sensitivity using the BBI, most likely secondary to BBI items which endorsed behavioural aspects not measured by the ALS-FTD-Q. Good construct validity has been further confirmed when the BBI is compared to an ALS-specific tool. Furthermore, the BBI is a more comprehensive behavioural assessment for ALS, as it measures the whole behavioural spectrum in this condition.

  12. Criterion-Related Validity: Assessing the Value of Subscores

    ERIC Educational Resources Information Center

    Davison, Mark L.; Davenport, Ernest C., Jr.; Chang, Yu-Feng; Vue, Kory; Su, Shiyang

    2015-01-01

    Criterion-related profile analysis (CPA) can be used to assess whether subscores of a test or test battery account for more criterion variance than does a single total score. Application of CPA to subscore evaluation is described, compared to alternative procedures, and illustrated using SAT data. Considerations other than validity and reliability…

  13. Validation of the Sinhala version of the Repeatable Battery for Assessment of Neuropsychological Status (RBANS)

    PubMed

    Suraweera, Chathurie; Anandakumar, D; Dahanayake, D; Subendran, M; Perera, U T; Hanwella, Raveen; de Silva, Varuni

    2016-12-30

    Only the Mini mental state examination (MMSE) and Montreal Cognitive Assessment scale have been validated in a Sri Lankan population for the assessment of cognitive functions. Both tests are deficient in the number of domains assessed. Therefore validation of Repeatable Battery for Assessment of Neuropsychological Status is important as it assesses most of the cognitive domains. To culturally adapt RBANS and investigate the validity and reliability of culturally adapted RBANS (RBANS-S). Fifty four participants with major neurocognitive disorder and 60 normal controls aged >50 were administered the RBANS-S at the Cognitive Assessment Unit, Faculty of Medicine, Colombo and National Hospital of Sri Lanka. The participants were selected after a detailed clinical assessment according to Diagnostic and Statistical Manual – 5 criteria. Data were analysed using SPSS data package. The mean age of the sample was 69.5 years. RBANS-S total scale correlated highly with MMSE total score (Pearson correlational coefficient = 0.793, p=0.01). Criterion validity was assessed using receiver operating curve characteristic analysis and the area under the curve was 0.937. RBANS-S showed strong concurrent validity as indicated by its significant correlations with the MMSE. All of the RBANS-S subtests demonstrated significant correlations with the MMSE subsets. The sensitivity and specificity for RBANS-S was 89% and 85% respectively at a total score of 80.5. The RBANS-S yielded a reliability coefficient of 0.929. Culturally adapted RBANS-S is a valid and reliable instrument which can be used in assessment of cognitive functions.

  14. Characterizing the literature on validity and assessment in medical education: a bibliometric study.

    PubMed

    Young, Meredith; St-Onge, Christina; Xiao, Jing; Vachon Lachiver, Elise; Torabi, Nazi

    2018-05-23

    Assessment in Medical Education fills many roles and is under constant scrutiny. Assessments must be of good quality, and supported by validity evidence. Given the high-stakes consequences of assessment, and the many audiences within medical education (e.g., training level, specialty-specific), we set out to document the breadth, scope, and characteristics of the literature reporting on validation of assessments within medical education. Searches in Medline (Ovid), Web of Science, ERIC, EMBASE (Ovid), and PsycINFO (Ovid) identified articles reporting on assessment of learners in medical education published since 1999. Included articles were coded for geographic origin, journal, journal category, targeted assessment, and authors. A map of collaborations between prolific authors was generated. A total of 2,863 articles were included. The majority of articles were from the United States, with Canada producing the most articles per medical school. Most articles were published in journals with medical categorizations (73.1% of articles), but Medical Education was the most represented journal (7.4% of articles). Articles reported on a variety of assessment tools and approaches, and 89 prolific authors were identified, with a total of 228 collaborative links. Literature reporting on validation of assessments in medical education is heterogeneous. Literature is produced by a broad array of authors and collaborative networks, reported to a broad audience, and is primarily generated in North American and European contexts. Our findings speak to the heterogeneity of the medical education literature on assessment validation, and suggest that this heterogeneity may stem, at least in part, from differences in constructs measured, assessment purposes, or conceptualizations of validity.

  15. Development and Validation of a Consumer Quality Assessment Instrument for Dentistry.

    ERIC Educational Resources Information Center

    Johnson, Jeffrey D.; And Others

    1990-01-01

    This paper reviews the literature on consumer involvement in dental quality assessment, argues for inclusion of this information in quality assessment measures, outlines a conceptual model for measuring dental consumer quality assessment, and presents data relating to the development and validation of an instrument based on the conceptual model.…

  16. Expert opinion as 'validation' of risk assessment applied to calf welfare.

    PubMed

    Bracke, Marc B M; Edwards, Sandra A; Engel, Bas; Buist, Willem G; Algers, Bo

    2008-07-14

    Recently, a Risk Assessment methodology was applied to animal welfare issues in a report of the European Food Safety Authority (EFSA) on intensively housed calves. Because this is a new and potentially influential approach to derive conclusions on animal welfare issues, a so-called semantic-modelling type 'validation' study was conducted by asking expert scientists, who had been involved or quoted in the report, to give welfare scores for housing systems and for welfare hazards. Kendall's coefficient of concordance among experts (n = 24) was highly significant (P < 0.001), but low (0.29 and 0.18 for housing systems and hazards respectively). Overall correlations with EFSA scores were significant only for experts with a veterinary or mixed (veterinary and applied ethological) background. Significant differences in welfare scores were found between housing systems, between hazards, and between experts with different backgrounds. For example, veterinarians gave higher overall welfare scores for housing systems than ethologists did, probably reflecting a difference in their perception of animal welfare. Systems with the lowest scores were veal calves kept individually in so-called "baby boxes" (veal crates) or in small groups, and feedlots. A suckler herd on pasture was rated as the best for calf welfare. The main hazards were related to underfeeding, inadequate colostrum intake, poor stockperson education, insufficient space, inadequate roughage, iron deficiency, inadequate ventilation, poor floor conditions and no bedding. Points for improvement of the Risk Assessment applied to animal welfare include linking information, reporting uncertainty and transparency about underlying values. The study provides novel information on expert opinion in relation to calf welfare and shows that Risk Assessment applied to animal welfare can benefit from a semantic modelling approach.

  17. [Validity of four questionnaires to assess physical activity in Spanish adolescents].

    PubMed

    Martínez-Gómez, David; Martínez-De-Haro, Vicente; Del-Campo, Juan; Zapatera, Belén; Welk, Gregory J; Villagra, Ariel; Marcos, Ascensión; Veiga, Oscar L

    2009-01-01

    The physical activity (PA) levels of Spanish adolescents must be determined to assess how the lack of PA may affect the increasing prevalence of obesity. Thus, to assess PA in this age range valid measurement instruments are essential. The aim of this study was to evaluate the validity of four easily applied questionnaires (the enKid and FITNESSGRAM questions, the Patient-Centered Assessment and Counselling [PACE] questionnaire, and an activity rating) to assess PA in Spanish adolescents by using an accelerometer as the criterion instrument. A total of 232 adolescents (113 girls) completed the questionnaires and wore an ActiGraph accelerometer for 7 consecutive days. Spearman's correlation coefficient (rho) was used to compare the questionnaires and total PA, moderate PA, vigorous PA and moderate-to-vigorous PA (MVPA) assessed by the accelerometer. All the questionnaires showed moderate correlations when compared against total PA (rho=0.36-0.43) and MVPA (rho=0.34-0.46) obtained by the accelerometer in the total sample. Higher correlations were found when comparing the questionnaires against vigorous PA (rho=0.42-0.51) than against moderate PA (rho=0.15-0.17). The FITNESSGRAM question and the PACE questionnaire obtained weak correlations in girls and the enKid question and activity rating were moderately correlated for boys and girls. The four questionnaires evaluated showed acceptable validity in the assessment of PA in the Spanish adolescent population.

  18. Protocol for Reliability Assessment of Structural Health Monitoring Systems Incorporating Model-assisted Probability of Detection (MAPOD) Approach

    DTIC Science & Technology

    2011-09-01

    a quality evaluation with limited data, a model-based assessment must be...that affect system performance, a multistage approach to system validation, a modeling and experimental methodology for efficiently addressing a wide range

  19. Validating a Geographical Image Retrieval System.

    ERIC Educational Resources Information Center

    Zhu, Bin; Chen, Hsinchun

    2000-01-01

    Summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. Describes an experiment to validate the performance of this image retrieval system against that of human subjects by examining similarity analysis…

  20. Stem cell-derived systems in toxicology assessment.

    PubMed

    Suter-Dick, Laura; Alves, Paula M; Blaauboer, Bas J; Bremm, Klaus-Dieter; Brito, Catarina; Coecke, Sandra; Flick, Burkhard; Fowler, Paul; Hescheler, Jürgen; Ingelman-Sundberg, Magnus; Jennings, Paul; Kelm, Jens M; Manou, Irene; Mistry, Pratibha; Moretto, Angelo; Roth, Adrian; Stedman, Donald; van de Water, Bob; Beilmann, Mario

    2015-06-01

    Industrial sectors perform toxicological assessments of their potential products to ensure human safety and to fulfill regulatory requirements. These assessments often involve animal testing, but ethical, cost, and time concerns, together with a ban on it in specific sectors, make appropriate in vitro systems indispensable in toxicology. In this study, we summarize the outcome of an EPAA (European Partnership of Alternatives to Animal Testing)-organized workshop on the use of stem cell-derived (SCD) systems in toxicology, with a focus on industrial applications. SCD systems, in particular, induced pluripotent stem cell-derived, provide physiological cell culture systems of easy access and amenable to a variety of assays. They also present the opportunity to apply the vast repository of existing nonclinical data for the understanding of in vitro to in vivo translation. SCD systems from several toxicologically relevant tissues exist; they generally recapitulate many aspects of physiology and respond to toxicological and pharmacological interventions. However, focused research is necessary to accelerate implementation of SCD systems in an industrial setting and subsequent use of such systems by regulatory authorities. Research is required into the phenotypic characterization of the systems, since methods and protocols for generating terminally differentiated SCD cells are still lacking. Organotypical 3D culture systems in bioreactors and microscale tissue engineering technologies should be fostered, as they promote and maintain differentiation and support coculture systems. They need further development and validation for their successful implementation in toxicity testing in industry. Analytical measures also need to be implemented to enable compound exposure and metabolism measurements for in vitro to in vivo extrapolation. The future of SCD toxicological tests will combine advanced cell culture technologies and biokinetic measurements to support regulatory and

  1. Novel Automated Morphometric and Kinematic Handwriting Assessment: A Validity Study in Children with ASD and ADHD

    ERIC Educational Resources Information Center

    Dirlikov, Benjamin; Younes, Laurent; Nebel, Mary Beth; Martinelli, Mary Katherine; Tiedemann, Alyssa Nicole; Koch, Carolyn A.; Fiorilli, Diana; Bastian, Amy J.; Denckla, Martha Bridge; Miller, Michael I.; Mostofsky, Stewart H.

    2017-01-01

    This study presents construct validity for a novel automated morphometric and kinematic handwriting assessment, including (1) convergent validity, establishing reliability of automated measures with traditional manual-derived Minnesota Handwriting Assessment (MHA), and (2) discriminant validity, establishing that the automated methods distinguish…

  2. Validation of Risk Assessment Models of Venous Thromboembolism in Hospitalized Medical Patients.

    PubMed

    Greene, M Todd; Spyropoulos, Alex C; Chopra, Vineet; Grant, Paul J; Kaatz, Scott; Bernstein, Steven J; Flanders, Scott A

    2016-09-01

    Patients hospitalized for acute medical illness are at increased risk for venous thromboembolism. Although risk assessment is recommended and several at-admission risk assessment models have been developed, these have not been adequately derived or externally validated. Therefore, an optimal approach to evaluate venous thromboembolism risk in medical patients is not known. We conducted an external validation study of existing venous thromboembolism risk assessment models using data collected on 63,548 hospitalized medical patients as part of the Michigan Hospital Medicine Safety (HMS) Consortium. For each patient, cumulative venous thromboembolism risk scores and risk categories were calculated. Cox regression models were used to quantify the association between venous thromboembolism events and assigned risk categories. Model discrimination was assessed using Harrell's C-index. Venous thromboembolism incidence in hospitalized medical patients is low (1%). Although existing risk assessment models demonstrate good calibration (hazard ratios for "at-risk" range 2.97-3.59), model discrimination is generally poor for all risk assessment models (C-index range 0.58-0.64). The performance of several existing risk assessment models for predicting venous thromboembolism among acutely ill, hospitalized medical patients at admission is limited. Given the low venous thromboembolism incidence in this nonsurgical patient population, careful consideration of how best to utilize existing venous thromboembolism risk assessment models is necessary, and further development and validation of novel venous thromboembolism risk assessment models for this patient population may be warranted. Published by Elsevier Inc.
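
    Harrell's C-index, the discrimination measure reported above, is the share of comparable patient pairs in which the patient with the earlier event also has the higher predicted risk; 0.5 corresponds to chance and 1.0 to perfect ranking. The following is a minimal sketch with invented follow-up data, not the study's implementation. It assumes right-censored data, gives half credit for tied risk scores, and skips pairs with tied follow-up times for simplicity.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter follow-up time than subject j. Pairs with tied times
    are skipped here, which is a simplifying assumption.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event had the higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk scores get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up days, event indicators (1 = event) and risk scores.
times = [5, 10, 12, 20, 30]
events = [1, 0, 1, 1, 0]
risk_scores = [3, 1, 2, 2, 0]
c_index = harrell_c_index(times, events, risk_scores)
print(round(c_index, 3))
```

    The quadratic loop is fine for a small illustration; for a cohort the size of the one described here, an established survival-analysis library would be the practical choice.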

  3. Mars Exploration Rover Mission: Entry, Descent, and Landing System Validation

    NASA Technical Reports Server (NTRS)

    Mitcheltree, Robert A.; Lee, Wayne; Steltzner, Adam; SanMartin, Alejanhdro

    2004-01-01

    System validation for a Mars entry, descent, and landing system is not simply a demonstration that the electrical system functions in the associated environments. The function of this system is its interaction with the atmospheric and surface environment. Thus, in addition to traditional test-bed, hardware-in-the-loop, testing, a validation program that confirms the environmental interaction is required. Unfortunately, it is not possible to conduct a meaningful end-to-end test of a Mars landing system on Earth. The validation plan must be constructed from an interconnected combination of simulation, analysis and test. For the Mars Exploration Rover mission, this combination of activities and the logic of how they combined to the system's validation was explicitly stated, reviewed, and tracked as part of the development plan.

  4. Discriminant content validity: a quantitative methodology for assessing content of theory-based measures, with illustrative applications.

    PubMed

    Johnston, Marie; Dixon, Diane; Hart, Jo; Glidewell, Liz; Schröder, Carin; Pollard, Beth

    2014-05-01

    In studies involving theoretical constructs, it is important that measures have good content validity and that there is not contamination of measures by content from other constructs. While reliability and construct validity are routinely reported, to date, there has not been a satisfactory, transparent, and systematic method of assessing and reporting content validity. In this paper, we describe a methodology of discriminant content validity (DCV) and illustrate its application in three studies. Discriminant content validity involves six steps: construct definition, item selection, judge identification, judgement format, single-sample test of content validity, and assessment of discriminant items. In three studies, these steps were applied to a measure of illness perceptions (IPQ-R) and control cognitions. The IPQ-R performed well with most items being purely related to their target construct, although timeline and consequences had small problems. By contrast, the study of control cognitions identified problems in measuring constructs independently. In the final study, direct estimation response formats for theory of planned behaviour constructs were found to have as good DCV as Likert format. The DCV method allowed quantitative assessment of each item and can therefore inform the content validity of the measures assessed. The methods can be applied to assess content validity before or after collecting data to select the appropriate items to measure theoretical constructs. Further, the data reported for each item in Appendix S1 can be used in item or measure selection. Statement of contribution. What is already known on this subject? There are agreed methods of assessing and reporting construct validity of measures of theoretical constructs, but not their content validity. Content validity is rarely reported in a systematic and transparent manner. What does this study add? The paper proposes discriminant content validity (DCV), a systematic and transparent method

  5. Synkinesis assessment in facial palsy: validation of the Dutch Synkinesis Assessment Questionnaire.

    PubMed

    Kleiss, Ingrid J; Beurskens, Carien H G; Stalmeier, Peep F M; Ingels, Koen J A O; Marres, Henri A M

    2016-06-01

    The objective of this study is to validate an existing health-related quality of life questionnaire for patients with synkinesis in facial palsy for implementation in the Dutch language and culture. The Synkinesis Assessment Questionnaire was translated into the Dutch language using a forward-backward translation method. A pilot test with the translated questionnaire was performed in 10 patients with facial palsy and 10 normal subjects. Finally, cross-cultural adaption was accomplished at our outpatient clinic for facial palsy. Analyses for internal consistency, test-retest reliability, and construct validity were performed. Sixty-six patients completed the Dutch Synkinesis Assessment Questionnaire and the Dutch Facial Disability Index. Cronbach's α, representing internal consistency, was 0.80. Test-retest reliability was 0.53 (Spearman's correlation coefficient, P < 0.01). Correlations with the House-Brackmann score, Sunnybrook score, Facial Disability Index physical function, and social/well-being function were -0.29, 0.20, -0.29, and -0.32, respectively. Correlation with the Sunnybrook synkinesis subscore was 0.50 (Spearman's correlation coefficient). The Dutch Synkinesis Assessment Questionnaire shows good psychometric values and can be implemented in the management of Dutch-speaking patients with facial palsy and synkinesis in the Netherlands. Translation of the instrument into other languages may lead to widespread use, making evaluation, and comparison possible among different providers.
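
    Several records in this list report internal consistency as Cronbach's α (0.80 in the record above). As a reader's aid, the following is a minimal sketch of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name and the toy data are hypothetical and are not taken from any of the cited studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Hypothetical helper for illustration; not code from any cited study.
    """
    k = len(scores[0])  # number of items in the questionnaire
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data (invented): 5 respondents answering 3 items on a 1-5 scale.
toy = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 4]]
print(round(cronbach_alpha(toy), 2))
```

    Values closer to 1 indicate that the items vary together; what counts as acceptable depends on the instrument and its intended use.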

  6. Hurricane Sandy Economic Impacts Assessment: A Computable General Equilibrium Approach and Validation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boero, Riccardo; Edwards, Brian Keith

    Economists use computable general equilibrium (CGE) models to assess how economies react and self-organize after changes in policies, technology, and other exogenous shocks. CGE models are equation-based, empirically calibrated, and inspired by Neoclassical economic theory. The focus of this work was to validate the National Infrastructure Simulation and Analysis Center (NISAC) CGE model and apply it to the problem of assessing the economic impacts of severe events. We used the 2012 Hurricane Sandy event as our validation case. In particular, this work first introduces the model and then describes the validation approach and the empirical data available for studying the event of focus. Shocks to the model are then formalized and applied. Finally, model results and limitations are presented and discussed, pointing out both the model degree of accuracy and the assessed total damage caused by Hurricane Sandy.

  7. Comparative Validity of the Shedler and Westen Assessment Procedure-200

    ERIC Educational Resources Information Center

    Mullins-Sweatt, Stephanie N.; Widiger, Thomas A.

    2008-01-01

    A predominant dimensional model of general personality structure is the five-factor model (FFM). Quite a number of alternative instruments have been developed to assess the domains of the FFM. The current study compares the validity of 2 alternative versions of the Shedler and Westen Assessment Procedure (SWAP-200) FFM scales, 1 that was developed…

  8. A tool for assessing the quality of nursing handovers: a validation study.

    PubMed

    Ferrara, Paolo; Terzoni, Stefano; Davì, Salvatore; Bisesti, Alberto; Destrebecq, Anne

    2017-08-10

    Handover, in particular between two shifts, is a crucial aspect of nursing for patient safety, aimed at ensuring continuity of care. During this process, several factors can affect quality of care and cause errors. This study aimed to assess quality of handovers, by validating the Handoff CEX-Italian scale. The scale was translated from English into Italian and the content validity index was calculated and internal consistency assessed. The scale was used in several units of the San Paolo Teaching Hospital in Milan, Italy. A total of 48 reports were assessed (192 evaluations). The median score was 6, interquartile range (IQR) [5;7] and was not influenced by specific (p=0.21) or overall working experience (p=0.13). The domains showing the lowest median values (median=6, IQR [4;8]) were context, communication, and organisation. Night to morning handovers obtained the lowest scores. CVI-S was 0.96, Cronbach's alpha was 0.79. The Handoff CEX-Italian scale is valid and reliable and it can be used to assess the quality of nurse handovers.

  9. A Preliminary Discriminant and Convergent Validity Study of the Teacher Functional Behavioral Assessment Checklist.

    ERIC Educational Resources Information Center

    Stage, Scott A.; Cheney, Douglas; Walker, Bridget; LaRocque, Michelle

    2002-01-01

    Examines discriminant and convergent validity of the Teacher Functional Behavior Assessment Checklist (TFBAC) using 89 first- through third-grade students. Results are discussed in terms of increasing the convergent validity of the TFBAC, teacher training in concepts about functional behavioral assessment and the possibility of concurrent…

  10. Validation of Computerized Automatic Calculation of the Sequential Organ Failure Assessment Score

    PubMed Central

    Harrison, Andrew M.; Pickering, Brian W.; Herasevich, Vitaly

    2013-01-01

    Purpose. To validate the use of a computer program for the automatic calculation of the sequential organ failure assessment (SOFA) score, as compared to the gold standard of manual chart review. Materials and Methods. Adult admissions (age > 18 years) to the medical ICU with a length of stay greater than 24 hours were studied in the setting of an academic tertiary referral center. A retrospective cross-sectional analysis was performed using a derivation cohort to compare automatic calculation of the SOFA score to the gold standard of manual chart review. After critical appraisal of sources of disagreement, another analysis was performed using an independent validation cohort. Then, a prospective observational analysis was performed using an implementation of this computer program in AWARE Dashboard, which is an existing real-time patient EMR system for use in the ICU. Results. Good agreement between the manual and automatic SOFA calculations was observed for both the derivation (N=94) and validation (N=268) cohorts: 0.02 ± 2.33 and 0.29 ± 1.75 points, respectively. These results were validated in AWARE (N=60). Conclusion. This EMR-based automatic tool accurately calculates SOFA scores and can facilitate ICU decisions without the need for manual data collection. This tool can also be employed in a real-time electronic environment. PMID:23936639
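
    The agreement figures above (0.02 ± 2.33 and 0.29 ± 1.75 points) read like a mean and standard deviation of paired score differences, although the abstract does not say so explicitly. Under that assumption, a minimal sketch of such a paired comparison follows. The function name, the 1.96 multiplier for approximate 95% limits of agreement, and the toy scores are illustrative choices, not details from the study.

```python
from statistics import mean, stdev


def paired_agreement(manual, automatic):
    """Return (bias, sd, limits) for paired scores from two methods.

    Assumption: agreement is summarised as the mean and SD of the
    per-patient differences (automatic minus manual).
    """
    if len(manual) != len(automatic) or len(manual) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [a - m for m, a in zip(manual, automatic)]
    bias = mean(diffs)   # systematic offset between the two methods
    sd = stdev(diffs)    # spread of the per-patient disagreement
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)  # approximate 95% limits
    return bias, sd, limits


# Toy scores (invented), not data from the cited study.
manual_scores = [4, 6, 8, 10]
automatic_scores = [5, 6, 8, 11]
bias, sd, limits = paired_agreement(manual_scores, automatic_scores)
print(f"bias={bias:.2f}, sd={sd:.2f}")
```

    A bias near zero with a small SD suggests that the two methods can be used interchangeably for most patients.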

  11. Assessing the Culture of Residency Using the C - Change Resident Survey: Validity Evidence in 34 U.S. Residency Programs.

    PubMed

    Pololi, Linda H; Evans, Arthur T; Civian, Janet T; Shea, Sandy; Brennan, Robert T

    2017-07-01

    A practical instrument is needed to reliably measure the clinical learning environment and professionalism for residents. To develop and present evidence of validity of an instrument to assess the culture of residency programs and the clinical learning environment. During 2014-2015, we surveyed residents using the C - Change Resident Survey to assess residents' perceptions of the culture in their programs. Residents in all years of training in 34 programs in internal medicine, pediatrics, and general surgery in 14 geographically diverse public and private academic health systems. The C - Change Resident Survey assessed residents' perceptions of 13 dimensions of the culture: Vitality, Self-Efficacy, Institutional Support, Relationships/Inclusion, Values Alignment, Ethical/Moral Distress, Respect, Mentoring, Work-Life Integration, Gender Equity, Racial/Ethnic Minority Equity, and self-assessed Competencies. We measured the internal reliability of each of the 13 dimensions and evaluated response process, content validity, and construct-related evidence validity by assessing relationships predicted by our conceptual model and prior research. We also assessed whether the measurements were sensitive to differences in specialty and across institutions. A total of 1708 residents completed the survey [internal medicine: n = 956, pediatrics: n = 411, general surgery: n = 311 (51% women; 16% underrepresented in medicine minority)], with a response rate of 70% (range across programs, 51-87%). Internal consistency of each dimension was high (Cronbach α: 0.73-0.90). The instrument was able to detect significant differences in the learning environment across programs and sites. Evidence of validity was supported by a good response process and the demonstration of several relationships predicted by our conceptual model. The C - Change Resident Survey assesses the clinical learning environment for residents, and we encourage further study of validity in different

  12. Sensor data validation and reconstruction. Phase 1: System architecture study

    NASA Technical Reports Server (NTRS)

    1991-01-01

    The sensor validation and data reconstruction task reviewed relevant literature and selected applicable validation and reconstruction techniques for further study; analyzed the selected techniques and emphasized those which could be used for both validation and reconstruction; analyzed Space Shuttle Main Engine (SSME) hot fire test data to determine statistical and physical relationships between various parameters; developed statistical and empirical correlations between parameters to perform validation and reconstruction tasks, using a computer aided engineering (CAE) package; and conceptually designed an expert system based knowledge fusion tool, which allows the user to relate diverse types of information when validating sensor data. The host hardware for the system is intended to be a Sun SPARCstation, but could be any RISC workstation with a UNIX operating system and a windowing/graphics system such as Motif or Dataviews. The information fusion tool is intended to be developed using the NEXPERT Object expert system shell, and the C programming language.

  13. RELIABILITY AND VALIDITY OF SUBJECTIVE ASSESSMENT OF LUMBAR LORDOSIS IN CONVENTIONAL RADIOGRAPHY.

    PubMed

    Ruhinda, E; Byanyima, R K; Mugerwa, H

    2014-10-01

    Reliability and validity studies of different lumbar curvature analysis and measurement techniques have been documented; however, there is limited literature on the reliability and validity of subjective visual analysis. Radiological assessment of the lumbar lordotic curve aids in early diagnosis of conditions even before neurologic changes set in. The objective was to ascertain the level of reliability and validity of subjective assessment of lumbar lordosis in conventional radiography. A blinded, repeated-measures diagnostic test was carried out on lumbar spine x-ray radiographs at the Radiology Department of the Joint Clinical Research Centre (JCRC), Mengo-Kampala, Uganda. Seventy (70) lateral lumbar x-ray films were used for this study and were obtained from the archive of the JCRC radiology department at Butikiro house, Mengo-Kampala. Poor observer agreement, both inter- and intra-observer, with kappa values of 0.16 was found. Inter-observer agreement was poorer than intra-observer agreement. Kappa values significantly rose when the lumbar lordosis was clustered into four categories without grading each abnormality. The results confirm that subjective assessment of lumbar lordosis has low reliability and validity. Film quality has limited influence on the observer reliability. This study further shows that fewer scale categories of lordosis abnormalities produce better observer reliability.
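
    The abstract above reports observer agreement as kappa values but does not state which variant was used. As a reader's aid, here is a minimal sketch of unweighted Cohen's kappa for two raters, κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the toy ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical ratings.

    Hypothetical helper for illustration; the cited study may have
    used a different kappa variant.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and equally long")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Toy ratings (invented): two observers grading six films.
obs_1 = ["normal", "hyper", "hypo", "normal", "hyper", "normal"]
obs_2 = ["normal", "hypo", "hypo", "hyper", "hyper", "normal"]
print(round(cohen_kappa(obs_1, obs_2), 2))
```

    Merging rating categories changes both p_o and p_e, which is one reason kappa can rise when fewer categories are used, as the abstract describes.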

  14. Assessment of construct validity of a virtual reality laparoscopy simulator.

    PubMed

    Rosenthal, Rachel; Gantert, Walter A; Hamel, Christian; Hahnloser, Dieter; Metzger, Juerg; Kocher, Thomas; Vogelbach, Peter; Scheidegger, Daniel; Oertli, Daniel; Clavien, Pierre-Alain

    2007-08-01

    The aim of this study was to assess whether virtual reality (VR) can discriminate between the skills of novices and intermediate-level laparoscopic surgical trainees (construct validity), and whether the simulator assessment correlates with an expert's evaluation of performance. Three hundred and seven (307) participants of the 19th-22nd Davos International Gastrointestinal Surgery Workshops performed the clip-and-cut task on the Xitact LS 500 VR simulator (Xitact S.A., Morges, Switzerland). According to their previous experience in laparoscopic surgery, participants were assigned to the basic course (BC) or the intermediate course (IC). Objective performance parameters recorded by the simulator were compared to the standardized assessment by the course instructors during laparoscopic pelvitrainer and conventional surgery exercises. IC participants performed significantly better on the VR simulator than BC participants for the task completion time as well as the economy of movement of the right instrument, not the left instrument. Participants with maximum scores in the pelvitrainer cholecystectomy task performed the VR trial significantly faster, compared to those who scored less. In the conventional surgery task, a significant difference between those who scored the maximum and those who scored less was found not only for task completion time, but also for economy of movement of the right instrument. VR simulation provides a valid assessment of psychomotor skills and some basic aspects of spatial skills in laparoscopic surgery. Furthermore, VR allows discrimination between trainees with different levels of experience in laparoscopic surgery establishing construct validity for the Xitact LS 500 clip-and-cut task. Virtual reality may become the gold standard to assess and monitor surgical skills in laparoscopic surgery.

  15. Validation study of a web-based assessment of functional recovery after radical prostatectomy.

    PubMed

    Vickers, Andrew J; Savage, Caroline J; Shouery, Marwan; Eastham, James A; Scardino, Peter T; Basch, Ethan M

    2010-08-05

    Good clinical care of prostate cancer patients after radical prostatectomy depends on careful assessment of post-operative morbidities, yet physicians do not always judge patient symptoms accurately. Logistical problems associated with using paper questionnaires limit their use in the clinic. We have implemented a web-interface ("STAR") for patient-reported outcomes after radical prostatectomy. We analyzed data on the first 9 months of clinical implementation to evaluate the validity of the STAR questionnaire to assess functional outcomes following radical prostatectomy. We assessed response rate, internal consistency within domains, and the association between survey responses and known predictors of sexual and urinary function, including age, time from surgery, nerve sparing status and co-morbidities. Of 1581 men sent an invitation to complete the instrument online, 1235 responded for a response rate of 78%. Cronbach's alpha was 0.84, 0.86 and 0.97 for bowel, urinary and sexual function respectively. All known predictors of sexual and urinary function were significantly associated with survey responses in the hypothesized direction. We have found that web-based assessment of functional recovery after radical prostatectomy is practical and feasible. The instrument demonstrated excellent psychometric properties, suggesting that validity is maintained when questions are transferred from paper to electronic format and when patients give responses that they know will be seen by their doctor and added to their clinic record. As such, our system allows ready implementation of patient-reported outcomes into routine clinical practice.

  16. Validation study of a web-based assessment of functional recovery after radical prostatectomy

    PubMed Central

    2010-01-01

    Background: Good clinical care of prostate cancer patients after radical prostatectomy depends on careful assessment of post-operative morbidities, yet physicians do not always judge patient symptoms accurately. Logistical problems associated with using paper questionnaires limit their use in the clinic. We have implemented a web-interface ("STAR") for patient-reported outcomes after radical prostatectomy. Methods: We analyzed data on the first 9 months of clinical implementation to evaluate the validity of the STAR questionnaire to assess functional outcomes following radical prostatectomy. We assessed response rate, internal consistency within domains, and the association between survey responses and known predictors of sexual and urinary function, including age, time from surgery, nerve sparing status and co-morbidities. Results: Of 1581 men sent an invitation to complete the instrument online, 1235 responded for a response rate of 78%. Cronbach's alpha was 0.84, 0.86 and 0.97 for bowel, urinary and sexual function respectively. All known predictors of sexual and urinary function were significantly associated with survey responses in the hypothesized direction. Conclusions: We have found that web-based assessment of functional recovery after radical prostatectomy is practical and feasible. The instrument demonstrated excellent psychometric properties, suggesting that validity is maintained when questions are transferred from paper to electronic format and when patients give responses that they know will be seen by their doctor and added to their clinic record. As such, our system allows ready implementation of patient-reported outcomes into routine clinical practice. PMID:20687938

  17. Development and content validation of performance assessments for endoscopic third ventriculostomy.

    PubMed

    Breimer, Gerben E; Haji, Faizal A; Hoving, Eelco W; Drake, James M

    2015-08-01

    This study aims to develop and establish the content validity of multiple expert rating instruments to assess performance in endoscopic third ventriculostomy (ETV), collectively called the Neuro-Endoscopic Ventriculostomy Assessment Tool (NEVAT). The important aspects of ETV were identified through a review of current literature, ETV videos, and discussion with neurosurgeons, fellows, and residents. Three assessment measures were subsequently developed: a procedure-specific checklist (CL), a CL of surgical errors, and a global rating scale (GRS). Neurosurgeons from various countries, all identified as experts in ETV, were then invited to participate in a modified Delphi survey to establish the content validity of these instruments. In each Delphi round, experts rated their agreement with including each procedural step, error, and GRS item in the respective instruments on a 5-point Likert scale. Seventeen experts agreed to participate in the study and completed all Delphi rounds. After item generation, a total of 27 procedural CL items, 26 error CL items, and 9 GRS items were posed to Delphi panelists for rating. An additional 17 procedural CL items, 12 error CL items, and 1 GRS item were added by panelists. After three rounds, strong consensus (>80% agreement) was achieved on 35 procedural CL items, 29 error CL items, and 10 GRS items. Moderate consensus (50-80% agreement) was achieved on an additional 7 procedural CL items and 1 error CL item. The final procedural and error checklists contained 42 and 30 items, respectively (divided into setup, exposure, navigation, ventriculostomy, and closure). The final GRS contained 10 items. We have established the content validity of three ETV assessment measures by iterative consensus of an international expert panel. Each measure provides unique assessment information and thus can be used individually or in combination, depending on the characteristics of the learner and the purpose of the assessment. These instruments must now

  18. Improving the Validity and Reliability of Large Scale Writing Assessment.

    ERIC Educational Resources Information Center

    Fenton, Ray; Straugh, Tom; Stofflet, Fred; Garrison, Steve

    This paper examines the efforts of the Anchorage School District, Alaska, to improve the validity of its writing assessment as a useful tool for the training of teachers and the characterization of the quality of student writing. The paper examines how a number of changes in the process and scoring of the Anchorage Writing Assessment affected the…

  19. Assessing the Validity of an Annual Survey for Measuring the Enacted Literacy Curriculum

    ERIC Educational Resources Information Center

    Camburn, Eric M.; Han, Seong Won; Sebastian, James

    2017-01-01

    Surveys are frequently used to inform consequential decisions about teachers, policies, and programs. Consequently, it is important to understand the validity of these instruments. This study assesses the validity of measures of instruction captured by an annual survey by comparing survey data with those of a validated daily log. The two…

  20. Movie for the Assessment of Social Cognition (MASC): Spanish validation.

    PubMed

    Lahera, G; Boada, L; Pousa, E; Mirapeix, I; Morón-Nozaleda, G; Marinas, L; Gisbert, L; Pamiàs, M; Parellada, M

    2014-08-01

    We present the Spanish validation of the "Movie for the Assessment of Social Cognition" instrument (MASC-SP). We recruited 22 adolescents and young adults with Asperger syndrome and 26 participants with typical development. The MASC-SP and three other social cognition instruments (Ekman Pictures of Facial Affect test, Reading the Mind in the Eyes Test, and Happé's Strange Stories) were administered to both groups. Individuals with Asperger syndrome had significantly lower scores in all measures of social cognition. The MASC-SP showed strong correlations with all three measures and relative independence of general cognitive functions. Internal consistency was optimal (0.86) and the test-retest was good. The MASC-SP is an ecologically valid and useful tool for assessing social cognition in the Spanish population.

  1. Development and validation of a Malawian version of the primary care assessment tool.

    PubMed

    Dullie, Luckson; Meland, Eivind; Hetlevik, Øystein; Mildestvedt, Thomas; Gjesdal, Sturla

    2018-05-16

    Malawi does not have validated tools for assessing primary care performance from patients' experience. The aim of this study was to develop a Malawian version of Primary Care Assessment Tool (PCAT-Mw) and to evaluate its reliability and validity in the assessment of the core primary care dimensions from adult patients' perspective in Malawi. A team of experts assessed the South African version of the primary care assessment tool (ZA-PCAT) for face and content validity. The adapted questionnaire underwent forward and backward translation and a pilot study. The tool was then used in an interviewer administered cross-sectional survey in Neno district, Malawi, to test validity and reliability. Exploratory factor analysis was performed on a random half of the sample to evaluate internal consistency, reliability and construct validity of items and scales. The identified constructs were then tested with confirmatory factor analysis. Likert scale assumption testing and descriptive statistics were done on the final factor structure. The PCAT-Mw was further tested for intra-rater and inter-rater reliability. From the responses of 631 patients, a 29-item PCAT-Mw was constructed comprising seven multi-item scales, representing five primary care dimensions (first contact, continuity, comprehensiveness, coordination and community orientation). All the seven scales achieved good internal consistency, item-total correlations and construct validity. Cronbach's alpha coefficient ranged from 0.66 to 0.91. A satisfactory goodness of fit model was achieved (GFI = 0.90, CFI = 0.91, RMSEA = 0.05, PCLOSE = 0.65). The full range of possible scores was observed for all scales. Scaling assumptions tests were achieved for all except the two comprehensiveness scales. Intra-class correlation coefficient (ICC) was 0.90 (n = 44, 95% CI 0.81-0.94, p < 0.001) for intra-rater reliability and 0.84 (n = 42, 95% CI 0.71-0.96, p < 0.001) for inter-rater reliability

  2. Independent surgical validation of the new prostate cancer grade-grouping system.

    PubMed

    Spratt, Daniel E; Cole, Adam I; Palapattu, Ganesh S; Weizer, Alon Z; Jackson, William C; Montgomery, Jeffrey S; Dess, Robert T; Zhao, Shuang G; Lee, Jae Y; Wu, Angela; Kunju, Lakshmi P; Talmich, Emily; Miller, David C; Hollenbeck, Brent K; Tomlins, Scott A; Feng, Felix Y; Mehra, Rohit; Morgan, Todd M

    2016-11-01

    To report the independent prognostic impact of the new prostate cancer grade-grouping system in a large external validation cohort of patients treated with radical prostatectomy (RP). Between 1994 and 2013, 3 694 consecutive men were treated with RP at a single institution. To investigate the performance of and validate the grade-grouping system, biochemical recurrence-free survival (bRFS) rates were assessed using Kaplan-Meier tests, Cox-regression modelling, and discriminatory comparison analyses. Separate analyses were performed based on biopsy and RP grade. The median follow-up was 52.7 months. The 5-year actuarial bRFS for biopsy grade groups 1-5 were 94.2%, 89.2%, 73.1%, 63.1%, and 54.7%, respectively (P < 0.001). Similarly, the 5-year actuarial bRFS based on RP grade groups was 96.1%, 93.0%, 74.0%, 64.4%, and 49.9% for grade groups 1-5, respectively (P < 0.001). The adjusted hazard ratios for bRFS relative to biopsy grade group 1 were 1.98, 4.20, 5.57, and 9.32 for groups 2, 3, 4, and 5, respectively (P < 0.001), and for RP grade groups were 2.09, 5.27, 5.86, and 10.42 (P < 0.001). The five-grade-group system had a higher prognostic discrimination compared with the commonly used three-tier system (Gleason score 6 vs 7 vs 8-10). In an independent surgical cohort, we have validated the prognostic benefit of the new prostate cancer grade-grouping system for bRFS, and shown that the benefit is maintained after adjusting for important clinicopathological variables. The greater predictive accuracy of the new system will improve risk stratification in the clinical setting and aid in patient counselling. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.

  3. Validation of a physically based catchment model for application in post-closure radiological safety assessments of deep geological repositories for solid radioactive wastes.

    PubMed

    Thorne, M C; Degnan, P; Ewen, J; Parkin, G

    2000-12-01

    The physically based river catchment modelling system SHETRAN incorporates components representing water flow, sediment transport and radionuclide transport both in solution and bound to sediments. The system has been applied to simulate hypothetical future catchments in the context of post-closure radiological safety assessments of a potential site for a deep geological disposal facility for intermediate and certain low-level radioactive wastes at Sellafield, west Cumbria. In order to have confidence in the application of SHETRAN for this purpose, various blind validation studies have been undertaken. In earlier studies, the validation was undertaken against uncertainty bounds in model output predictions set by the modelling team on the basis of how well they expected the model to perform. However, validation can also be carried out with bounds set on the basis of how well the model is required to perform in order to constitute a useful assessment tool. Herein, such an assessment-based validation exercise is reported. This exercise related to a field plot experiment conducted at Calder Hollow, west Cumbria, in which the migration of strontium and lanthanum in subsurface Quaternary deposits was studied on a length scale of a few metres. Blind predictions of tracer migration were compared with experimental results using bounds set by a small group of assessment experts independent of the modelling team. Overall, the SHETRAN system performed well, failing only two out of seven of the imposed tests. Furthermore, of the five tests that were not failed, three were positively passed even when a pessimistic view was taken as to how measurement errors should be taken into account. It is concluded that the SHETRAN system, which is still being developed further, is a powerful tool for application in post-closure radiological safety assessments.

  4. Development of the Hand Assessment for Infants: evidence of internal scale validity.

    PubMed

    Krumlinde-Sundholm, Lena; Ek, Linda; Sicola, Elisa; Sjöstrand, Lena; Guzzetta, Andrea; Sgandurra, Giuseppina; Cioni, Giovanni; Eliasson, Ann-Christin

    2017-12-01

    The aim of this study was to develop a descriptive and evaluative assessment of upper limb function for infants aged 3 to 12 months and to investigate its internal scale validity for use with infants at risk of unilateral cerebral palsy. The concepts of the test items and scoring criteria were developed. Internal scale validity and aspects of reliability were investigated on the basis of 156 assessments of infants at 3 to 12 months corrected age (mean 7.2 months, SD 2.5) with signs of asymmetric hand use. Rasch measurement model analysis and non-parametric statistics were used. The new test, the Hand Assessment for Infants (HAI), consists of 12 unimanual and five bimanual items, each scored on a 3-point rating scale. It demonstrated a unidimensional construct and good fit to the Rasch model requirements. The excellent person reliability enabled person separation to six significant ability strata. The HAI produced an interval-level measure of bilateral hand use as well as unimanual scores of each hand, allowing a quantification of possible asymmetry expressed as an asymmetry index. The HAI can be considered a valid assessment tool for measuring bilateral hand use and quantifying side difference between hands among infants at risk of developing unilateral cerebral palsy. The Hand Assessment for Infants (HAI) measures the use of both hands and quantifies a possible asymmetry of hand use. HAI is valid for infants at 3 to 12 months corrected age at risk of unilateral cerebral palsy. © 2017 Mac Keith Press.

  5. Validation of a measurement tool to assess awareness of breast cancer.

    PubMed

    Linsell, Louise; Forbes, Lindsay J L; Burgess, Caroline; Kapari, Marcia; Thurnham, Angela; Ramirez, Amanda J

    2010-05-01

    Until now, there has been no universally accepted and validated measure of breast cancer awareness. This study aimed to validate the new Breast Cancer Awareness Measure (BCAM) which assesses, using a self-complete questionnaire, knowledge of breast cancer symptoms and age-related risk, and frequency of breast checking. We measured the psychometric properties of the BCAM in 1035 women attending the NHS Breast Screening Programme: acceptability was assessed using a feedback questionnaire (n=292); sensitivity to change after an intervention promoting breast cancer awareness (n=576), and test-retest reliability (n=167). We also assessed readability, and construct validity using the 'known-groups' method. The readability of the BCAM was high. Over 90% of women found it acceptable. The BCAM was sensitive to change: there was an increase in the proportion of women obtaining the full score for breast cancer awareness one month after receiving the intervention promoting breast cancer awareness; this was greater among those who received a more intensive version (less intensive version (booklet): 9.3%, 95% confidence interval (CI): 4.5-14.1%; more intensive version (interaction with health professional plus booklet): 30%, 95% CI: 23.4-36.6%). Test-retest reliability of the BCAM was moderate to good for most items. Cancer experts had higher levels of cancer awareness than non-medical academics (50% versus 6%, p=0.001), indicating good construct validity. The BCAM is a valid and robust measure of breast cancer awareness suitable for use in surveys of breast cancer awareness in the general population and to evaluate the impact of awareness-raising interventions. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  6. The Research Diagnostic Criteria for Temporomandibular Disorders. I: overview and methodology for assessment of validity.

    PubMed

    Schiffman, Eric L; Truelove, Edmond L; Ohrbach, Richard; Anderson, Gary C; John, Mike T; List, Thomas; Look, John O

    2010-01-01

    The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. The aim of this article is to provide an overview of the project's methodology, descriptive statistics, and data for the study participant sample. This article also details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. The Axis I reference standards were based on the consensus of two criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion examination reliability was also assessed within study sites. Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion examiner agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods.

  7. Development and validation of self-reported line drawings of the modified Beighton score for the assessment of generalised joint hypermobility.

    PubMed

    Cooper, Dale J; Scammell, Brigitte E; Batt, Mark E; Palmer, Debbie

    2018-01-17

    The impracticalities and comparative expense of carrying out a clinical assessment are an obstacle in many large epidemiological studies. The purpose of this study was to develop and validate a series of electronic self-reported line drawing instruments based on the modified Beighton scoring system for the assessment of self-reported generalised joint hypermobility. Five sets of line drawings were created to depict the 9-point Beighton score criteria. Each instrument consisted of an explanatory question whereby participants were asked to select the line drawing which best represented their joints. Fifty participants completed the self-report online instrument on two occasions, before attending a clinical assessment. A blinded expert clinical observer then assessed participants on two occasions, using a standardised goniometry measurement protocol. Validity of the instrument was assessed by participant-observer agreement and reliability by participant repeatability and observer repeatability using unweighted Cohen's kappa (k). Validity and reliability were assessed for each item in the self-reported instrument separately, and for the sum of the total scores. An aggregate score for generalised joint hypermobility was determined based on a Beighton score of 4 or more out of 9. Observer-repeatability between the two clinical assessments demonstrated perfect agreement (k 1.00; 95% CI 1.00, 1.00). Self-reported participant-repeatability was lower but it was still excellent (k 0.91; 95% CI 0.74, 1.00). The participant-observer agreement was excellent (k 0.96; 95% CI 0.87, 1.00). Validity was excellent for the self-report instrument, with a good sensitivity of 0.87 (95% CI 0.81, 0.91) and excellent specificity of 0.99 (95% CI 0.98, 1.00). The self-reported instrument provides a valid and reliable assessment of the presence of generalised joint hypermobility and may have practical use in epidemiological studies.
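
    The unweighted Cohen's kappa used above for participant-observer agreement can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `cohen_kappa` and the ten ratings are hypothetical, and the binary coding (1 = Beighton score of 4 or more) is an assumption made for the example.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical data: 1 = hypermobile (Beighton >= 4), 0 = not hypermobile.
self_report = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]
clinician = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(self_report, clinician)
print(f"kappa = {kappa:.2f}")
```

    Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple proportion of matching ratings in the example.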

  8. The Development and Validation of the Religious/Spiritually Integrated Practice Assessment Scale

    ERIC Educational Resources Information Center

    Oxhandler, Holly K.; Parrish, Danielle E.

    2016-01-01

    Objective: This article describes the development and validation of the Religious/Spiritually Integrated Practice Assessment Scale (RSIPAS). The RSIPAS is designed to assess social work practitioners' self-efficacy, attitudes, behaviors, and perceived feasibility concerning the assessment or integration of clients' religious and spiritual beliefs…

  9. Face and Content Validity of the MacArthur Competence Assessment Tool for the Treatment of Iranian Patients.

    PubMed

    Saber, Ali; Tabatabaei, Seyed Mahmoud; Akasheh, Godarz; Sehat, Mojtaba; Zanjani, Zahra; Larijani, Bagher

    2017-01-01

    There is no valid Persian tool for measuring the decision-making competency of patients. The aim of this study is to evaluate the face and content validity of the MacArthur Competence Assessment Tool for the treatment of Iranian Persian-speaking patients. To assess the validity of the Persian version of the tool, a self-administered questionnaire was designed. The Lawshe method was also used for assessing each item. Content validity ratio (CVR) and content validity index (CVI) were used to assess the content validity quantitatively. According to the experts' judgment, questions with a CVR ≥0.62 and CVR <0.62 were maintainable and unmaintainable, respectively. The questions were designed in a manner to achieve the desirable result (CVR ≥0.62). The CVI scale (S-CVI) and CVI (S-CVI/Ave) were 0.94 (higher than 0.79). Thus, the content validity was confirmed. Since capacity assessments are usually based on the physician's subjective judgment, they are prone to bias; with this suitably validated tool, the judgment of physicians and health-care providers can therefore be improved in out- and in-patient cases.
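
    Lawshe's content validity ratio and the averaged content validity index mentioned above reduce to simple proportions. The sketch below is illustrative only: the function names, the 4-point relevance scale, and the panel of ten experts are assumptions, while the 0.62 cut-off is the one quoted in the abstract.

```python
CVR_CUTOFF = 0.62  # retention threshold quoted in the abstract

def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR = (n_e - N/2) / (N/2); ranges from -1 to 1."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("need 0 <= n_essential <= n_experts and n_experts > 0")
    half = n_experts / 2
    return (n_essential - half) / half

def item_cvi(relevance_ratings, relevant_from=3):
    """I-CVI: share of experts rating the item as relevant.

    Assumes a 4-point relevance scale where 3 or 4 counts as relevant.
    """
    return sum(r >= relevant_from for r in relevance_ratings) / len(relevance_ratings)

def scale_cvi_ave(ratings_per_item):
    """S-CVI/Ave: mean of the item-level CVIs."""
    values = [item_cvi(ratings) for ratings in ratings_per_item]
    return sum(values) / len(values)

# Hypothetical panel of 10 experts: 9 judge the item essential.
cvr = content_validity_ratio(9, 10)
keep_item = cvr >= CVR_CUTOFF
print(f"CVR = {cvr:.2f}, keep item: {keep_item}")
```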

  10. Development and validation of a reading-related assessment battery in Malay for the purpose of dyslexia assessment.

    PubMed

    Lee, Lay Wah

    2008-06-01

    Malay is an alphabetic language with transparent orthography. A Malay reading-related assessment battery which was conceptualised based on the International Dyslexia Association definition of dyslexia was developed and validated for the purpose of dyslexia assessment. The battery consisted of ten tests: Letter Naming, Word Reading, Non-word Reading, Spelling, Passage Reading, Reading Comprehension, Listening Comprehension, Elision, Rapid Letter Naming and Digit Span. Content validity was established by expert judgment. Concurrent validity was obtained using the schools' language tests as criterion. Evidence of predictive and construct validity was obtained through regression analyses and factor analyses. Phonological awareness was the most significant predictor of word-level literacy skills in Malay, with rapid naming making independent secondary contributions. Decoding and listening comprehension made separate contributions to reading comprehension, with decoding as the more prominent predictor. Factor analysis revealed four factors: phonological decoding, phonological naming, comprehension and verbal short-term memory. In conclusion, despite differences in orthography, there are striking similarities in the theoretical constructs of reading-related tasks in Malay and in English.

  11. Validity Arguments for Diagnostic Assessment Using Automated Writing Evaluation

    ERIC Educational Resources Information Center

    Chapelle, Carol A.; Cotos, Elena; Lee, Jooyoung

    2015-01-01

    Two examples demonstrate an argument-based approach to validation of diagnostic assessment using automated writing evaluation (AWE). "Criterion"® was developed by Educational Testing Service to analyze students' papers grammatically, providing sentence-level error feedback. An interpretive argument was developed for its use as part of…

  12. Validity and reliability of wii fit balance board for the assessment of balance of healthy young adults and the elderly.

    PubMed

    Chang, Wen-Dien; Chang, Wan-Yi; Lee, Chia-Lun; Feng, Chi-Yen

    2013-10-01

    [Purpose] Balance is an integral part of human ability. The smart balance master system (SBM) is a balance test instrument with good reliability and validity, but it is expensive. Therefore, we modified a Wii Fit balance board, which is a convenient balance assessment tool, and analyzed its reliability and validity. [Subjects and Methods] We recruited 20 healthy young adults and 20 elderly people, and administered 3 balance tests. The correlation coefficient and intraclass correlation of both instruments were analyzed. [Results] There were no statistically significant differences in the 3 tests between the Wii Fit balance board and the SBM. The Wii Fit balance board had a good intraclass correlation (0.86-0.99) for the elderly people and positive correlations (r = 0.58-0.86) with the SBM. [Conclusions] The Wii Fit balance board is a balance assessment tool with good reliability and high validity for elderly people, and we recommend it as an alternative tool for assessing balance ability.

  13. Validity and Reliability of Wii Fit Balance Board for the Assessment of Balance of Healthy Young Adults and the Elderly

    PubMed Central

    Chang, Wen-Dien; Chang, Wan-Yi; Lee, Chia-Lun; Feng, Chi-Yen

    2013-01-01

    [Purpose] Balance is an integral part of human ability. The smart balance master system (SBM) is a balance test instrument with good reliability and validity, but it is expensive. Therefore, we modified a Wii Fit balance board, which is a convenient balance assessment tool, and analyzed its reliability and validity. [Subjects and Methods] We recruited 20 healthy young adults and 20 elderly people, and administered 3 balance tests. The correlation coefficient and intraclass correlation of both instruments were analyzed. [Results] There were no statistically significant differences in the 3 tests between the Wii Fit balance board and the SBM. The Wii Fit balance board had a good intraclass correlation (0.86–0.99) for the elderly people and positive correlations (r = 0.58–0.86) with the SBM. [Conclusions] The Wii Fit balance board is a balance assessment tool with good reliability and high validity for elderly people, and we recommend it as an alternative tool for assessing balance ability. PMID:24259769

  14. Engineering Software Suite Validates System Design

    NASA Technical Reports Server (NTRS)

    2007-01-01

    EDAptive Computing Inc.'s (ECI) EDAstar engineering software tool suite, created to capture and validate system design requirements, was significantly funded by NASA's Ames Research Center through five Small Business Innovation Research (SBIR) contracts. These programs specifically developed Syscape, used to capture executable specifications of multi-disciplinary systems, and VectorGen, used to automatically generate tests to ensure system implementations meet specifications. According to the company, the VectorGen tests considerably reduce the time and effort required to validate implementation of components, thereby ensuring their safe and reliable operation. EDASHIELD, an additional product offering from ECI, can be used to diagnose, predict, and correct errors after a system has been deployed using EDAstar-created models. Initial commercialization for EDAstar included application by a large prime contractor in a military setting, and customers include various branches within the U.S. Department of Defense, industry giants like the Lockheed Martin Corporation, Science Applications International Corporation, and Ball Aerospace and Technologies Corporation, as well as NASA's Langley and Glenn Research Centers.

  15. Protocol and Demonstrations of Probabilistic Reliability Assessment for Structural Health Monitoring Systems (Preprint)

    DTIC Science & Technology

    2011-11-01

    assessment to quality of localization/characterization estimates. This protocol includes four critical components: (1) a procedure to identify the...critical factors impacting SHM system performance; (2) a multistage or hierarchical approach to SHM system validation; (3) a model-assisted evaluation...Lindgren, E. A., Buynak, C. F., Steffes, G., Derriso, M., “Model-assisted Probabilistic Reliability Assessment for Structural Health Monitoring

  16. Validating the Assessment for Measuring Indonesian Secondary School Students Performance in Ecology

    NASA Astrophysics Data System (ADS)

    Rachmatullah, A.; Roshayanti, F.; Ha, M.

    2017-09-01

    The aims of the current study were to validate the American Association for the Advancement of Science (AAAS) Ecology assessment and to examine the performance of Indonesian secondary school students on the assessment. A total of 611 Indonesian secondary school students (218 middle school students and 393 high school students) participated in the study. Forty-five items of the AAAS assessment on the topic of Interdependence in Ecosystems were divided into two versions, each of which has 21 similar items. The linking-item method was used to combine the two versions of the assessment, and Rasch analyses were then used to validate the instrument. An independent-samples t-test was also run to compare the performance of Indonesian and American students based on mean item difficulty. Of the 45 items, three were identified as misfitting. Both Indonesian middle and high school students performed significantly lower than American students, with very large and medium effect sizes. We discuss our findings with regard to validation issues and the connection to Indonesian students' science literacy.

  17. Expert system validation in prolog

    NASA Technical Reports Server (NTRS)

    Stock, Todd; Stachowitz, Rolf; Chang, Chin-Liang; Combs, Jacqueline

    1988-01-01

    An overview is given of the Expert System Validation Assistant (EVA), which is being implemented in Prolog at the Lockheed AI Center. Prolog was chosen to facilitate rapid prototyping of the structure and logic checkers, and since February 1987 we have implemented code to check for irrelevance, subsumption, duplication, deadends, unreachability, and cycles. The architecture chosen is extremely flexible and expansible, yet concise and complementary with the normal interactive style of Prolog. The foundation of the system is in the connection graph representation. Rules and facts are modeled as nodes in the graph and arcs indicate common patterns between rules. The basic activity of the validation system is then a traversal of the connection graph, searching for various patterns the system recognizes as erroneous. To aid in specifying these patterns, a metalanguage is developed, providing the user with the basic facilities required to reason about the expert system. Using the metalanguage, the user can, for example, give the Prolog inference engine the goal of finding inconsistent conclusions among the rules, and Prolog will search the graph for instantiations that match the definition of inconsistency. Examples of code for some of the checkers are provided and the algorithms explained. Technical highlights include automatic construction of a connection graph, demonstration of the use of metalanguage, the A* algorithm modified to detect all unique cycles, general-purpose stacks in Prolog, and a general-purpose database browser with pattern completion.
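
    The idea of traversing a connection graph of rules to look for cycles can be illustrated with a short sketch. This is not EVA's algorithm: the abstract describes a modified A* search in Prolog that detects all unique cycles, whereas the code below swaps in a plain depth-first search in Python that returns only the first cycle it finds. The rule names and the `find_cycle` helper are hypothetical.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (start node repeated at the end), or None.

    `graph` maps each rule to the rules it can trigger (its outgoing arcs).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}
    for targets in graph.values():
        for target in targets:
            colour.setdefault(target, WHITE)
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            if colour[nxt] == GREY:
                # An arc back to a node on the current path closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(colour):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# Hypothetical rule base: r1 triggers r2, r2 triggers r3, r3 triggers r1 again.
rules = {"r1": ["r2"], "r2": ["r3"], "r3": ["r1"], "r4": ["r1"]}
cycle = find_cycle(rules)
print(cycle)
```

    The recursive search is adequate for small rule bases; a very deep graph would need an explicit stack to stay within Python's recursion limit.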

  18. Validity and Reliability of 10-Hz Global Positioning System to Assess In-line Movement and Change of Direction.

    PubMed

    Nikolaidis, Pantelis T; Clemente, Filipe M; van der Linden, Cornelis M I; Rosemann, Thomas; Knechtle, Beat

    2018-01-01

    The objectives of the present study were to examine the validity and reliability of the 10 Hz Johan GPS unit in assessing in-line movement and change of direction. The validity was tested against the criterion measure of 200 m track-and-field (track-and-field athletes, n = 8) and 20 m shuttle run endurance test (female soccer players, n = 20). Intra-unit and inter-unit reliability was tested by intra-class correlation coefficient (ICC) and coefficient of variation (CV), respectively. An analysis of variance examined differences between the GPS measurement and five laps of 200 m at 15 km/h, and a t-test examined differences between the GPS measurement and 20 m shuttle run endurance test. The difference between the GPS measurement and 200 m distance ranged from -0.13 ± 3.94 m (95% CI -3.42; 3.17) in the first lap to 2.13 ± 2.64 m (95% CI -0.08; 4.33) in the fifth lap. A good intra-unit reliability was observed in 200 m (ICC = 0.833, 95% CI 0.535; 0.962). Inter-unit CV ranged from 1.31% (fifth lap) to 2.20% (third lap). The difference between the GPS measurement and 20 m shuttle run endurance test ranged from 0.33 ± 4.16 m (95% CI -10.01; 10.68) in 11.5 km/h to 9.00 ± 5.30 m (95% CI 6.44; 11.56) in 8.0 km/h. A moderate intra-unit reliability was shown in the second and third stage of the 20 m shuttle run endurance test (ICC = 0.718, 95% CI 0.222; 0.898) and good reliability in the fifth, sixth, seventh and eighth (ICC = 0.831, 95% CI -0.229; 0.996). Inter-unit CV ranged from 2.08% (11.5 km/h) to 3.92% (8.5 km/h). Based on these findings, it was concluded that the 10 Hz Johan system offers an affordable, valid and reliable tool for coaches and fitness trainers to monitor training and performance.
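
    The inter-unit coefficient of variation reported above is the standard deviation expressed as a percentage of the mean. A minimal sketch follows, assuming the sample standard deviation is used (the abstract does not say which form was applied) and using two made-up distance readings for the same lap.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV in percent: 100 * sample SD / mean."""
    centre = mean(values)
    if centre == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * stdev(values) / centre

# Hypothetical readings (in metres) from two GPS units worn over the same 200 m lap.
lap_distances = [198.0, 202.0]
cv = coefficient_of_variation(lap_distances)
print(f"inter-unit CV = {cv:.2f}%")
```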

  19. Accumulation of Content Validation Evidence for the Critical Thinking Self-Assessment Scale.

    PubMed

    Nair, Girija Gopinathan; Hellsten, Laurie-Ann M; Stamler, Lynnette Leeseberg

    2017-04-01

    Critical thinking skills (CTS) are essential for nurses; assessing students' acquisition of these skills is a mandate of nursing curricula. This study aimed to develop a self-assessment instrument of critical thinking skills (Critical Thinking Self-Assessment Scale [CTSAS]) for students' self-monitoring. An initial pool of 196 items across 6 core cognitive skills and 16 subskills was generated using the American Philosophical Association definition of CTS. Experts' content review of the items and their ratings provided evidence of content relevance using the item-level content validity index (I-CVI) and Aiken's content validity coefficient (VIk). 115 items were retained (range of I-CVI values = .70 to .94 and range of VIk values = .69-.95; significant at p < .05). The CTSAS is the first CTS instrument designed specifically for self-assessment purposes.

  20. Prevalence of malnutrition and validation of bioelectrical impedance analysis for the assessment of body composition in patients with systemic sclerosis.

    PubMed

    Spanjer, Moon J; Bultink, Irene E M; de van der Schueren, Marian A E; Voskuyl, Alexandre E

    2017-06-01

    The aims were to assess the prevalence of malnutrition and to validate bioelectrical impedance analysis (BIA) against whole-body DXA for the assessment of body composition in patients with SSc. Malnutrition was defined as BMI <18.5 kg/m² or unintentional weight loss >10% in combination with a fat-free mass index (FFMI) <15 kg/m² for women or <17 kg/m² for men or BMI <20.0 kg/m² (age <70 years) or <22 kg/m² (age >70 years). Body composition was assessed in 72 patients with whole-body DXA (Hologic, Discovery A) and BIA (Bodystat Quadscan 400). The manufacturer's equation and the Geneva equation were used to estimate FFM and fat mass. The agreement between BIA and whole-body DXA was assessed with Bland-Altman analysis and intraclass correlation coefficient. Malnutrition was found in 8.3% (n = 6) and low FFMI in 20.8% (n = 15) of patients. The mean difference in FFM between BIA and DXA applying the Geneva equation was 0.02 (s.d. 2.4) kg, intraclass correlation coefficient 0.97 (95% CI: 0.95, 0.98). Limits of agreement were ±4.6 kg. The manufacturer's equation was less adequate to predict FFM. This study shows a relatively low prevalence of malnutrition in comparison with other studies, but a high prevalence of low FFMI, underlining the necessity of measuring body composition in SSc patients with a standardized and validated method. A good validity of BIA in determining FFM was found at a group level, while at an individual level the FFM may vary by 4.6 kg. © The Author 2017. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com
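
    A Bland-Altman analysis of the kind described above summarises paired measurements by their mean difference (bias) and limits of agreement. The sketch below assumes the conventional limits of bias ± 1.96 standard deviations of the differences; the abstract does not state the multiplier used, and the fat-free mass values are invented for illustration.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are taken as bias +/- 1.96 * SD of the paired differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width

# Hypothetical fat-free mass (kg) for four patients, measured by BIA and by DXA.
ffm_bia = [50.0, 42.0, 61.0, 55.0]
ffm_dxa = [49.0, 43.0, 60.0, 56.0]
bias, lower, upper = bland_altman(ffm_bia, ffm_dxa)
print(f"bias = {bias:.2f} kg, limits of agreement = {lower:.2f} to {upper:.2f} kg")
```

    A bias near zero with wide limits matches the abstract's reading: good agreement at group level, but individual estimates can differ by several kilograms.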

  1. A validated search assessment tool: assessing practice-based learning and improvement in a residency program.

    PubMed

    Rana, Gurpreet K; Bradley, Doreen R; Hamstra, Stanley J; Ross, Paula T; Schumacher, Robert E; Frohna, John G; Haftel, Hilary M; Lypson, Monica L

    2011-01-01

    The objective of this study was to validate an assessment instrument for MEDLINE search strategies at an academic medical center. Two approaches were used to investigate if the search assessment tool could capture performance differences in search strategy construction. First, data from an evaluation of MEDLINE searches from a pediatric resident's longitudinal assessment were investigated. Second, a cross-section of search strategies from residents in one incoming class was compared with strategies of residents graduating a year later. MEDLINE search strategies formulated by faculty who had been identified as having search expertise were used as a gold standard comparison. Participants were presented with a clinical scenario and asked to identify the search question and conduct a MEDLINE search. Two librarians rated the blinded search strategies. Search strategy scores were significantly higher for residents who received training than the comparison group with no training. There was no significant difference in search strategy scores between senior residents who received training and faculty experts. The results provide evidence for the validity of the instrument to evaluate MEDLINE search strategies. This assessment tool can measure improvements in information-seeking skills and provide data to fulfill Accreditation Council for Graduate Medical Education competencies.

  2. Measuring suicidality using the personality assessment inventory: a convergent validity study with federal inmates.

    PubMed

    Patry, Marc W; Magaletta, Philip R

    2015-02-01

    Although numerous studies have examined the psychometric properties and clinical utility of the Personality Assessment Inventory in correctional contexts, only two studies to date have specifically focused on suicide ideation. This article examines the convergent validity of the Suicide Ideation Scale and the Suicide Potential Index on the Personality Assessment Inventory in a large, nontreatment sample of male and female federal inmates (N = 1,120). The data indicated robust validity support for both the Suicide Ideation Scale and Suicide Potential Index, which were each correlated with a broad group of validity indices representing multiple assessment modalities. Recommendations for future research to build upon these findings through replication and extension are made. © The Author(s) 2014.

  3. Body surface posture evaluation: construction, validation and protocol of the SPGAP system (Posture evaluation rotating platform system).

    PubMed

    Schwertner, Debora Soccal; Oliveira, Raul; Mazo, Giovana Zarpellon; Gioda, Fabiane Rosa; Kelber, Christian Roberto; Swarowsky, Alessandra

    2016-05-04

    Several posture evaluation devices have been used to detect deviations of the vertebral column. However, it has been observed that the instruments present measurement errors related to the equipment, environment or measurement protocol. This study aimed to build, validate, analyze the reliability and describe a measurement protocol for the use of the Posture Evaluation Rotating Platform System (SPGAP, Brazilian abbreviation). The posture evaluation system comprises a Posture Evaluation Rotating Platform, video camera, calibration support and measurement software. Two pilot studies were carried out with 102 elderly individuals (average age 69 years old, SD = ±7.3) to establish a protocol for SPGAP, controlling the measurement errors related to the environment, equipment and the person under evaluation. Content validation was completed with input from judges with expertise in posture measurement. The variation coefficient method was used to validate the measurement by the instrument of an object with known dimensions. Finally, reliability was established using repeated measurements of the known object. Expert content judges gave the system excellent ratings for content validity (mean 9.4 out of 10; SD 1.13). The measurement of an object with known dimensions indicated excellent validity (all measurement errors <1%) and test-retest reliability. A total of 26 images were needed to stabilize the system. Participants in the pilot studies indicated that they felt comfortable throughout the assessment. The use of only one image can offer measurements that underestimate or overestimate the reality. To verify the images of objects with known dimensions the values for the width and height were, respectively, CV 0.88 (width) and 2.33 (height), SD 0.22 (width) and 0.35 (height), minimum and maximum values 24.83-25.2 (width) and 14.56-15.75 (height). In the analysis of different images (similar) of an individual, greater discrepancies were observed in the values found. The

  4. A content validated questionnaire for assessment of self reported venous blood sampling practices

    PubMed Central

    2012-01-01

    Background: Venous blood sampling is a common procedure in health care. It is strictly regulated by national and international guidelines. Deviations from guidelines due to human mistakes can cause patient harm. Validated questionnaires for health care personnel can be used to assess preventable "near misses", i.e. potential errors and nonconformities during venous blood sampling practices that could transform into adverse events. However, no validated questionnaire that assesses nonconformities in venous blood sampling has previously been presented. The aim was to test a recently developed questionnaire in self reported venous blood sampling practices for validity and reliability. Findings: We developed a questionnaire to assess deviations from best practices during venous blood sampling. The questionnaire contained questions about patient identification, test request management, test tube labeling, test tube handling, information search procedures and frequencies of error reporting. For content validity, the questionnaire was confirmed by experts on questionnaires and venous blood sampling. For reliability, test-retest statistics were used on the questionnaire answered twice. The final venous blood sampling questionnaire included 19 questions out of which 9 had in total 34 underlying items. It was found to have content validity. The test-retest analysis demonstrated that the items were generally stable. In total, 82% of the items fulfilled the reliability acceptance criteria. Conclusions: The questionnaire could be used for assessment of "near miss" practices that could jeopardize patient safety and offers several benefits over assessing rare adverse events only. The higher frequencies of "near miss" practices allow for quantitative analysis of the effect of corrective interventions and for benchmarking preanalytical quality not only at the laboratory/hospital level but also at the health care unit/hospital ward. PMID:22260505

  5. A content validated questionnaire for assessment of self reported venous blood sampling practices.

    PubMed

    Bölenius, Karin; Brulin, Christine; Grankvist, Kjell; Lindkvist, Marie; Söderberg, Johan

    2012-01-19

    Venous blood sampling is a common procedure in health care. It is strictly regulated by national and international guidelines. Deviations from guidelines due to human mistakes can cause patient harm. Validated questionnaires for health care personnel can be used to assess preventable "near misses", i.e. potential errors and nonconformities during venous blood sampling practices that could transform into adverse events. However, no validated questionnaire that assesses nonconformities in venous blood sampling has previously been presented. The aim was to test a recently developed questionnaire in self reported venous blood sampling practices for validity and reliability. We developed a questionnaire to assess deviations from best practices during venous blood sampling. The questionnaire contained questions about patient identification, test request management, test tube labeling, test tube handling, information search procedures and frequencies of error reporting. For content validity, the questionnaire was confirmed by experts on questionnaires and venous blood sampling. For reliability, test-retest statistics were used on the questionnaire answered twice. The final venous blood sampling questionnaire included 19 questions out of which 9 had in total 34 underlying items. It was found to have content validity. The test-retest analysis demonstrated that the items were generally stable. In total, 82% of the items fulfilled the reliability acceptance criteria. The questionnaire could be used for assessment of "near miss" practices that could jeopardize patient safety and offers several benefits over assessing rare adverse events only. The higher frequencies of "near miss" practices allow for quantitative analysis of the effect of corrective interventions and for benchmarking preanalytical quality not only at the laboratory/hospital level but also at the health care unit/hospital ward.

  6. A Second Dystopia in Education: Validity Issues in Authentic Assessment Practices

    ERIC Educational Resources Information Center

    Hathcoat, John D.; Penn, Jeremy D.; Barnes, Laura L.; Comer, Johnathan C.

    2016-01-01

    Authentic assessments used in response to accountability demands in higher education face at least two threats to validity. First, a lack of interchangeability between assessment tasks introduces bias when using aggregate-based scores at an institutional level. Second, reliance on written products to capture constructs such as critical thinking…

  7. Assessing Sleep Disturbance in Low Back Pain: The Validity of Portable Instruments

    PubMed Central

    Alsaadi, Saad M.; McAuley, James H.; Hush, Julia M.; Bartlett, Delwyn J.; McKeough, Zoe M.; Grunstein, Ronald R.; Dungan, George C.; Maher, Chris G.

    2014-01-01

    Although portable instruments have been used in the assessment of sleep disturbance for patients with low back pain (LBP), the accuracy of the instruments in detecting sleep/wake episodes for this population is unknown. This study investigated the criterion validity of two portable instruments (Armband and Actiwatch) for assessing sleep disturbance in patients with LBP. Fifty patients with LBP performed simultaneous overnight sleep recordings in a university sleep laboratory. All 50 participants were assessed by polysomnography (PSG) and the Armband, and a subgroup of 33 participants wore an Actiwatch. Criterion validity was determined by calculating epoch-by-epoch agreement, sensitivity, specificity and prevalence- and bias-adjusted kappa (PABAK) for sleep versus wake between each instrument and PSG. The relationship between PSG and the two instruments was assessed using intraclass correlation coefficients (ICC 2,1). The study participants showed symptoms of sub-threshold insomnia (mean ISI = 13.2, 95% CI = 6.36) and poor sleep quality (mean PSQI = 9.20, 95% CI = 4.27). Observed agreement with PSG was 85% and 88% for the Armband and Actiwatch. Sensitivity was 0.90 for both instruments, specificity was 0.54 and 0.67, and PABAK was 0.69 and 0.77 for the Armband and Actiwatch, respectively. The ICC (95% CI) was 0.76 (0.61 to 0.86) and 0.80 (0.46 to 0.92) for total sleep time, 0.52 (0.29 to 0.70) and 0.55 (0.14 to 0.77) for sleep efficiency, 0.64 (0.45 to 0.78) and 0.52 (0.23 to 0.73) for wake after sleep onset and 0.13 (−0.15 to 0.39) and 0.33 (−0.05 to 0.63) for sleep onset latency, for the Armband and Actiwatch, respectively. The findings showed that both instruments have varied criterion validity across the sleep parameters, from excellent validity for measures of total sleep time and good validity for measures of sleep efficiency and wake after onset to poor validity for sleep onset latency. PMID:24763506
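
    The epoch-by-epoch statistics named above can be sketched for a two-category (sleep/wake) comparison, where PABAK reduces to 2 × observed agreement − 1. The function name and the ten epochs below are hypothetical, and this is not the study's analysis code. Under this formula the abstract's rounded agreement figures of 85% and 88% give about 0.70 and 0.76, close to the reported 0.69 and 0.77.

```python
def epoch_agreement(reference, device):
    """Compare two epoch-by-epoch scorings; labels: 1 = sleep, 0 = wake.

    `reference` plays the role of PSG; `device` is the portable instrument.
    """
    if len(reference) != len(device) or not reference:
        raise ValueError("scorings must be non-empty and of equal length")
    pairs = list(zip(reference, device))
    tp = sum(r == 1 and d == 1 for r, d in pairs)  # sleep scored as sleep
    tn = sum(r == 0 and d == 0 for r, d in pairs)  # wake scored as wake
    fp = sum(r == 0 and d == 1 for r, d in pairs)  # wake scored as sleep
    fn = sum(r == 1 and d == 0 for r, d in pairs)  # sleep scored as wake
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both sleep and wake epochs")
    observed = (tp + tn) / len(pairs)
    return {
        "agreement": observed,
        "sensitivity": tp / (tp + fn),  # ability to detect sleep
        "specificity": tn / (tn + fp),  # ability to detect wake
        "pabak": 2 * observed - 1,      # two-category PABAK
    }

# Hypothetical ten-epoch recording.
psg = [1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
armband = [1, 1, 1, 0, 0, 1, 1, 1, 0, 1]
stats = epoch_agreement(psg, armband)
print(stats)
```

    High sensitivity with lower specificity, as in the abstract, means a device detects sleep epochs well but tends to misclassify wake epochs as sleep.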

  8. Assessing sleep disturbance in low back pain: the validity of portable instruments.

    PubMed

    Alsaadi, Saad M; McAuley, James H; Hush, Julia M; Bartlett, Delwyn J; McKeough, Zoe M; Grunstein, Ronald R; Dungan, George C; Maher, Chris G

    2014-01-01

    Although portable instruments have been used in the assessment of sleep disturbance for patients with low back pain (LBP), the accuracy of the instruments in detecting sleep/wake episodes for this population is unknown. This study investigated the criterion validity of two portable instruments (Armband and Actiwatch) for assessing sleep disturbance in patients with LBP. 50 patients with LBP performed simultaneous overnight sleep recordings in a university sleep laboratory. All 50 participants were assessed by polysomnography (PSG) and the Armband, and a subgroup of 33 participants wore an Actiwatch. Criterion validity was determined by calculating epoch-by-epoch agreement, sensitivity, specificity and prevalence- and bias-adjusted kappa (PABAK) for sleep versus wake between each instrument and PSG. The relationship between PSG and the two instruments was assessed using intraclass correlation coefficients (ICC 2, 1). The study participants showed symptoms of sub-threshold insomnia (mean ISI = 13.2, 95% CI = 6.36) and poor sleep quality (mean PSQI = 9.20, 95% CI = 4.27). Observed agreement with PSG was 85% and 88% for the Armband and Actiwatch. Sensitivity was 0.90 for both instruments and specificity was 0.54 and 0.67 and PABAK of 0.69 and 0.77 for the Armband and Actiwatch respectively. The ICC (95%CI) was 0.76 (0.61 to 0.86) and 0.80 (0.46 to 0.92) for total sleep time, 0.52 (0.29 to 0.70) and 0.55 (0.14 to 0.77) for sleep efficiency, 0.64 (0.45 to 0.78) and 0.52 (0.23 to 0.73) for wake after sleep onset and 0.13 (-0.15 to 0.39) and 0.33 (-0.05 to 0.63) for sleep onset latency, for the Armband and Actiwatch, respectively. The findings showed that both instruments have varied criterion validity across the sleep parameters from excellent validity for measures of total sleep time, good validity for measures of sleep efficiency and wake after onset to poor validity for sleep onset latency.
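The epoch-by-epoch measures named in this abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code: the function name `epoch_agreement` and the sample epochs are hypothetical, epochs are assumed to be coded 1 for sleep and 0 for wake, and PSG is assumed to be the reference. For two categories, PABAK reduces to 2 × (observed agreement) − 1, which gives roughly 0.70 and 0.76 for the reported agreements of 85% and 88%, close to the published 0.69 and 0.77 (the small gaps are plausibly due to rounding).

```python
def epoch_agreement(device, reference):
    """Compare two equal-length sequences of epochs coded 1 (sleep) or 0 (wake).

    `reference` plays the role of the criterion (PSG in the abstract);
    `device` is the portable instrument. Both names are illustrative.
    """
    if len(device) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(zip(device, reference))
    tp = sum(1 for d, r in pairs if d == 1 and r == 1)  # sleep scored as sleep
    tn = sum(1 for d, r in pairs if d == 0 and r == 0)  # wake scored as wake
    fp = sum(1 for d, r in pairs if d == 1 and r == 0)  # wake scored as sleep
    fn = sum(1 for d, r in pairs if d == 0 and r == 1)  # sleep scored as wake

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both sleep and wake epochs")

    observed = (tp + tn) / len(pairs)
    return {
        "agreement": observed,
        "sensitivity": tp / (tp + fn),   # ability to detect sleep
        "specificity": tn / (tn + fp),   # ability to detect wake
        "pabak": 2 * observed - 1,       # two-category PABAK
    }


# Hypothetical ten-epoch recording, for illustration only.
psg = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
armband = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]
print(epoch_agreement(armband, psg))
```

High sensitivity paired with low specificity, as reported here, is the typical pattern when a device tends to score quiet wakefulness as sleep.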

  9. Validation of the Evidence-Based Practice Process Assessment Scale

    ERIC Educational Resources Information Center

    Rubin, Allen; Parrish, Danielle E.

    2011-01-01

    Objective: This report describes the reliability, validity, and sensitivity of a scale that assesses practitioners' perceived familiarity with, attitudes toward, and implementation of the evidence-based practice (EBP) process. Method: Social work practitioners and second-year master of social work (MSW) students (N = 511) were surveyed in four sites…

  10. Exploring the Relationship between Validity and Comparability in Assessment

    ERIC Educational Resources Information Center

    Crisp, Victoria

    2017-01-01

    This article discusses how comparability relates to current mainstream conceptions of validity, in the context of educational assessment. Relevant literature was used to consider the relationship between these concepts. The article concludes that, depending on the exact claims being made about the appropriate interpretations and uses of the…

  11. Concurrent validation of an inertial measurement system to quantify kicking biomechanics in four football codes.

    PubMed

    Blair, Stephanie; Duthie, Grant; Robertson, Sam; Hopkins, William; Ball, Kevin

    2018-05-17

    Wearable inertial measurement systems (IMS) allow for three-dimensional analysis of human movements in a sport-specific setting. This study examined the concurrent validity of an IMS (Xsens MVN system) for measuring lower extremity and pelvis kinematics in comparison to a Vicon motion analysis system (MAS) during kicking. Thirty footballers from Australian football (n = 10), soccer (n = 10), rugby league and rugby union (n = 10) clubs completed 20 kicks across four conditions. Concurrent validity was assessed using a linear mixed-modelling approach, which allowed the partition of between- and within-subject variance from the device measurement error. Results were expressed in raw and standardised units for assessments of differences in means and measurement error, and interpreted via non-clinical magnitude-based inferences. Trivial to small differences were found in linear velocities (foot and pelvis), angular velocities (knee, shank and thigh), sagittal joint (knee and hip) and segment angle (shank and pelvis) means (mean difference: 0.2-5.8%) between the IMS and MAS in Australian football, soccer and the rugby codes. Trivial to small measurement errors (from 0.1 to 5.8%) were found between the IMS and MAS in all kinematic parameters. The IMS demonstrated acceptable levels of concurrent validity compared to a MAS when measuring kicking biomechanics across the four football codes. Wearable IMS offers various benefits over MAS, such as out-of-laboratory testing, larger measurement range and quick data output, to help improve the ecological validity of biomechanical testing and the timing of feedback. The results advocate the use of IMS to quantify biomechanics of high-velocity movements in sport-specific settings. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. A Serious Game for Clinical Assessment of Cognitive Status: Validation Study.

    PubMed

    Tong, Tiffany; Chignell, Mark; Tierney, Mary C; Lee, Jacques

    2016-05-27

    We propose the use of serious games to screen for abnormal cognitive status in situations where it may be too costly or impractical to use standard cognitive assessments (eg, emergency departments). If validated, serious games in health care could enable broader availability of efficient and engaging cognitive screening. The objective of this work is to demonstrate the feasibility of a game-based cognitive assessment delivered on tablet technology to a clinical sample and to conduct preliminary validation against standard mental status tools commonly used in elderly populations. We carried out a feasibility study in a hospital emergency department to evaluate the use of a serious game by elderly adults (N=146; age: mean 80.59, SD 6.00, range 70-94 years). We correlated game performance against a number of standard assessments, including the Mini-Mental State Examination (MMSE), Montreal Cognitive Assessment (MoCA), and the Confusion Assessment Method (CAM). After a series of modifications, the game could be used by a wide range of elderly patients in the emergency department demonstrating its feasibility for use with these users. Of 146 patients, 141 (96.6%) consented to participate and played our serious game. Refusals to play the game were typically due to concerns of family members rather than unwillingness of the patient to play the game. Performance on the serious game correlated significantly with the MoCA (r=-.339, P <.001) and MMSE (r=-.558, P <.001), and correlated (point-biserial correlation) with the CAM (r=.565, P <.001) and with other cognitive assessments. This research demonstrates the feasibility of using serious games in a clinical setting. Further research is required to demonstrate the validity and reliability of game-based assessments for clinical decision making.

  13. Fitness to plead: Development and validation of a standardised assessment instrument

    PubMed Central

    Stahl, Daniel; Appiah-Kusi, Elizabeth; Brewer, Rebecca; Watts, Michael; Peay, Jill; Blackwood, Nigel

    2018-01-01

    The ability of an individual to participate in courtroom proceedings is assessed by clinicians using legal ‘fitness to plead’ criteria. Findings of ‘unfitness’ are so rare that there is considerable professional unease concerning the utility of the current subjective assessment process. As a result, mentally disordered defendants may be subjected unfairly to criminal trials. The Law Commission in England and Wales has proposed legal reform, as well as the utilisation of a defined psychiatric instrument to assist in fitness to plead assessments. Similar legal reforms are occurring in other jurisdictions. Our objective was to produce and validate a standardised assessment instrument of fitness to plead employing a filmed vignette of criminal proceedings. The instrument was developed in consultation with legal and clinical professionals, and was refined using standard item reduction methods in two initial rounds of testing (n = 212). The factorial structure, test-retest reliability and convergent validity of the resultant instrument were assessed in a further round (n = 160). As a result of this iterative process a 25-item scale was produced, with an underlying two-factor structure representing the foundational and decision-making abilities underpinning fitness to plead. The sub-scales demonstrate good internal consistency (factor 1: 0·76; factor 2: 0·65) and test-retest stability (0·7) as well as excellent convergent validity with scores of intelligence, executive function and mentalising abilities (p≤0·01 in all domains). Overall the standardised Fitness to Plead Assessment instrument has good psychometric properties. It has the potential to ensure that the significant numbers of mentally ill and cognitively impaired individuals who face trial are objectively assessed, and the courtroom process critically informed. PMID:29698396

  14. A Serious Game for Clinical Assessment of Cognitive Status: Validation Study

    PubMed Central

    Chignell, Mark; Tierney, Mary C.; Lee, Jacques

    2016-01-01

    Background We propose the use of serious games to screen for abnormal cognitive status in situations where it may be too costly or impractical to use standard cognitive assessments (eg, emergency departments). If validated, serious games in health care could enable broader availability of efficient and engaging cognitive screening. Objective The objective of this work is to demonstrate the feasibility of a game-based cognitive assessment delivered on tablet technology to a clinical sample and to conduct preliminary validation against standard mental status tools commonly used in elderly populations. Methods We carried out a feasibility study in a hospital emergency department to evaluate the use of a serious game by elderly adults (N=146; age: mean 80.59, SD 6.00, range 70-94 years). We correlated game performance against a number of standard assessments, including the Mini-Mental State Examination (MMSE), Montreal Cognitive Assessment (MoCA), and the Confusion Assessment Method (CAM). Results After a series of modifications, the game could be used by a wide range of elderly patients in the emergency department demonstrating its feasibility for use with these users. Of 146 patients, 141 (96.6%) consented to participate and played our serious game. Refusals to play the game were typically due to concerns of family members rather than unwillingness of the patient to play the game. Performance on the serious game correlated significantly with the MoCA (r=–.339, P <.001) and MMSE (r=–.558, P <.001), and correlated (point-biserial correlation) with the CAM (r=.565, P <.001) and with other cognitive assessments. Conclusions This research demonstrates the feasibility of using serious games in a clinical setting. Further research is required to demonstrate the validity and reliability of game-based assessments for clinical decision making. PMID:27234145

  15. A Low-Cost Contact System to Assess Load Displacement Velocity in a Resistance Training Machine

    PubMed Central

    Buscà, Bernat; Font, Anna

    2011-01-01

    This study sought to determine the validity of a new system for assessing the displacement and average velocity within machine-based resistance training exercise using the Chronojump System. The new design is based on a contact bar and a simple, low-cost mechanism that detects the conductivity of electrical potentials with a precision chronograph. This system allows coaches to assess velocity to control the strength training process. A validation study was performed by assessing the concentric phase parameters of a leg press exercise. Output time data from the Chronojump System in combination with the pre-established range of movement was compared with data from a position sensor connected to a Biopac System. A subset of 87 actions from 11 professional tennis players was recorded and, using the two methods, average velocity and displacement variables in the same action were compared. A t-test for dependent samples and a correlation analysis were undertaken. The r value derived from the correlation between the Biopac System and the contact Chronojump System was >0.94 for all measures of displacement and velocity on all loads (p < 0.01). The Effect Size (ES) was 0.18 in displacement and 0.14 in velocity and ranged from 0.09 to 0.31 and from 0.07 to 0.34, respectively. The magnitude of the difference between the two methods in all parameters and the correlation values provided certain evidence of validity of the Chronojump System to assess the average displacement velocity of loads in a resistance training machine. Key points The assessment of speed in resistance machines is a valuable source of information for strength training. Many commercial systems used to assess velocity, power and force are expensive thereby preventing widespread use by coaches and athletes. The system is intended to be a low-cost device for assessing and controlling the velocity exerted on each repetition in any resistance training machine. The system could be easily adapted in any vertical

  16. A low-cost contact system to assess load displacement velocity in a resistance training machine.

    PubMed

    Buscà, Bernat; Font, Anna

    2011-01-01

    This study sought to determine the validity of a new system for assessing the displacement and average velocity within machine-based resistance training exercise using the Chronojump System. The new design is based on a contact bar and a simple, low-cost mechanism that detects the conductivity of electrical potentials with a precision chronograph. This system allows coaches to assess velocity to control the strength training process. A validation study was performed by assessing the concentric phase parameters of a leg press exercise. Output time data from the Chronojump System in combination with the pre-established range of movement was compared with data from a position sensor connected to a Biopac System. A subset of 87 actions from 11 professional tennis players was recorded and, using the two methods, average velocity and displacement variables in the same action were compared. A t-test for dependent samples and a correlation analysis were undertaken. The r value derived from the correlation between the Biopac System and the contact Chronojump System was >0.94 for all measures of displacement and velocity on all loads (p < 0.01). The Effect Size (ES) was 0.18 in displacement and 0.14 in velocity and ranged from 0.09 to 0.31 and from 0.07 to 0.34, respectively. The magnitude of the difference between the two methods in all parameters and the correlation values provided certain evidence of validity of the Chronojump System to assess the average displacement velocity of loads in a resistance training machine. Key points The assessment of speed in resistance machines is a valuable source of information for strength training. Many commercial systems used to assess velocity, power and force are expensive thereby preventing widespread use by coaches and athletes. The system is intended to be a low-cost device for assessing and controlling the velocity exerted on each repetition in any resistance training machine. The system could be easily adapted in any vertical

  17. Validity of self-assessment in a quality improvement collaborative in Ecuador.

    PubMed

    Hermida, Jorge; Broughton, Edward I; Miller Franco, Lynne

    2011-12-01

    Health care quality improvement (QI) efforts commonly use self-assessment to measure compliance with quality standards. This study investigates the validity of self-assessment of quality indicators. Cross sectional. A maternal and newborn care improvement collaborative intervention conducted in health facilities in Ecuador in 2005. Four external evaluators were trained in abstracting medical records to calculate six indicators reflecting compliance with treatment standards. About 30 medical records per month were examined at 12 participating health facilities for a total of 1875 records. The same records had already been reviewed by QI teams at these facilities (self-assessment). Overall compliance, agreement (using the Kappa statistic), sensitivity and specificity were analyzed. We also examined patterns of disagreement and the effect of facility characteristics on levels of agreement. External evaluators reported compliance of 69-90%, while self-assessors reported 71-92%, with raw agreement of 71-95% and Kappa statistics ranging from fair to almost perfect agreement. Considering external evaluators as the gold standard, sensitivity of self-assessment ranged from 90 to 99% and specificity from 48 to 86%. Simpler indicators had fewer disagreements. When disagreements occurred between self-assessment and external evaluators, the former tended to report more positive findings in five of six indicators, but this tendency was not of a magnitude to change program actions. Team leadership, understanding of the tools and facility size had no overall impact on the level of agreement. When compared with external evaluation (gold standard), self-assessment was found to be sufficiently valid for tracking QI team performance. Sensitivity was generally higher than specificity. Simplifying indicators may improve validity.
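The Kappa statistic used here corrects raw agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa for two raters follows; the function name `cohen_kappa` and the ten example records are hypothetical and are not data from the study, and records are assumed to be coded 1 (compliant) or 0 (non-compliant).

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same records."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)

    # Observed agreement: share of records given the same label by both raters.
    p_observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n

    # Chance agreement: product of each rater's marginal share, per category.
    p_expected = sum(
        (list(rater_a).count(c) / n) * (list(rater_b).count(c) / n)
        for c in categories
    )

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical records: 1 = compliant, 0 = non-compliant.
external = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]
self_assessed = [1, 1, 1, 1, 1, 0, 0, 1, 1, 1]
print(round(cohen_kappa(external, self_assessed), 3))
```

In this made-up example raw agreement is 80%, yet kappa is noticeably lower because both raters label most records compliant, which is the same reason raw agreement and kappa can diverge in the study's indicators.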

  18. Expert validation of a teamwork assessment rubric: A modified Delphi study.

    PubMed

    Parratt, Jenny A; Fahy, Kathleen M; Hutchinson, Marie; Lohmann, Gui; Hastie, Carolyn R; Chaseling, Marilyn; O'Brien, Kylie

    2016-01-01

    Teamwork is a 'soft skill' employability competence desired by employers. Poor teamwork skills in healthcare have an impact on adverse outcomes. Teamwork skills are rarely the focus of teaching and assessment in undergraduate courses. The TeamUP Rubric is a tool used to teach and evaluate undergraduate students' teamwork skills. Students also use the rubric to give anonymised peer feedback during team-based academic assignments. The rubric's five domains focus on planning, environment, facilitation, conflict management and individual contribution; each domain is grounded in relevant theory. Students earn marks for their teamwork skills; validity of the assessment rubric is critical. To what extent do experts agree that the TeamUP Rubric is a valid assessment of 'teamwork skills'? Modified Delphi technique incorporating Feminist Collaborative Conversations. A heterogeneous panel of 35 professionals with recognised expertise in communications and/or teamwork. Three Delphi rounds using a survey that included the rubric were conducted either face-to-face, by telephone or online. Quantitative analysis yielded item content validity indices (I-CVI); minimum consensus was pre-set at 70%. An average of the I-CVI also yielded sub-scale (domain) (D-CVI/Ave) and scale content validity indices (S-CVI/Ave). After each Delphi round, qualitative data were analysed and interpreted; Feminist Collaborative Conversations by the research team aimed to clarify and confirm consensus about the wording of items on the rubric. Consensus (at 70%) was obtained for all but one behavioural descriptor of the rubric. We modified that descriptor to address expert concerns. The TeamUP Rubric (Version 4) can be considered to be well validated at that level of consensus. The final rubric reflects underpinning theory, with no areas of conceptual overlap between rubric domains. The final TeamUP Rubric arising from this study validly measures individual student teamwork skills and can be used with
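The content validity indices described in this abstract can be illustrated with a short sketch. It assumes each expert's response has already been dichotomised into agree (1) or not (0); how the study coded agreement is not stated in the abstract. The item names, the votes, and the function name `content_validity` are hypothetical. I-CVI is taken as the share of experts endorsing an item, and S-CVI/Ave as the mean of the I-CVIs.

```python
def content_validity(ratings, consensus=0.70):
    """Return per-item I-CVI, the scale average, and items below consensus.

    ratings: dict mapping an item label to a list of 0/1 expert votes.
    consensus: minimum I-CVI regarded as agreement (70% in the abstract).
    """
    if not ratings or any(len(votes) == 0 for votes in ratings.values()):
        raise ValueError("every item needs at least one expert vote")

    # I-CVI: proportion of experts who endorsed the item.
    i_cvi = {item: sum(votes) / len(votes) for item, votes in ratings.items()}

    # S-CVI/Ave: average of the item-level indices.
    s_cvi_ave = sum(i_cvi.values()) / len(i_cvi)

    # Items that would need rewording under the pre-set consensus level.
    below = [item for item, value in i_cvi.items() if value < consensus]
    return i_cvi, s_cvi_ave, below


# Hypothetical votes from five experts on three descriptors.
votes = {
    "plans tasks": [1, 1, 1, 1, 0],
    "manages conflict": [1, 1, 0, 0, 0],
    "contributes individually": [1, 1, 1, 1, 1],
}
i_cvi, s_cvi_ave, below = content_validity(votes)
print(i_cvi, round(s_cvi_ave, 3), below)
```

A domain-level index such as D-CVI/Ave could be obtained the same way by passing only the items that belong to one domain.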

  19. Skeletal age assessment in children using an open compact MRI system.

    PubMed

    Terada, Yasuhiko; Kono, Saki; Tamada, Daiki; Uchiumi, Tomomi; Kose, Katsumi; Miyagi, Ryo; Yamabe, Eiko; Yoshioka, Hiroshi

    2013-06-01

    MRI may be a noninvasive and alternative tool for skeletal age assessment in children, although few studies have reported on this topic. In this article, skeletal age was assessed over a wide range of ages using an open, compact MRI optimized for the imaging of a child's hand and wrist, and its validity was evaluated. MR images and their three-dimensional segmentation visualized detailed skeletal features of each bone in the hand and wrist. Skeletal age was then independently scored from the MR images by two raters, according to the Tanner-Whitehouse Japan system. The skeletal age assessed by MR rating demonstrated a strong positive correlation with chronological age. The intrarater and inter-rater reproducibilities were significantly high. These results demonstrate the validity and reliability of skeletal age assessment using MRI. Copyright © 2012 Wiley Periodicals, Inc.

  20. An Application of Practical Strategies in Assessing the Criterion-Related Validity of Credentialing Examinations.

    ERIC Educational Resources Information Center

    Fidler, James R.

    1993-01-01

    Criterion-related validities of 2 laboratory practitioner certification examinations for medical technologists (MTs) and medical laboratory technicians (MLTs) were assessed for 81 MT and 70 MLT examinees. Validity coefficients are presented for both measures. Overall, summative ratings yielded stronger validity coefficients than ratings based on…

  1. The Reliability and Validity of the Thoracolumbar Injury Classification System in Pediatric Spine Trauma.

    PubMed

    Savage, Jason W; Moore, Timothy A; Arnold, Paul M; Thakur, Nikhil; Hsu, Wellington K; Patel, Alpesh A; McCarthy, Kathryn; Schroeder, Gregory D; Vaccaro, Alexander R; Dimar, John R; Anderson, Paul A

    2015-09-15

    The thoracolumbar injury classification system (TLICS) was evaluated in 20 consecutive pediatric spine trauma cases. The purpose of this study was to determine the reliability and validity of the TLICS in pediatric spine trauma. The TLICS was developed to improve the categorization and management of thoracolumbar trauma. TLICS has been shown to have good reliability and validity in the adult population. The clinical and radiographical findings of 20 pediatric thoracolumbar fractures were prospectively presented to 20 surgeons with disparate levels of training and experience with spinal trauma. These injuries were consecutively scored using the TLICS. Cohen unweighted κ coefficients and Spearman rank order correlation values were calculated for the key parameters (injury morphology, status of posterior ligamentous complex, neurological status, TLICS total score, and proposed management) to assess the inter-rater reliabilities. Five surgeons scored the same cases 3 months later to assess the intra-rater reliability. The actual management of each case was then compared with the treatment recommended by the TLICS algorithm to assess validity. The inter-rater κ statistics of all subgroups (injury morphology, status of the posterior ligamentous complex, neurological status, TLICS total score, and proposed treatment) were within the range of moderate to substantial reproducibility (0.524-0.958). All subgroups had excellent intra-rater reliability (0.748-1.000). The various indices for validity were calculated (80.3% correct, 0.836 sensitivity, 0.785 specificity, 0.676 positive predictive value, 0.899 negative predictive value). Overall, TLICS demonstrated good validity. The TLICS has good reliability and validity when used in the pediatric population. The inter-rater reliability of predicting management and indices for validity are lower than those in adults with thoracolumbar fractures, which is likely due to differences in the way children are treated for certain

  2. Simulated Driving Assessment (SDA) for Teen Drivers: Results from a Validation Study

    PubMed Central

    McDonald, Catherine C.; Kandadai, Venk; Loeb, Helen; Seacrist, Thomas S.; Lee, Yi-Ching; Winston, Zachary; Winston, Flaura K.

    2015-01-01

    Background Driver error and inadequate skill are common critical reasons for novice teen driver crashes, yet few validated, standardized assessments of teen driving skills exist. The purpose of this study was to evaluate the construct and criterion validity of a newly developed Simulated Driving Assessment (SDA) for novice teen drivers. Methods The SDA's 35-minute simulated drive incorporates 22 variations of the most common teen driver crash configurations. Driving performance was compared for 21 inexperienced teens (age 16–17 years, provisional license ≤90 days) and 17 experienced adults (age 25–50 years, license ≥5 years, drove ≥100 miles per week, no collisions or moving violations ≤3 years). SDA driving performance (Error Score) was based on driving safety measures derived from simulator and eye-tracking data. Negative driving outcomes included simulated collisions or run-off-the-road incidents. A professional driving evaluator/instructor reviewed videos of SDA performance (DEI Score). Results The SDA demonstrated construct validity: 1.) Teens had a higher Error Score than adults (30 vs. 13, p=0.02); 2.) For each additional error committed, the relative risk of a participant's propensity for a simulated negative driving outcome increased by 8% (95% CI: 1.05–1.10, p<0.01). The SDA demonstrated criterion validity: Error Score was correlated with DEI Score (r=−0.66, p<0.001). Conclusions This study supports the concept of validated simulated driving tests like the SDA to assess novice driver skill in complex and hazardous driving scenarios. The SDA, as a standard protocol to evaluate teen driver performance, has the potential to facilitate screening and assessment of teen driving readiness and could be used to guide targeted skill training. PMID:25740939

  3. Simulated Driving Assessment (SDA) for teen drivers: results from a validation study.

    PubMed

    McDonald, Catherine C; Kandadai, Venk; Loeb, Helen; Seacrist, Thomas S; Lee, Yi-Ching; Winston, Zachary; Winston, Flaura K

    2015-06-01

    Driver error and inadequate skill are common critical reasons for novice teen driver crashes, yet few validated, standardised assessments of teen driving skills exist. The purpose of this study is to evaluate the construct and criterion validity of a newly developed Simulated Driving Assessment (SDA) for novice teen drivers. The SDA's 35 min simulated drive incorporates 22 variations of the most common teen driver crash configurations. Driving performance was compared for 21 inexperienced teens (age 16-17 years, provisional license ≤90 days) and 17 experienced adults (age 25-50 years, license ≥5 years, drove ≥100 miles per week, no collisions or moving violations ≤3 years). SDA driving performance (Error Score) was based on driving safety measures derived from simulator and eye-tracking data. Negative driving outcomes included simulated collisions or run-off-the-road incidents. A professional driving evaluator/instructor (DEI Score) reviewed videos of SDA performance. The SDA demonstrated construct validity: (1) teens had a higher Error Score than adults (30 vs. 13, p=0.02); (2) for each additional error committed, the RR of a participant's propensity for a simulated negative driving outcome increased by 8% (95% CI 1.05 to 1.10, p<0.01). The SDA demonstrated criterion validity: Error Score was correlated with DEI Score (r=-0.66, p<0.001). This study supports the concept of validated simulated driving tests like the SDA to assess novice driver skill in complex and hazardous driving scenarios. The SDA, as a standard protocol to evaluate teen driver performance, has the potential to facilitate screening and assessment of teen driving readiness and could be used to guide targeted skill training. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  4. Assessing attitude toward same-sex marriage: scale development and validation.

    PubMed

    Lannutti, Pamela J; Lachlan, Kenneth A

    2007-01-01

    This paper reports the results of three studies conducted to develop, refine, and validate a scale which assessed heterosexual adults' attitudes toward same-sex marriage, the Attitude Toward Same-Sex Marriage Scale (ASSMS). The need for such a scale is evidenced in the increasing importance of same-sex marriage in the political arena of the United States and other nations, as well as the growing body of empirical research examining same-sex marriage and related issues (e.g., Lannutti, 2005; Solomon, Rothblum, & Balsam, 2004). The results demonstrate strong reliability, convergent validity, and predictive validity for the ASSMS and suggest that the ASSMS may be adapted to measure attitudes toward civil unions and other forms of relational recognition for same-sex couples. Gender comparisons using the validated scale showed that in college and non-college samples, women had a significantly more positive attitude toward same-sex marriage than did men.

  5. Assessing peristomal skin changes in ostomy patients: validation of the Ostomy Skin Tool.

    PubMed

    Jemec, G B; Martins, L; Claessens, I; Ayello, E A; Hansen, A S; Poulsen, L H; Sibbald, R G

    2011-02-01

    Peristomal skin problems are common and are treated by a variety of health professionals. Clear and consistent communication among these professionals is therefore particularly important. The Ostomy Skin Tool (OST) is a new assessment instrument for the extent and severity of peristomal skin conditions. Formal tests of reliability and validity are necessary for its use in clinical practice, research, and education. To estimate inter- and intra-nurse assessment variability of the OST and validity by comparison to a 'gold standard' (GS) defined by an expert panel. Thirty photographs of peristomal skin were presented twice to 20 ostomy care nurses--10 from Denmark (DK) and 10 from Spain (ES)--to determine intra- and inter-nurse assessment variability. The same photographs were presented to an international group of experts (dermatologist and ostomy care nurses), to establish a GS for comparison and validation of the results. A high intra-nurse assessment agreement, κ=0·84, was found with no differences in the intra-nurse assessments from the two groups of nurses (DK and ES). The inter-nurse assessment agreement was 'moderate to good', κ=0·54, with the agreement between the experts higher, κ=0·70. A high correlation between the scores from the nurses and the GS was seen in the lower part of the two scales [Discoloration, Erosion, Tissue overgrowth (DET) score <7]. The study supported the validity of the OST. It is suggested that a categorical scale can be used to illustrate the severity of the DET scores. © 2011 The Authors. BJD © 2011 British Association of Dermatologists.

  6. [Support of the nursing process through electronic nursing documentation systems (UEPD) – Initial validation of an instrument].

    PubMed

    Hediger, Hannele; Müller-Staub, Maria; Petry, Heidi

    2016-01-01

    Electronic nursing documentation systems, with standardized nursing terminology, are IT-based systems for recording the nursing processes. These systems have the potential to improve the documentation of the nursing process and to support nurses in care delivery. This article describes the development and initial validation of an instrument (known by its German acronym UEPD) to measure the subjectively perceived benefits of an electronic nursing documentation system in care delivery. The validity of the UEPD was examined by means of an evaluation study carried out in an acute care hospital (n = 94 nurses) in German-speaking Switzerland. Construct validity was analyzed by principal components analysis. Initial evidence of the validity of the UEPD was obtained. The analysis showed a stable four-factor model (FS = 0.89) comprising 25 items. All factors loaded ≥ 0.50 and the scales demonstrated high internal consistency (Cronbach's α = 0.73 – 0.90). Principal component analysis revealed four dimensions of support: establishing nursing diagnosis and goals; recording a case history/an assessment and documenting the nursing process; implementation and evaluation; and information exchange. Further testing with larger control samples and with different electronic documentation systems is needed. Another potential direction would be to employ the UEPD in a comparison of various electronic documentation systems.
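Internal consistency of the kind reported here (Cronbach's α) can be computed from a respondents-by-items score table. The sketch below uses the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores) with sample variances; the function name `cronbach_alpha` and the five hypothetical respondents are illustrative and are not taken from the UEPD study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each respondent needs the same number (>= 2) of items")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]

    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five respondents on a three-item subscale.
answers = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 5],
    [3, 3, 4],
]
print(round(cronbach_alpha(answers), 2))
```

Alpha would be computed separately for each of the four factors, since it describes the consistency of the items within one scale rather than across the whole instrument.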

  7. External validation of Global Evaluative Assessment of Robotic Skills (GEARS).

    PubMed

    Aghazadeh, Monty A; Jayaratna, Isuru S; Hung, Andrew J; Pan, Michael M; Desai, Mihir M; Gill, Inderbir S; Goh, Alvin C

    2015-11-01

    We demonstrate the construct validity, reliability, and utility of Global Evaluative Assessment of Robotic Skills (GEARS), a clinical assessment tool designed to measure robotic technical skills, in an independent cohort using an in vivo animal training model. Using a cross-sectional observational study design, 47 voluntary participants were categorized as experts (>30 robotic cases completed as primary surgeon) or trainees. The trainee group was further divided into intermediates (≥5 but ≤30 cases) or novices (<5 cases). All participants completed a standardized in vivo robotic task in a porcine model. Task performance was evaluated by two expert robotic surgeons and self-assessed by the participants using the GEARS assessment tool. Kruskal-Wallis test was used to compare the GEARS performance scores to determine construct validity; Spearman's rank correlation measured interobserver reliability; and Cronbach's alpha was used to assess internal consistency. Performance evaluations were completed on nine experts and 38 trainees (14 intermediate, 24 novice). Experts demonstrated superior performance compared to intermediates and novices overall and in all individual domains (p < 0.0001). In comparing intermediates and novices, the overall performance difference trended toward significance (p = 0.0505), while the individual domains of efficiency and autonomy were significantly different between groups (p = 0.0280 and 0.0425, respectively). Interobserver reliability between expert ratings was confirmed with a strong correlation observed (r = 0.857, 95 % CI [0.691, 0.941]). Experts and participant scoring showed less agreement (r = 0.435, 95 % CI [0.121, 0.689] and r = 0.422, 95 % CI [0.081, 0.672]). Internal consistency was excellent for experts and participants (α = 0.96, 0.98, 0.93). In an independent cohort, GEARS was able to differentiate between different robotic skill levels, demonstrating excellent construct validity. As a standardized
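
    Several records in this list report Cronbach's alpha as a measure of internal consistency. The following is a minimal sketch of the standard formula, assuming a complete table of respondent-by-item scores; the function name and the sample data are hypothetical and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha from rows of item scores (one row per respondent)."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: three respondents, two items.
ratings = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(ratings))
```

    Alpha approaches 1 when the items vary together, so that most of the variance in the total score comes from shared variation rather than from the individual items.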

  8. Development and initial validation of the assessment of caregiver experience with neuromuscular disease.

    PubMed

    Matsumoto, Hiroko; Clayton-Krasinski, Debora A; Klinge, Stephen A; Gomez, Jaime A; Booker, Whitney A; Hyman, Joshua E; Roye, David P; Vitale, Michael G

    2011-01-01

    Orthopaedic intervention can have a wide range of functional and psychosocial effects on children with neuromuscular disease (NMD). In the multihandicapped child (Gross Motor Function Classification System IV/V), functional status, pain, psychosocial function, and health-related quality of life also have effects on the families of these children. The purpose of this study is to report the development and initial validation of an outcomes instrument specifically designed to assess the caregiver impact experienced by parents raising severely affected NMD children: the Assessment of Caregiver Experience with Neuromuscular Disease (ACEND). In the first part of this prospective study, 61 children with NMD and their parents were administered a range of earlier validated pediatric health measures. A framework technique was used to select the most appropriate and relevant subset of questions from this large set. Sensitivity analyses guided the development of a master question list measuring caregiver impact, excluding items with low relevance, and modifying unclear questions. In the second part of the study, the ACEND was administered to the caregivers of 46 children with moderate-to-severe NMD. Statistical analyses were conducted to determine validity of the instrument. The resulting ACEND instrument included 2 domains, 7 subdomains, and 41 items. Domain 1, examining physical impact, includes 4 subdomains: feeding/grooming/dressing (6 items), sitting/play (5 items), transfers (5 items), and mobility (7 items). Domain 2, which examines general caregiver impact, included 3 subdomains: time (4 items), emotion (9 items), and finance (5 items). Mean overall relevance rating was 6.21 ± 0.37 and clarity rating was 6.68 ± 0.52 (scale 0 to 7). Multiple floor effects in patients with GMFCS V and ceiling effects in patients with GMFCS III were identified almost exclusively in motor-based items. Virtually no floor or ceiling effects were identified in the time, emotion or finance domains

  9. [Spanish validation of the MacArthur Competence Assessment Tool for Treatment interview to assess patients competence to consent treatment].

    PubMed

    Alvarez Marrodán, Ignacio; Baón Pérez, Beatriz; Navío Acosta, Mercedes; López-Antón, Raul; Lobo Escolar, Elena; Ventura Faci, Tirso

    2014-09-09

    To validate the MacArthur Competence Assessment Tool for Treatment (MacCAT-T) Spanish version, which assesses the mental capacity of patients to consent to treatment, by examining 4 areas (Understanding, Appreciation, Reasoning and Expressing a choice). 160 subjects (80 Internal Medicine inpatients, 40 Psychiatric inpatients and 40 healthy controls). MacCAT-T, Mini-Mental Status Examination (MMSE). Feasibility study, reliability and validity calculations (against the gold standard of a clinical expert). Mean duration of the MacCAT-T interview was 18 min. Inter-rater reliability: Intraclass correlation coefficient for Understanding=0.98, Appreciation=0.97, Reasoning=0.98, Expressing a choice=0.91. Internal consistency (Cronbach's alpha): Understanding=0.87, Appreciation=0.76, Reasoning=0.86. Patients considered to be incapable (gold standard) scored lower in all the MacCAT-T areas. Poor performance on the MacCAT-T was related to cognitive impairment assessed by MMSE. The Spanish version of the MacCAT-T is feasible, reliable, and valid for assessing the capacity of patients to consent to treatment. Copyright © 2013 Elsevier España, S.L. All rights reserved.

  10. Validation of balance-quality assessment using a modified bathroom scale.

    PubMed

    Hewson, D J; Duchêne, J; Hogrel, J-Y

    2015-02-01

    The balance quality tester (BQT), based on a standard electronic bathroom scale, has been developed in order to assess balance quality. The BQT includes automatic detection of the person to be tested by means of an infrared detector and Bluetooth communication capability for remote assessment when linked to a long-distance communication device such as a mobile phone. The BQT was compared to a standard force plate for validity and agreement. The two most widely reported parameters in balance literature, the area of the centre of pressure (COP) displacement and the velocity of the COP displacement, were compared for 12 subjects, each of whom was tested on ten occasions on each of 2 days. No significant differences were observed between the BQT and the force plate for either of the two parameters. In addition, a high level of agreement was observed between both devices. The BQT is a valid device for remote assessment of balance quality, and could provide a useful tool for long-term monitoring of people with balance problems, particularly during home monitoring.

  11. Evidence on existing caries risk assessment systems: are they predictive of future caries?

    PubMed

    Tellez, M; Gomez, J; Pretty, I; Ellwood, R; Ismail, A I

    2013-02-01

    To critically appraise evidence for the prediction of caries using four caries risk assessment (CRA) systems/guidelines (Cariogram, Caries Management by Risk Assessment (CAMBRA), American Dental Association (ADA), and American Academy of Pediatric Dentistry (AAPD)). This review focused on prospective cohort studies or randomized controlled trials. A systematic search strategy was developed to locate papers published in Medline Ovid and Cochrane databases. The search identified 539 scientific reports, and after title and abstract review, 137 were selected for full review and 14 met the following inclusion criteria: (i) used as validating criterion caries incidence/increment, (ii) involved human subjects and natural carious lesions, and (iii) published in peer-reviewed journals. In addition, papers were excluded if they met one or more of the following criteria: (i) incomplete description of sample selection, outcomes, or small sample size and (ii) not meeting the criteria for best evidence under the prognosis category of the Oxford Centre for Evidence-Based Medicine. There are wide variations among the systems in terms of definitions of caries risk categories, type and number of risk factors/markers, and disease indicators. The Cariogram combined sensitivity and specificity for predicting caries in permanent dentition ranges from 110 to 139 and is the only system for which prospective studies have been conducted to assess its validity. The Cariogram had limited prediction utility in preschool children, and a moderate to good performance for sorting out elderly individuals into caries risk groups. One retrospective analysis on CAMBRA's CRA reported higher incidence of cavitated lesions among those assessed as extreme-risk patients when compared with those at low risk. The evidence on the validity for existing systems for CRA is limited. It is unknown if the identification of high-risk individuals can lead to more effective long-term patient management that prevents
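
    The "combined sensitivity and specificity" quoted above is presumably the sum of the two percentages, where a sum of 100 corresponds to a test no better than chance and 200 to a perfect test. The abstract does not spell this out, so the sketch below is an assumption about how such a figure is usually derived; the function name and the counts are hypothetical.

```python
def combined_sens_spec(tp, fn, tn, fp):
    """Return sensitivity, specificity and their sum, all in percent."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy case")
    sensitivity = 100.0 * tp / (tp + fn)  # share of true cases flagged as high risk
    specificity = 100.0 * tn / (tn + fp)  # share of non-cases flagged as low risk
    return sensitivity, specificity, sensitivity + specificity


# Hypothetical counts from a prospective cohort.
sens, spec, combined = combined_sens_spec(tp=80, fn=20, tn=60, fp=40)
print(sens, spec, combined)
```

    Read this way, the reported range of 110 to 139 sits well below the maximum of 200, which fits the review's conclusion that the evidence for these systems is limited.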

  12. Development and Validation of the Conceptual Assessment of Natural Selection (CANS)

    ERIC Educational Resources Information Center

    Kalinowski, Steven T.; Leonard, Mary J.; Taper, Mark L.

    2016-01-01

    We developed and validated the Conceptual Assessment of Natural Selection (CANS), a multiple-choice test designed to assess how well college students understand the central principles of natural selection. The expert panel that reviewed the CANS concluded its questions were relevant to natural selection and generally did a good job sampling the…

  13. Reliability and Validity of Rubrics for Assessment through Writing

    ERIC Educational Resources Information Center

    Rezaei, Ali Reza; Lovorn, Michael

    2010-01-01

    This experimental project investigated the reliability and validity of rubrics in assessment of students' written responses to a social science "writing prompt". The participants were asked to grade one of the two samples of writing assuming it was written by a graduate student. In fact both samples were prepared by the authors. The…

  14. Examinee Noneffort and the Validity of Program Assessment Results

    ERIC Educational Resources Information Center

    Wise, Steven L.; DeMars, Christine E.

    2010-01-01

    Educational program assessment studies often use data from low-stakes tests to provide evidence of program quality. The validity of scores from such tests, however, is potentially threatened by examinee noneffort. This study investigated the extent to which one type of noneffort--rapid-guessing behavior--distorted the results from three types of…

  15. A Validation Study of Early Adolescents' Pubertal Self-Assessments

    ERIC Educational Resources Information Center

    Schmitz, Katharine E.; Hovell, Melbourne F.; Nichols, Jeanne F.; Irvin, Veronica L.; Keating, Kristen; Simon, Gayle M.; Gehrman, Christine; Jones, Kenneth Lee

    2004-01-01

    This study aimed to determine whether self-assessed puberty is sufficiently reliable and valid to substitute for physician examination when feasibility of physician examination is low (e.g., behavioral research). Adolescents (convenience sample N = 178 endocrinology patients and N = 125 from educational trial; mean age 12.7 and 11.3 years,…

  16. The Validity and Reliability of a Performance Assessment Procedure in Ice Hockey

    ERIC Educational Resources Information Center

    Nadeau, Luc; Richard, Jean-Francois; Godbout, Paul

    2008-01-01

    Background: Coaches and physical educators must obtain valid data relating to the contribution of each of their players in order to assess their level of performance in team sport competition. This information must also be collected and used in real game situations to be more valid. Developed initially for a physical education class context, the…

  17. Skill assessment of the coupled physical-biogeochemical operational Mediterranean Forecasting System

    NASA Astrophysics Data System (ADS)

    Cossarini, Gianpiero; Clementi, Emanuela; Salon, Stefano; Grandi, Alessandro; Bolzon, Giorgio; Solidoro, Cosimo

    2016-04-01

    The Mediterranean Monitoring and Forecasting Centre (Med-MFC) is one of the regional production centres of the European Marine Environment Monitoring Service (CMEMS-Copernicus). Med-MFC operatively manages a suite of numerical model systems (3DVAR-NEMO-WW3 and 3DVAR-OGSTM-BFM) that provides gridded datasets of physical and biogeochemical variables for the Mediterranean marine environment with a horizontal resolution of about 6.5 km. At the present stage, the operational Med-MFC produces ten-day forecasts: daily for physical parameters and bi-weekly for biogeochemical variables. The validation of the coupled model system and the estimate of the accuracy of model products are key issues to ensure reliable information to the users and the downstream services. Product quality activities at Med-MFC consist of two levels of validation and skill analysis procedures. Pre-operational qualification activities focus on testing the improvement of the quality of a new release of the model system and rely on past simulations and historical data. Then, near real time (NRT) validation activities aim at routine, online skill assessment of the model forecast and rely on the available NRT observations. The Med-MFC validation framework uses both independent data (i.e. Bio-Argo float data, in-situ mooring and vessel data of oxygen, nutrients and chlorophyll, moored buoys, tide-gauges and ADCP of temperature, salinity, sea level and velocity) and semi-independent data (i.e. data already used for assimilation, such as satellite chlorophyll, satellite SLA and SST and in situ vertical profiles of temperature and salinity from XBT, Argo and Gliders). We give evidence that different variables (e.g. CMEMS-products) can be validated at different levels (i.e. at the forecast level or at the level of model consistency) and at different spatial and temporal scales. The fundamental physical parameters temperature, salinity and sea level are routinely validated on a daily, weekly and quarterly basis

  18. APPROACHES TO ASSESSING THE VALIDITY OF A FUNCTIONAL OBSERVATIONAL BATTERY

    EPA Science Inventory

    With the growing importance of neurobehavioral assessments at the preliminary stage of chemical testing, it is critical that the screening procedures utilized be valid indicators of neurobehavioral dysfunction in addition to being sensitive, specific, and reliable. Efforts in this...

  19. Validation of the German version of the Clinical Assessment Interview for Negative Symptoms (CAINS).

    PubMed

    Engel, Maike; Fritzsche, Anja; Lincoln, Tania Marie

    2014-12-15

    Validated assessment instruments could contribute to a better understanding and assessment of negative symptoms and advance treatment research. The aim of this study was to examine the psychometric properties of a German version of the Clinical Assessment Interview for Negative Symptoms (CAINS). In- and outpatients (N=53) with schizophrenia or schizoaffective disorder were assessed with standardized interviews and questionnaires on negative and positive symptoms and general psychopathology in schizophrenia, depression, the ability to experience anticipatory and consummatory pleasure, and global functioning. The results indicated good psychometric properties, high internal consistency and promising inter-rater agreement for the German version of the CAINS. The two-factor solution of the original version of the CAINS was confirmed, indicating good construct validity. Convergent validity was supported by significant correlations between the CAINS subscales with the negative symptom scale of the Positive and Negative Syndrome Scale, and with consummatory pleasure. The CAINS also exhibited discriminant validity indicated by its non-significant correlations with positive symptoms, general psychopathology and depression that are in line with the findings for the original version of the CAINS. In addition, the CAINS correlated moderately with global functioning. The German version of the CAINS appears to be a valid and suitable diagnostic tool for measuring negative symptoms in schizophrenia. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  20. Is the Veterans Specific Activity Questionnaire Valid to Assess Older Adults Aerobic Fitness?

    PubMed

    de Carvalho Bastone, Alessandra; de Souza Moreira, Bruno; Teixeira, Claudine Patrícia; Dias, João Marcos Domingues; Dias, Rosângela Corrêa

    2016-01-01

    Aerobic fitness in older adults is related to health status, incident disability, nursing home admission, and all-cause mortality. The most accurate quantification of aerobic fitness, expressed as peak oxygen consumption in mL·kg⁻¹·min⁻¹, is the cardiorespiratory exercise test; however, it is not feasible in all settings and may pose a risk to patients. The Veterans Specific Activity Questionnaire (VSAQ) is a 13-item self-administered symptom questionnaire that estimates aerobic fitness expressed in metabolic equivalents (METs) and has been validated in cardiovascular patients. The purpose of this study was to assess the validity and reliability of the VSAQ in older adults without specific health conditions. A methodological study with a cross-sectional design was conducted with 28 older adults (66-86 years). The VSAQ was administered on 3 occasions by 2 evaluators. Aerobic capacity in METs as measured by the VSAQ was compared with the METs found in an incremental shuttle walk test (ISWT) performed with a portable metabolic measurement system and with accelerometer data. The validity of the VSAQ was found to be moderate-to-good when compared with the METs and distance measured by the ISWT and with the moderate activity per day and steps per day obtained by accelerometry. The Bland-Altman graph analysis showed no values outside the limits of agreement, suggesting good precision between the METs estimated by questionnaire and the METs measured by the ISWT. Also, the intrarater and interrater reliabilities of the instrument were good. The results showed that the VSAQ is a valuable tool to assess the aerobic fitness of older adults.
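
    The Bland-Altman analysis mentioned above compares two measurement methods through the differences between paired readings. The sketch below shows the conventional calculation of the bias and the 95% limits of agreement (mean difference plus or minus 1.96 standard deviations). It is an illustration under that standard convention, not the authors' code, and the MET values are invented.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)  # systematic difference between the methods
    spread = 1.96 * stdev(differences)  # half-width of the 95% limits
    return bias, bias - spread, bias + spread


# Hypothetical METs from a questionnaire and from a walk test.
questionnaire = [5.0, 6.0, 7.0, 8.0]
walk_test = [4.0, 6.0, 6.0, 8.0]
bias, lower, upper = bland_altman_limits(questionnaire, walk_test)
print(bias, lower, upper)
```

    A paired difference that falls outside the interval from the lower to the upper limit would be flagged as poor agreement for that participant.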

  1. Additional Support for the Information Systems Analyst Exam as a Valid Program Assessment Tool

    ERIC Educational Resources Information Center

    Carpenter, Donald A.; Snyder, Johnny; Slauson, Gayla Jo; Bridge, Morgan K.

    2011-01-01

    This paper presents a statistical analysis to support the notion that the Information Systems Analyst (ISA) exam can be used as a program assessment tool in addition to measuring student performance. It compares ISA exam scores earned by students in one particular Computer Information Systems program with scores earned by the same students on the…

  2. 'Mechanical restraint-confounders, risk, alliance score': testing the clinical validity of a new risk assessment instrument.

    PubMed

    Deichmann Nielsen, Lea; Bech, Per; Hounsgaard, Lise; Alkier Gildberg, Frederik

    2017-08-01

    Unstructured risk assessment, as well as confounders (underlying reasons for the patient's risk behaviour and alliance), risk behaviour, and parameters of alliance, have been identified as factors that prolong the duration of mechanical restraint among forensic mental health inpatients. To clinically validate a new, structured short-term risk assessment instrument called the Mechanical Restraint-Confounders, Risk, Alliance Score (MR-CRAS), with the intended purpose of supporting the clinicians' observation and assessment of the patient's readiness to be released from mechanical restraint. The content and layout of MR-CRAS and its user manual were evaluated using face validation by forensic mental health clinicians, content validation by an expert panel, and pilot testing within two closed forensic mental health inpatient units. The three sub-scales (Confounders, Risk, and a parameter of Alliance) showed excellent content validity. The clinical validations also showed that MR-CRAS was perceived and experienced as a comprehensible, relevant, comprehensive, and useable risk assessment instrument. MR-CRAS contains 18 clinically valid items, and the instrument can be used to support the clinical decision-making regarding the possibility of releasing the patient from mechanical restraint. The present three studies have clinically validated a short MR-CRAS scale that is currently being psychometrically tested in a larger study.

  3. Improving the quality of discrete-choice experiments in health: how can we assess validity and reliability?

    PubMed

    Janssen, Ellen M; Marshall, Deborah A; Hauber, A Brett; Bridges, John F P

    2017-12-01

    The recent endorsement of discrete-choice experiments (DCEs) and other stated-preference methods by regulatory and health technology assessment (HTA) agencies has placed a greater focus on demonstrating the validity and reliability of preference results. Areas covered: We present a practical overview of tests of validity and reliability that have been applied in the health DCE literature and explore other study qualities of DCEs. From the published literature, we identify a variety of methods to assess the validity and reliability of DCEs. We conceptualize these methods to create a conceptual model with four domains: measurement validity, measurement reliability, choice validity, and choice reliability. Each domain consists of three categories that can be assessed using one to four procedures (for a total of 24 tests). We present how these tests have been applied in the literature and direct readers to applications of these tests in the health DCE literature. Based on a stakeholder engagement exercise, we consider the importance of study characteristics beyond traditional concepts of validity and reliability. Expert commentary: We discuss study design considerations to assess the validity and reliability of a DCE, consider limitations to the current application of tests, and discuss future work to consider the quality of DCEs in healthcare.

  4. Translation into Brazilian Portuguese and validation of the "Quantitative Global Scarring Grading System for Post-acne Scarring" *

    PubMed Central

    Cachafeiro, Thais Hofmann; Escobar, Gabriela Fortes; Maldonado, Gabriela; Cestari, Tania Ferreira

    2014-01-01

    The "Quantitative Global Scarring Grading System for Postacne Scarring" was developed in English for acne scar grading, based on the number and severity of each type of scar. The aims of this study were to translate this scale into Brazilian Portuguese and verify its reliability and validity. The study followed five steps: Translation, Expert Panel, Back Translation, Approval of authors and Validation. The translated scale showed high internal consistency and high test-retest reliability, confirming its reproducibility. Therefore, it has been validated for our population and can be recommended as a reliable instrument to assess acne scarring. PMID:25184939

  5. Reliability and validity of a smartphone pulse rate application for the assessment of resting and elevated pulse rate.

    PubMed

    Mitchell, Katy; Graff, Megan; Hedt, Corbin; Simmons, James

    2016-08-01

    Purpose/hypothesis: This study was designed to investigate the test-retest reliability, concurrent validity, and the standard error of measurement (SEm) of a pulse rate assessment application (Azumio®'s Instant Heart Rate) on both Android® and iOS® (iPhone operating system) smartphones as compared to a FT7 Polar® Heart Rate monitor. Number of subjects: 111. Resting (sitting) pulse rate was assessed twice and then the participants were asked to complete a 1-min standing step test and then immediately re-assessed. The smartphone assessors were blinded to their measurements. Test-retest reliability (intraclass correlation coefficient [ICC 2,1] and 95% confidence interval) for the three tools at rest (time 1/time 2): iOS® (0.76 [0.67-0.83]); Polar® (0.84 [0.78-0.89]); and Android® (0.82 [0.75-0.88]). Concurrent validity at rest time 2 (ICC 2,1) with the Polar® device: iOS® (0.92 [0.88-0.94]) and Android® (0.95 [0.92-0.96]). Concurrent validity post-exercise (time 3) (ICC) with the Polar® device: iOS® (0.90 [0.86-0.93]) and Android® (0.94 [0.91-0.96]). The SEm values for the three devices at rest: iOS® (5.77 beats per minute [BPM]), Polar® (4.56 BPM) and Android® (4.96 BPM). The Android®, iOS®, and Polar® devices showed acceptable test-retest reliability at rest and post-exercise. Both the smartphone platforms demonstrated concurrent validity with the Polar® at rest and post-exercise. The Azumio® Instant Heart Rate application when used by either platform appears to be a reliable and valid tool to assess pulse rate in healthy individuals.
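
    The standard error of measurement is commonly derived from a reliability coefficient as SEm = SD × √(1 − ICC). The abstract does not state which formula the authors applied, so the sketch below only illustrates that common definition; the function name and the standard deviation used in the example are assumptions.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEm = SD * sqrt(1 - reliability), with reliability such as an ICC."""
    if sd < 0 or not 0.0 <= reliability <= 1.0:
        raise ValueError("sd must be non-negative and reliability within [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical between-subject SD of 10 BPM combined with an ICC of 0.84.
print(standard_error_of_measurement(10.0, 0.84))
```

    Under this definition a lower ICC yields a larger SEm for the same spread of scores, which matches the pattern in the abstract, where the tool with the lowest test-retest ICC also has the largest SEm.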

  6. Brief International Cognitive Assessment for MS (BICAMS): international standards for validation.

    PubMed

    Benedict, Ralph H B; Amato, Maria Pia; Boringa, Jan; Brochet, Bruno; Foley, Fred; Fredrikson, Stan; Hamalainen, Paivi; Hartung, Hans; Krupp, Lauren; Penner, Iris; Reder, Anthony T; Langdon, Dawn

    2012-07-16

    An international expert consensus committee recently recommended a brief battery of tests for cognitive evaluation in multiple sclerosis. The Brief International Cognitive Assessment for MS (BICAMS) battery includes tests of mental processing speed and memory. Recognizing that resources for validation will vary internationally, the committee identified validation priorities, to facilitate international acceptance of BICAMS. Practical matters pertaining to implementation across different languages and countries were discussed. Five steps to achieve optimal psychometric validation were proposed. In Step 1, test stimuli should be standardized for the target culture or language under consideration. In Step 2, examiner instructions must be standardized and translated, including all information from manuals necessary for administration and interpretation. In Step 3, samples of at least 65 healthy persons should be studied for normalization, matched to patients on demographics such as age, gender and education. The objective of Step 4 is test-retest reliability, which can be investigated in a small sample of MS and/or healthy volunteers over 1-3 weeks. Finally, in Step 5, criterion validity should be established by comparing MS and healthy controls. At this time, preliminary studies are underway in a number of countries as we move forward with this international assessment tool for cognition in MS.

  7. Development and validation of a fatigue assessment scale for U.S. construction workers.

    PubMed

    Zhang, Mingzong; Sparer, Emily H; Murphy, Lauren A; Dennerlein, Jack T; Fang, Dongping; Katz, Jeffrey N; Caban-Martinez, Alberto J

    2015-02-01

    To develop a fatigue assessment scale and test its reliability and validity for commercial construction workers. Using a two-phased approach, we first identified items (first phase) for the development of a Fatigue Assessment Scale for Construction Workers (FASCW) through review of existing scales in the scientific literature, key informant interviews (n = 11) and focus groups (three groups with six workers each) with construction workers. The second phase included assessment for the reliability, validity, and sensitivity of the new scale using a repeated-measures study design with a convenience sample of construction workers (n = 144). Phase one resulted in a 16-item preliminary scale that after factor analysis yielded a final 10-item scale with two sub-scales ("Lethargy" and "Bodily Ailment"). During phase two, the FASCW and its subscales demonstrated satisfactory internal consistency (alpha coefficients were FASCW [0.91], Lethargy [0.86] and Bodily Ailment [0.84]) and acceptable test-retest reliability (Pearson Correlations Coefficients: 0.59-0.68; Intraclass Correlation Coefficients: 0.74-0.80). Correlation analysis substantiated concurrent and convergent validity. A discriminant analysis demonstrated that the FASCW differentiated between groups with arthritis status and different work hours. The 10-item FASCW with good reliability and validity is an effective tool for assessing the severity of fatigue among construction workers. © 2015 Wiley Periodicals, Inc.

  8. Development and Validation of a Fatigue Assessment Scale for U.S. Construction Workers

    PubMed Central

    Zhang, Mingzong; Sparer, Emily H.; Murphy, Lauren A.; Dennerlein, Jack T.; Fang, Dongping; Katz, Jeffrey N.; Caban-Martinez, Alberto J.

    2015-01-01

    Objective: To develop a fatigue assessment scale and test its reliability and validity for commercial construction workers. Methods: Using a two-phased approach, we first identified items for the development of a Fatigue Assessment Scale for Construction Workers (FASCW) through review of existing scales in the scientific literature, key informant interviews (n=11) and focus groups (3 groups with 6 workers each) with construction workers. The second phase included assessment for the reliability, validity and sensitivity of the new scale using a repeated-measures study design with a convenience sample of construction workers (n=144). Results: Phase one resulted in a 16-item preliminary scale that after factor analysis yielded a final 10-item scale with two sub-scales (“Lethargy” and “Bodily Ailment”). During phase two, the FASCW and its subscales demonstrated satisfactory internal consistency (alpha coefficients were FASCW (0.91), Lethargy (0.86) and Bodily Ailment (0.84)) and acceptable test-retest reliability (Pearson Correlations Coefficients: 0.59–0.68; Intraclass Correlation Coefficients: 0.74–0.80). Correlation analysis substantiated concurrent and convergent validity. A discriminant analysis demonstrated that the FASCW differentiated between groups with arthritis status and different work hours. Conclusions: The 10-item FASCW with good reliability and validity is an effective tool for assessing the severity of fatigue among construction workers. PMID:25603944

  9. Brief Report: Improving the Validity of Assessments of Adolescents' Feelings of Privacy Invasion

    ERIC Educational Resources Information Center

    Laird, Robert D.; Marrero, Matthew D.; Melching, Jessica; Kuhn, Emily S.

    2013-01-01

    Studies of privacy invasion have relied on measures that combine items assessing adolescents' feelings of privacy invasion with items assessing parents' monitoring behaviors. Removing items assessing parents' monitoring behaviors may improve the validity of assessments of privacy invasion. Data were collected from 163 adolescents (M age 13 years,…

  10. Development and validation of an exercise performance support system for people with lower extremity impairment.

    PubMed

    Minor, M A; Reid, J C; Griffin, J Z; Pittman, C B; Patrick, T B; Cutts, J H

    1998-02-01

    To identify innovative strategies to support appropriate, self-directed exercise that increase physical activity levels of people with arthritis. This article reports on one interactive, multimedia exercise performance support system (PSS) for people with lower extremity impairments in strength or flexibility. An interdisciplinary team developed the PSS using self-report of lower extremity musculoskeletal impairments (flexibility and strength) to produce an individualized exercise program with video and print educational materials. Initial evaluation has investigated the validity and reliability of program assessments and recommendations. PSS self-report and professional assessments were similar, with more impairments indicated by self-report. PSS exercise recommendations were similar to those made by 3 expert physical therapists using the same exercise database. Results of PSS impairment assessments were stable over a 1-week period. PSS exercise recommendations appear to be reliable and a valid reflection of current exercise knowledge in rheumatology. Furthermore, users were able to complete the computer-based program with minimal assistance and reported it to be enjoyable and informative.

  11. Cross-cultural adaptation and validation of Systemic Lupus Erythematosus Quality of Life questionnaire into Arabic.

    PubMed

    Aziz, M M; Galal, M A A; Elzohri, M H; El-Nouby, F; Leong, K P

    2018-04-01

    Systemic lupus erythematosus (SLE) is a chronic autoimmune disease which affects all aspects of quality of life (QoL) of the patients. Comprehensive patient assessment should include QoL measures in addition to the objective clinical measures of the disease. There is no specific Arabic instrument for assessment of QoL of SLE patients. The objective of this study was to translate and cross-culturally adapt the SLEQOL questionnaire into Arabic and test its reliability and validity. The SLEQOL questionnaire was translated into Arabic based on the Guidelines for Translation and Cross-cultural Adaptation into other languages. Reliability was assessed by interviewing patients three times: two interviews on the same day by different interviewers and the third interview 14 days later by one of the first interviewers. Validity was assessed by correlating SLEQOL scores of 91 patients with 36-item Short Form Health Survey (SF-36) scores and clinical parameters of the patients. We found that the Arabic version of SLEQOL has a Cronbach's alpha of 0.936 and interobserver and intraobserver correlation coefficients of 0.809 and 0.886, respectively. Strong correlations were also found between SLEQOL scores and SF-36 Physical and Mental Component summaries. In conclusion, the Arabic version of SLEQOL is a reliable and valid instrument for measuring QoL of Egyptian SLE patients.
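
    The internal-consistency statistic reported above, Cronbach's alpha, can be illustrated with a minimal sketch. The function name and the toy response matrix are hypothetical and are not taken from the study; the sketch assumes complete data (no missing item scores) and uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Hypothetical helper for illustration; assumes every respondent
    answered every item.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: three respondents, two items (illustrative only).
toy = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy), 3))
```

    Values close to 1 indicate that the items vary together; the toy data here are far too small to say anything about a real questionnaire.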

  12. Assessing Physical Activity in Children with Asthma: Convergent Validity between Accelerometer and Electronic Diary Data

    ERIC Educational Resources Information Center

    Floro, Josh N.; Dunton, Genevieve F.; Delfino, Ralph J.

    2009-01-01

    Convergent validity of accelerometer and electronic diary physical activity data was assessed in children with asthma. Sixty-two participants, ages 9-18 years, wore an accelerometer and reported their physical activity level in quarter-hour segments every 2 hr using the Ambulatory Diary Assessment (ADA). Moderate validity was found between…

  13. Evaluation of Two Observational Assessment Systems for Children's Development and Learning

    ERIC Educational Resources Information Center

    Kim, Do-Hong; Smith, JaneDiane

    2010-01-01

    This study provided preliminary evidence for the reliability and validity of "Teaching Strategies GOLD", a recently developed observational system for assessing young children's development and learning. The measurement properties of "Teaching Strategies GOLD" were compared with those of an older instrument, "The Creative…

  14. Measuring Social Relationships in Different Social Systems: The Construction and Validation of the Evaluation of Social Systems (EVOS) Scale

    PubMed Central

    Aguilar-Raab, Corina; Grevenstein, Dennis; Schweitzer, Jochen

    2015-01-01

    Social interactions have gained increasing importance, both as an outcome and as a possible mediator in psychotherapy research. Still, there is a lack of adequate measures capturing relational aspects in multi-person settings. We present a new measure to assess relevant dimensions of quality of relationships and collective efficacy regarding interpersonal interactions in diverse personal and professional social systems including couple partnerships, families, and working teams: the EVOS. Theoretical dimensions were derived from theories of systemic family therapy and organizational psychology. The study was divided in three parts: In Study 1 (N = 537), a short 9-item scale with two interrelated factors was constructed on the basis of exploratory factor analysis. Quality of relationship and collective efficacy emerged as the most relevant dimensions for the quality of social systems. Study 2 (N = 558) confirmed the measurement model using confirmatory factor analysis and established validity with measures of family functioning, life satisfaction, and working team efficacy. Measurement invariance was assessed to ensure that EVOS captures the same latent construct in all social contexts. In Study 3 (N = 317), an English language adaptation was developed, which again confirmed the original measurement model. The EVOS is a theory-based, economic, reliable, and valid measure that covers important aspects of social relationships, applicable for different social systems. It is the first instrument of its kind and an important addition to existing measures of social relationships and related outcome measures in therapeutic and other counseling settings involving multiple persons. PMID:26200357

  15. Validation of an in vitro exposure system for toxicity assessment of air-delivered nanomaterials

    PubMed Central

    Kim, Jong Sung; Peters, Thomas M.; O’Shaughnessy, Patrick T.; Adamcakova-Dodd, Andrea; Thorne, Peter S.

    2013-01-01

    To overcome the limitations of in vitro exposure of submerged lung cells to nanoparticles (NPs), we validated an integrated low flow system capable of generating and depositing airborne NPs directly onto cells at an air–liquid interface (ALI). The in vitro exposure system was shown to provide uniform and controlled dosing of particles with 70.3% efficiency to epithelial cells grown on transwells. This system delivered a continuous airborne exposure of NPs to lung cells without loss of cell viability in repeated 4 h exposure periods. We sequentially exposed cells to air-delivered copper (Cu) NPs in vitro to compare toxicity results to our prior in vivo inhalation studies. The evaluation of cellular dosimetry indicated that a large amount of Cu was taken up, dissolved and released into the basolateral medium (62% of total mass). Exposure to Cu NPs decreased cell viability to 73% (p < 0.01) and significantly (p < 0.05) elevated levels of lactate dehydrogenase, intracellular reactive oxygen species and interleukin-8 that mirrored our findings from subacute in vivo inhalation studies in mice. Our results show that this exposure system is useful for screening of NP toxicity in a manner that represents cellular responses of the pulmonary epithelium in vivo. PMID:22981796

  16. Assessment Methodology for Process Validation Lifecycle Stage 3A.

    PubMed

    Sayeed-Desta, Naheed; Pazhayattil, Ajay Babu; Collins, Jordan; Chen, Shu; Ingram, Marzena; Spes, Jana

    2017-07-01

    The paper introduces evaluation methodologies and associated statistical approaches for process validation lifecycle Stage 3A. The assessment tools proposed can be applied to newly developed and launched small molecule as well as bio-pharma products, where substantial process and product knowledge has been gathered. The following elements may be included in Stage 3A: number of 3A batch determination; evaluation of critical material attributes, critical process parameters, critical quality attributes; in vivo in vitro correlation; estimation of inherent process variability (IPV) and PaCS index; process capability and quality dashboard (PCQd); and enhanced control strategy. US FDA guidance on Process Validation: General Principles and Practices, January 2011 encourages applying previous credible experience with suitably similar products and processes. A complete Stage 3A evaluation is a valuable resource for product development and future risk mitigation of similar products and processes. Elements of 3A assessment were developed to address industry and regulatory guidance requirements. The conclusions made provide sufficient information to make a scientific and risk-based decision on product robustness.

  17. A Systems-Level Approach to Building Sustainable Assessment Cultures: Moderation, Quality Task Design and Dependability of Judgement

    ERIC Educational Resources Information Center

    Colbert, Peta; Wyatt-Smith, Claire; Klenowski, Val

    2012-01-01

    This article considers the conditions that are necessary at system and local levels for teacher assessment to be valid, reliable and rigorous. With sustainable assessment cultures as a goal, the article examines how education systems can support local-level efforts for quality learning and dependable teacher assessment. This is achieved through…

  18. Issues in cross-cultural validity: example from the adaptation, reliability, and validity testing of a Turkish version of the Stanford Health Assessment Questionnaire.

    PubMed

    Küçükdeveci, Ayse A; Sahin, Hülya; Ataman, Sebnem; Griffiths, Bridget; Tennant, Alan

    2004-02-15

    Guidelines have been established for cross-cultural adaptation of outcome measures. However, invariance across cultures must also be demonstrated through analysis of Differential Item Functioning (DIF). This is tested in the context of a Turkish adaptation of the Health Assessment Questionnaire (HAQ). Internal construct validity of the adapted HAQ is assessed by Rasch analysis; reliability, by internal consistency and the intraclass correlation coefficient; external construct validity, by association with impairments and American College of Rheumatology functional stages. Cross-cultural validity is tested through DIF by comparison with data from the UK version of the HAQ. The adapted version of the HAQ demonstrated good internal construct validity through fit of the data to the Rasch model (mean item fit 0.205; SD 0.998). Reliability was excellent (alpha = 0.97) and external construct validity was confirmed by expected associations. DIF for culture was found in only 1 item. Cross-cultural validity was found to be sufficient for use in international studies between the UK and Turkey. Future adaptation of instruments should include analysis of DIF at the field testing stage in the adaptation process.

  19. Validation of a novel duplex ultrasound objective structured assessment of technical skills (DUOSATS) for arterial stenosis detection.

    PubMed

    Jaffer, U; Singh, P; Pandey, V A; Aslam, M; Standfield, N J

    2014-01-01

    Duplex ultrasound facilitates bedside diagnosis and hence timely patient care. Its uptake has been hampered by training and accreditation issues. We have developed an assessment tool for Duplex arterial stenosis measurement for both simulator and patient based training. A novel assessment tool, duplex ultrasound assessment of technical skills, was developed. A modified duplex ultrasound assessment of technical skills was used for simulator training. Novice, intermediate experience and expert users of duplex ultrasound were invited to participate. Participants viewed an instructional video and were allowed ample time to familiarize themselves with the equipment. Participants' attempts were recorded and independently assessed by four experts using the modified duplex ultrasound assessment of technical skills. 'Global' assessment was also done on a four point Likert scale. Content, construct and concurrent validity as well as reliability were evaluated. Content and construct validity as well as reliability were demonstrated. The simulator received a good satisfaction rating from participants: median 4; range 3-5. Receiver operating characteristic analysis established that cut points of 22/34 and 25/40 were most appropriate for simulator and patient based assessment, respectively. We have validated a novel assessment tool for duplex arterial stenosis detection. Further work is underway to establish transference validity of simulator training to improved skill in scanning patients. We have developed and validated duplex ultrasound assessment of technical skills for simulator training.
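
    A cut point derived from receiver operating characteristic analysis can be sketched as follows. The abstract does not state which criterion was used to pick the cut points; this sketch assumes Youden's J (sensitivity + specificity - 1), one common choice. The function name and the scores are hypothetical.

```python
def youden_cut_point(scores, is_competent):
    """Return (cut, J) maximising Youden's J = sensitivity + specificity - 1.

    A candidate is classed as competent when score >= cut.
    Hypothetical helper; Youden's J is an assumed criterion.
    """
    pos = [s for s, y in zip(scores, is_competent) if y]
    neg = [s for s, y in zip(scores, is_competent) if not y]
    if not pos or not neg:
        raise ValueError("need at least one case in each group")
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        sensitivity = sum(s >= cut for s in pos) / len(pos)
        specificity = sum(s < cut for s in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:  # ties keep the lowest qualifying cut
            best_cut, best_j = cut, j
    return best_cut, best_j


# Illustrative scores only: three novices followed by three experts.
scores = [10, 15, 20, 24, 26, 30]
labels = [False, False, False, True, True, True]
print(youden_cut_point(scores, labels))
```

    With real data the groups overlap, so the chosen cut trades sensitivity against specificity rather than separating the groups perfectly.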

  20. Experimental Validation of a Closed Brayton Cycle System Transient Simulation

    NASA Technical Reports Server (NTRS)

    Johnson, Paul K.; Hervol, David S.

    2006-01-01

    The Brayton Power Conversion Unit (BPCU) located at NASA Glenn Research Center (GRC) in Cleveland, Ohio was used to validate the results of a computational code known as Closed Cycle System Simulation (CCSS). Conversion system thermal transient behavior was the focus of this validation. The BPCU was operated at various steady state points and then subjected to transient changes involving shaft rotational speed and thermal energy input. These conditions were then duplicated in CCSS. Validation of the CCSS BPCU model provides confidence in developing future Brayton power system performance predictions, and helps to guide high power Brayton technology development.

  1. Development and Validation of the Personality Assessment Questionnaire: Test Manual.

    ERIC Educational Resources Information Center

    Rohner, Ronald P.; And Others

    Data are presented evaluating the validity and reliability of the Personality Assessment Questionnaire (PAQ), a self-report questionnaire designed to elicit respondents' perceptions of themselves with respect to seven personality and behavioral dispositions: hostility and aggression, dependence, self-esteem, self-adequacy, emotional…

  2. Validity and reliability of the robotic objective structured assessment of technical skills

    PubMed Central

    Siddiqui, Nazema Y.; Galloway, Michael L.; Geller, Elizabeth J.; Green, Isabel C.; Hur, Hye-Chun; Langston, Kyle; Pitter, Michael C.; Tarr, Megan E.; Martino, Martin A.

    2015-01-01

    Objective: Objective structured assessments of technical skills (OSATS) have been developed to measure the skill of surgical trainees. Our aim was to develop an OSATS specifically for trainees learning robotic surgery. Study Design: This is a multi-institutional study in eight academic training programs. We created an assessment form to evaluate robotic surgical skill through five inanimate exercises. Obstetrics/gynecology, general surgery, and urology residents, fellows, and faculty completed five robotic exercises on a standard training model. Study sessions were recorded and randomly assigned to three blinded judges who scored performance using the assessment form. Construct validity was evaluated by comparing scores between participants with different levels of surgical experience; inter- and intra-rater reliability were also assessed. Results: We evaluated 83 residents, 9 fellows, and 13 faculty, totaling 105 participants; 88 (84%) were from obstetrics/gynecology. Our assessment form demonstrated construct validity, with faculty and fellows performing significantly better than residents (mean scores: 89 ± 8 faculty; 74 ± 17 fellows; 59 ± 22 residents, p<0.01). In addition, participants with more robotic console experience scored significantly higher than those with fewer prior console surgeries (p<0.01). R-OSATS demonstrated good inter-rater reliability across all five drills (mean Cronbach's α: 0.79 ± 0.02). Intra-rater reliability was also high (mean Spearman's correlation: 0.91 ± 0.11). Conclusions: We developed an assessment form for robotic surgical skill that demonstrates construct validity and inter- and intra-rater reliability. When paired with standardized robotic skill drills this form may be useful to distinguish between levels of trainee performance. PMID:24807319

  3. Assessing the Validity of Self-Reported Stress-Related Growth

    ERIC Educational Resources Information Center

    Frazier, Patricia A.; Kaler, Matthew E.

    2006-01-01

    The purpose of these studies was to assess the validity of self-reported stress-related growth (SRG). In Study 1, individuals with breast cancer (n = 70) generally did not report greater well-being than a matched comparison group (n = 70). In Study 2, there were no significant differences in well-being between undergraduate students who said that…

  4. Validating a biometric authentication system: sample size requirements.

    PubMed

    Dass, Sarat C; Zhu, Yongfang; Jain, Anil K

    2006-12-01

    Authentication systems based on biometric features (e.g., fingerprint impressions, iris scans, human face images) are increasingly gaining widespread use and popularity. Often, vendors and owners of these commercial biometric systems claim impressive performance that is estimated based on some proprietary data. In such situations, there is a need to independently validate the claimed performance levels. System performance is typically evaluated by collecting biometric templates from n different subjects, and for convenience, acquiring multiple instances of the biometric for each of the n subjects. Very little work has been done in 1) constructing confidence regions based on the ROC curve for validating the claimed performance levels and 2) determining the required number of biometric samples needed to establish confidence regions of prespecified width for the ROC curve. To simplify the analysis that addresses these two problems, several previous studies have assumed that multiple acquisitions of the biometric entity are statistically independent. This assumption is too restrictive and is generally not valid. We have developed a validation technique based on multivariate copula models for correlated biometric acquisitions. Based on the same model, we also determine the minimum number of samples required to achieve confidence bands of desired width for the ROC curve. We illustrate the estimation of the confidence bands as well as the required number of biometric samples using a fingerprint matching system that is applied on samples collected from a small population.

  5. Development, initial reliability and validity testing of an observational tool for assessing technical skills of operating room nurses.

    PubMed

    Sevdalis, Nick; Undre, Shabnam; Henry, Janet; Sydney, Elaine; Koutantji, Mary; Darzi, Ara; Vincent, Charles A

    2009-09-01

    The recent emergence of the Systems Approach to the safety and quality of surgical care has triggered individual and team skills training modules for surgeons and anaesthetists and relevant observational assessment tools have been developed. To develop an observational tool that captures operating room (OR) nurses' technical skill and can be used for assessment and training. The Imperial College Assessment of Technical Skills for Nurses (ICATS-N) assesses (i) gowning and gloving, (ii) setting up instrumentation, (iii) draping, and (iv) maintaining sterility. Three to five observable behaviours have been identified for each skill and are rated on 1-6 scales. Feasibility and aspects of reliability and validity were assessed in 20 simulation-based crisis management training modules for trainee nurses and doctors, carried out in a Simulated Operating Room. The tool was feasible to use in the context of simulation-based training. Satisfactory reliability (Cronbach alpha) was obtained across trainers' and trainees' scores (analysed jointly and separately). Moreover, trainer nurse's ratings of the four skills correlated positively, thus indicating adequate content validity. Trainer's and trainees' ratings did not correlate. Assessment of OR nurses' technical skill is becoming a training priority. The present evidence suggests that the ICATS-N could be considered for use as an assessment/training tool for junior OR nurses.

  6. Validation and evaluation of the advanced aeronautical CFD system SAUNA: A method developer's view

    NASA Astrophysics Data System (ADS)

    Shaw, J. A.; Peace, A. J.; Georgala, J. M.; Childs, P. N.

    1993-09-01

    This paper is concerned with a detailed validation and evaluation of the SAUNA CFD system for complex aircraft configurations. The methodology of the complete system is described in brief, including its unique use of differing grid generation strategies (structured, unstructured or both) depending on the geometric complexity of the configuration. A wide range of configurations and flow conditions are chosen in the validation and evaluation exercise to demonstrate the scope of SAUNA. A detailed description of the results from the method is preceded by a discussion on the philosophy behind the strategy followed in the exercise, in terms of quality assessment and the differing roles of the code developer and the code user. It is considered that SAUNA has grown into a highly usable tool for the aircraft designer, in combining flexibility and accuracy in an efficient manner.

  7. Probabilistic Approaches for Multi-Hazard Risk Assessment of Structures and Systems

    NASA Astrophysics Data System (ADS)

    Kwag, Shinyoung

    Performance assessment of structures, systems, and components for multi-hazard scenarios has received significant attention in recent years. However, the concept of multi-hazard analysis is quite broad in nature and the focus of existing literature varies across a wide range of problems. In some cases, such studies focus on hazards that either occur simultaneously or are closely correlated with each other, for example seismically induced flooding or seismically induced fires. In other cases, multi-hazard studies relate to hazards that are not dependent or correlated but have strong likelihood of occurrence at different times during the lifetime of a structure. The current approaches for risk assessment need enhancement to account for multi-hazard risks. They must be able to account for uncertainty propagation in a systems-level analysis, consider correlation among events or failure modes, and allow integration of newly available information from continually evolving simulation models, experimental observations, and field measurements. This dissertation presents a detailed study that proposes enhancements by incorporating Bayesian networks and Bayesian updating within a performance-based probabilistic framework. The performance-based framework allows propagation of risk as well as uncertainties in the risk estimates within a systems analysis. Unlike conventional risk assessment techniques such as a fault-tree analysis, a Bayesian network can account for statistical dependencies and correlations among events/hazards. The proposed approach is extended to develop a risk-informed framework for quantitative validation and verification of high fidelity system-level simulation tools. Validation of such simulations can be quite formidable within the context of a multi-hazard risk assessment in nuclear power plants. The efficiency of this approach lies in identification of critical events, components, and systems that contribute to the overall risk. Validation of any event or…

  8. Content validation of an interprofessional learning video peer assessment tool.

    PubMed

    Nisbet, Gillian; Jorm, Christine; Roberts, Chris; Gordon, Christopher J; Chen, Timothy F

    2017-12-16

    Large scale models of interprofessional learning (IPL) where outcomes are assessed are rare within health professional curricula. To date, there is sparse research describing robust assessment strategies to support such activities. We describe the development of an IPL assessment task based on peer rating of a student generated video evidencing collaborative interprofessional practice. We provide content validation evidence of an assessment rubric in the context of large scale IPL. Two established approaches to scale development in an educational setting were combined. A literature review was undertaken to develop a conceptual model of the relevant domains and issues pertaining to assessment of student generated videos within IPL. Starting with a prototype rubric developed from the literature, a series of staff and student workshops were undertaken to integrate expert opinion and user perspectives. Participants assessed five-minute videos produced in a prior pilot IPL activity. Outcomes from each workshop informed the next version of the rubric until agreement was reached on anchoring statements and criteria. At this point the rubric was declared fit to be used in the upcoming mandatory large scale IPL activity. The assessment rubric consisted of four domains: patient issues, interprofessional negotiation; interprofessional management plan in action; and effective use of video medium to engage audience. The first three domains reflected topic content relevant to the underlying construct of interprofessional collaborative practice. The fourth domain was consistent with the broader video assessment literature calling for greater emphasis on creativity in education. We have provided evidence for the content validity of a video-based peer assessment task portraying interprofessional collaborative practice in the context of large-scale IPL activities for healthcare professional students. Further research is needed to establish the reliability of such a scale.

  9. Automation and validation of DNA-banking systems.

    PubMed

    Thornton, Melissa; Gladwin, Amanda; Payne, Robin; Moore, Rachael; Cresswell, Carl; McKechnie, Douglas; Kelly, Steve; March, Ruth

    2005-10-15

    DNA banking is one of the central capabilities on which modern genetic research rests. The DNA-banking system plays an essential role in the flow of genetic data from patients and genetics researchers to the application of genetic research in the clinic. Until relatively recently, large collections of DNA samples were not common in human genetics. Now, collections of hundreds of thousands of samples are common in academic institutions and private companies. Automation of DNA banking can dramatically increase throughput, eliminate manual errors and improve the productivity of genetics research. An increased emphasis on pharmacogenetics and personalized medicine has highlighted the need for genetics laboratories to operate within the principles of a recognized quality system such as good laboratory practice (GLP). Automated systems are suitable for such laboratories but require a level of validation that might be unfamiliar to many genetics researchers. In this article, we use the AstraZeneca automated DNA archive and reformatting system (DART) as a case study of how such a system can be successfully developed and validated within the principles of GLP.

  10. Threats to the Valid Use of Assessment of Prior Learning in Higher Education: Claimants' Experiences of the Assessment Process

    ERIC Educational Resources Information Center

    Stenlund, Tova

    2012-01-01

    Assessment of Prior Learning (APL) refers to a process where adults' prior learning, formal as well as informal, is assessed and acknowledged. In the first section of this paper, APL and current conceptions of validity in assessments and its evaluation are presented. It is argued that participants in the assessment are an important source of…

  11. Development of the implant surgical technique and assessment rating system

    PubMed Central

    Park, Jung-Chul; Hwang, Ji-Wan; Lee, Jung-Seok; Jung, Ui-Won; Choi, Seong-Ho; Cho, Kyoo-Sung; Chai, Jung-Kiu

    2012-01-01

    Purpose: There has been no attempt to establish an objective implant surgical evaluation protocol to assess residents' surgical competence and improve their surgical outcomes. The present study presents a newly developed assessment and rating system and simulation model that can assist teaching staff in evaluating the surgical events and surgical skills of residents objectively. Methods: Articles published in peer-reviewed English journals were selected using several scientific databases and subsequently reviewed regarding surgical competence and assessment tools. In particular, medical journals reporting rating and evaluation protocols for various types of medical surgeries were thoroughly analyzed. Based on these studies, an implant surgical technique assessment and rating system (iSTAR) has been developed. Also, a specialized dental typodont was developed for the valid and reliable assessment of surgery. Results: The iSTAR consists of two parts: surgical information and task-specific checklists. A specialized simulation model was subsequently produced and can be used in combination with iSTAR. Conclusions: The assessment and rating system provided may serve as a reference guide for teaching staff to evaluate the residents' implant surgical techniques. PMID:22413071

  12. Content validity of the Geriatric Health Assessment Instrument.

    PubMed

    Pedreira, Rhaine Borges Santos; Rocha, Saulo Vasconcelos; Santos, Clarice Alves Dos; Vasconcelos, Lélia Renata Carneiro; Reis, Martha Cerqueira

    2016-01-01

    To assess the content validity of the Elderly Health Assessment Tool for older adults with low education. The data collection instrument/questionnaire was prepared and submitted to an expert panel comprising four healthcare professionals experienced in research on epidemiology of aging. The experts were allowed to suggest item inclusion/exclusion and were asked to rate the ability of individual items in questionnaire blocks to encompass target dimensions as "not valid", "somewhat valid" or "valid", using an interval scale. Percent agreement and the Content Validity Index were used as measurements of inter-rater agreement; the minimum acceptable inter-rater agreement was set at 80%. The mean instrument percent agreement rate was 86%, ranging from 63 to 99%, and from 50 to 100% between and within blocks respectively. The Mean Content Validity Index score was 93.47%, ranging from 50 to 100% between individual items. The instrument showed acceptable psychometric properties for application in geriatric populations with low levels of education. It enabled identifying diseases and assisted in choice of strategies related to health of the elderly.
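
    The Content Validity Index mentioned above can be sketched as the share of experts who endorse an item, averaged over items for a scale-level figure. The abstract does not say which ratings counted as endorsement; this sketch assumes only "valid" does. Function names and ratings are hypothetical and are not the study's data.

```python
def item_cvi(ratings, endorsing=("valid",)):
    """Proportion of experts whose rating counts as endorsement.

    Assumption: only "valid" endorses the item; the study's own
    rule is not given in the abstract.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    return sum(r in endorsing for r in ratings) / len(ratings)


def scale_cvi_average(items):
    """Mean of the item-level indices across all items."""
    return sum(item_cvi(r) for r in items) / len(items)


# Hypothetical ratings from a four-expert panel for two items.
panel = [
    ["valid", "valid", "valid", "valid"],
    ["valid", "valid", "valid", "somewhat valid"],
]
for ratings in panel:
    # Flag items falling below an 80% threshold, as in the abstract.
    print(item_cvi(ratings), item_cvi(ratings) >= 0.80)
print(scale_cvi_average(panel))
```

    With four raters, a single dissenting expert already drops an item to 75%, below the 80% threshold used in the study.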

  13. Validation of the Virtual MET as an assessment tool for executive functions.

    PubMed

    Rand, Debbie; Basha-Abu Rukan, Soraya; Weiss, Patrice L Tamar; Katz, Noomi

    2009-08-01

    The purpose of this study was to establish ecological validity and initial construct validity of a Virtual Multiple Errands Test (VMET) as an assessment tool for executive functions. It was implemented within the Virtual Mall (VMall), a novel functional video-capture virtual shopping environment. The main objectives were (1) to examine the relationships between the performance of three groups of participants in the Multiple Errands Test (MET) carried out in a real shopping mall and their performance in the VMET, (2) to assess the relationships between the MET and VMET and the post-stroke participants' level of executive functioning and independence in instrumental activities of daily living, and (3) to compare the performance of post-stroke participants to those of healthy young and older controls in both the MET and VMET. The study population included three groups: post-stroke participants (n = 9), healthy young participants (n = 20), and healthy older participants (n = 20). The VMET was able to differentiate between two age groups of healthy participants and between healthy and post-stroke participants, thus demonstrating that it is sensitive to brain injury and ageing and supporting construct validity between known groups. In addition, significant correlations were found between the MET and the VMET for both the post-stroke participants and older healthy participants. This provides initial support for the ecological validity of the VMET as an assessment tool of executive functions. However, further psychometric data on temporal stability are needed, namely test-retest reliability and responsiveness, before it is ready for clinical application. Further research using the VMET as an assessment tool within the VMall with larger groups and in additional populations is also recommended.

  14. New Mobile Atmospheric Lidar Systems for Spaceborne Instrument Validation

    NASA Astrophysics Data System (ADS)

    Chazette, P.; Raut, J.-C.; Sanak, J.; Berthier, S.; Dulac, F.; Kim, S. W.; Royer, P.

    2009-04-01

    We present an overview of our different approaches using lidar systems as a tool to validate and develop the new generation of spaceborne missions. We have developed several mini-lidars in order to study the vertical structure, the clouds and the particulate composition of the atmosphere from mobile platforms. Here we focus on three mobile instrumental platforms including a backscatter lidar instrument developed for validation of the Cloud-Aerosol LIdar with Orthogonal Polarization (CALIOP) onboard CALIPSO and of the Interféromètre Atmosphérique de Sondage Infrarouge (IASI) onboard METOP. The first system is operated onboard an ultra-light aircraft (ULA) (Chazette et al., Environ. Sci. Technol., 2007). The second one is operated onboard a stratospheric balloon to study the interest of the measurement synergy with the Infrared Atmospheric Sounding Interferometer (IASI). The third one is part of a truck/car mobile station to be positioned close to the satellite ground-track (e.g. CALIPSO) or inside the area delimitated by the instrumental swath (e.g. IASI). CALIPSO was inserted in the A-Train constellation behind Aqua on 28 April, 2006 (http://www-calipso.larc.nasa.gov/about/atrain.php). One of the main objectives of the scientific mission is the study of atmospheric aerosols. Before the CALIOP lidar profiles could be used in an operational way, it has been necessary to validate both the raw and geophysical data of the instrument. For this purpose, we carried out an experiment in south-eastern France in summer 2007 to validate the aerosol product of CALIOP by operating both the ground-based and the airborne mobile lidars in coincidence with CALIOP. The synergy between the new generation of spaceborne passive and active instruments is promising to assess the concentration of main pollutants as aerosol, O3 and CO, and greenhouse gases as CO2 and CH4 within the planetary boundary layer (PBL) and to increase the accuracy on the vertical profile of temperature. IASI is

  15. Clinical assessment of dysphagia in neurodegeneration (CADN): development, validity and reliability of a bedside tool for dysphagia assessment.

    PubMed

    Vogel, Adam P; Rommel, Natalie; Sauer, Carina; Horger, Marius; Krumm, Patrick; Himmelbach, Marc; Synofzik, Matthis

    2017-06-01

    Screening assessments for dysphagia are essential in neurodegenerative disease. Yet there are no purpose-built tools to quantify swallowing deficits at bedside or in clinical trials. A quantifiable, brief, easy to administer assessment that measures the impact of dysphagia and predicts the presence or absence of aspiration is needed. The Clinical Assessment of Dysphagia in Neurodegeneration (CADN) was designed by a multidisciplinary team (neurology, neuropsychology, speech pathology) and validated against strict methodological criteria in two neurodegenerative diseases, Parkinson's disease (PD) and degenerative ataxia (DA). CADN comprises two parts, an anamnesis (part one) and consumption (part two). Two-thirds of patients were assessed using reference tests, the SWAL-QOL symptoms subscale (part one) and videofluoroscopic assessment of swallowing (part two). CADN has 11 items and can be administered and scored in an average of 7 min. Test-retest reliability was established using correlation and Bland-Altman plots. 125 patients with a neurodegenerative disease were recruited; 60 PD and 65 DA. Validity was established using ROC graphs and correlations. CADN has sensitivity of 79% and 84% and specificity of 71% and 69% for parts one and two, respectively. Significant correlations with disease severity were also observed (p < 0.001) for PD with small to large associations between disease severity and CADN scores for DA. Cutoff scores were identified that signal the presence of clinically meaningful dysphagia symptomatology and risk of aspiration. The CADN is a reliable, valid, brief, quantifiable, and easily deployed assessment of swallowing in neurodegenerative disease. It is thus ideally suited for both clinical bedside assessment and future multicentre clinical trials in neurodegenerative disease.

  16. Cross-cultural validation of the Children's Assessment of Participation and Enjoyment (CAPE) in Spain.

    PubMed

    Longo, E; Badia, M; Orgaz, B; Verdugo, M A

    2014-03-01

    Despite growing interest in the topic of participation, the construct has not yet been assessed in children and adolescents with and without cerebral palsy (CP) in Spain. As there are no available instruments to measure participation in leisure activities which have been adapted in this country, the goal of this study was to validate a Spanish version of the Children's Assessment of Participation and Enjoyment (CAPE). The sample comprised 199 children and adolescents with CP and 199 without CP, between 8 and 18 years of age, from seven regions in Spain. The adaptation of the original version of CAPE was carried out through translation and backward translation, and the validity of the instrument was analysed. Construct validity was assessed through the correlation of the diverse CAPE domains and the quality of life domains (KIDSCREEN questionnaire). Discriminant validity was established by comparing children and adolescents with CP and typically developing children and adolescents. For test-retest reliability, the children and adolescents with and without CP completed the CAPE questionnaire twice within 4 weeks. The correlations found between the CAPE domains and the quality of life domains show that the CAPE presents construct validity. The CAPE discriminated children and adolescents with CP from those without any disability in the results of participation. According to most CAPE domains, typically developing children and adolescents engage in a greater number of activities than children and adolescents with CP. Test-retest reliability for the Spanish version of CAPE was adequate. The study provides a valid instrument to assess the participation of children and adolescents with and without CP who live in Spain. © 2012 John Wiley & Sons Ltd.

  17. What Counts as Validity Evidence? Examples and Prevalence in a Systematic Review of Simulation-Based Assessment

    ERIC Educational Resources Information Center

    Cook, David A.; Zendejas, Benjamin; Hamstra, Stanley J.; Hatala, Rose; Brydges, Ryan

    2014-01-01

    Ongoing transformations in health professions education underscore the need for valid and reliable assessment. The current standard for assessment validation requires evidence from five sources: content, response process, internal structure, relations with other variables, and consequences. However, researchers remain uncertain regarding the types…

  18. Validity of instruments to assess students' travel and pedestrian safety

    USDA-ARS's Scientific Manuscript database

    Safe Routes to School (SRTS) programs are designed to make walking and bicycling to school safe and accessible for children. Despite their growing popularity, few validated measures exist for assessing important outcomes such as type of student transport or pedestrian safety behaviors. This research...

  19. Incremental Validity in the Clinical Assessment of Early Childhood Development

    ERIC Educational Resources Information Center

    Liu, Xin; Zhou, Xiaobin; Lackaff, Julie

    2013-01-01

    The authors demonstrate the increment of clinical validity in early childhood assessment of physical impairment (PI), developmental delay (DD), and autism (AUT) using multiple standardized developmental screening measures such as performance measures and parent and teacher rating scales. Hierarchical regression and sensitivity/specificity analyses…

  20. Validation of the Neuro-QoL measurement system in children with epilepsy.

    PubMed

    Lai, Jin-Shei; Nowinski, Cindy J; Zelko, Frank; Wortman, Katy; Burns, James; Nordli, Douglas R; Cella, David

    2015-05-01

    Children with epilepsy often face complex psychosocial consequences that are not fully captured by existing patient-reported outcome (PRO) measures. The Neurology Quality of Life Measurement System "Neuro-QoL" was developed to provide a set of common PRO measures that address issues important to people with neurologic disorders. This paper reports Neuro-QoL (anxiety, depression, interaction with peers, fatigue, pain, cognitive function, stigma, and upper and lower extremity functions) validation in children with epilepsy. Patients (aged 10-18 years) diagnosed with epilepsy completed Neuro-QoL and legacy measures at time 1 (initial study visit) and 6-month follow-up. Internal consistency reliability was also evaluated. Concurrent validity was assessed by comparing Neuro-QoL measures with more established "legacy" measures of the same concepts. Clinical validity was evaluated by comparing mean Neuro-QoL scores of patients grouped by clinical anchors such as disease severity. Responsiveness of the Neuro-QoL from time 1 (initial study visit) to 6 months was evaluated using self-reported change as the primary anchor. Sixty-one patients (mean age = 13.4 years; 62.3% male, 75.9% white) participated. Most patients (64.2%) had been seizure-free in the 3 months prior to participation, and seizure frequency was otherwise described as follows: 17.8% daily, 13.3% weekly, 35.6% monthly, and 33.3% yearly. All patients were taking antiepileptic drugs. Patients reported better function/fewer symptoms compared to the reference groups. Internal consistency (alpha) coefficients ranged from 0.76 to 0.87. Patients with different seizure frequencies differed on anxiety (p<.01) and cognitive function (p<.05). Compared to patients on polytherapy, those on monotherapy had better upper extremity scores (p<.05). Compared to those with localized seizures, those experiencing generalized seizures reported worse stigma (p<.05). Depression, anxiety, lower extremity, fatigue, pain, interaction with peers

  1. Reliability and validity of a nutrition and physical activity environmental self-assessment for child care

    PubMed Central

    Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S

    2007-01-01

    Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC
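    Illustrative sketch (not part of the record): the two agreement statistics named in this abstract, percent agreement and a weighted kappa, can be computed for a pair of raters roughly as follows. The abstract does not state which weighting scheme was used, so the linear weights below are an assumption, and the function names and example ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which two raters chose the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weighted_kappa(a, b, categories):
    """Cohen's kappa with linear disagreement weights (assumed scheme).

    `categories` lists the ordered response options; it needs at least two
    entries, and the raters must not both use a single category throughout,
    otherwise the expected disagreement is zero and kappa is undefined.
    """
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint counts and each rater's marginal counts.
    joint = Counter((idx[x], idx[y]) for x, y in zip(a, b))
    row = Counter(idx[x] for x in a)
    col = Counter(idx[y] for y in b)

    def weight(i, j):
        # 0 on the diagonal, 1 for the two most distant categories.
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * c / n for (i, j), c in joint.items())
    d_exp = sum(
        weight(i, j) * row[i] * col[j] / n**2
        for i in range(k)
        for j in range(k)
    )
    return 1.0 - d_obs / d_exp


# Hypothetical answers from a director and a staff member to four questions.
director = [1, 2, 3, 1]
staff = [1, 2, 3, 2]
print(percent_agreement(director, staff))
print(weighted_kappa(director, staff, categories=[1, 2, 3]))
```

    Kappa corrects the raw agreement for the agreement expected by chance, which is why a question can show high percent agreement and still a low kappa when nearly all respondents pick the same option.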

  2. Reliability and validity of a questionnaire for self-assessment of complete dentures.

    PubMed

    Komagamine, Yuriko; Kanazawa, Manabu; Kaiba, Yoshinori; Sato, Yusuke; Minakuchi, Shunsuke

    2014-05-02

    Demand for complete denture treatment is expected to rise over several decades. However, to date, no questionnaire on complete dentures, as evaluated by edentulous patients, has been shown to be reliable and valid. This study sought to assess the reliability and validity of Patient's Denture Assessment (PDA), which provides a multidimensional evaluation of dentures among edentulous patients. Patients who had new complete dentures fabricated at the University Hospital of Dentistry, Tokyo Medical and Dental University from 2009 to 2010 were enrolled. The reliability of the PDA was determined by examining internal consistency and test-retest reliability. Internal consistency for all of the question items and the six subscales was measured using Cronbach's α and average inter-item correlation coefficients among 93 participants. For 33 of these participants, test-retest reliability was determined at a 2-month interval using the intraclass correlation coefficients (ICCs) and 95% confidence interval for the summary scores and the six subscale scores. The PDA was validated in 93 participants by examining the difference in the summary score and the six subscale scores of the PDA before and after replacement with new dentures by the paired t-test. Ability to detect change was also tested in 93 patients using effect size. The Cronbach's α for the PDA ranged from 0.56 to 0.93. The average inter-item correlation coefficients ranged from 0.28 to 0.83. ICCs for the PDA ranged from 0.37 to 0.83. The paired t-test showed a significant difference between the summary score and the six subscale scores before and after replacement with new dentures (p < 0.05) and the effect size was 0.97. The PDA demonstrated good reliability by assessing internal consistency and test-retest reliability. In addition, the PDA demonstrated good validity by assessing discriminant validity. Thus, the PDA could help dentists obtain a detailed understanding of the patients' perceptions in using
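    Illustrative sketch (not part of the record): Cronbach's α, the internal-consistency statistic reported in this abstract, follows the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name and the response data below are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Needs at least two items and two respondents, and the total scores must
    not all be equal (otherwise the total variance is zero).
    """
    k = len(rows[0])
    # Sample variance of each item (column) across respondents.
    item_vars = [variance(col) for col in zip(*rows)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of three respondents to a two-item subscale.
responses = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(responses))
```

    Alpha rises when items vary together, because the variance of the totals then exceeds the sum of the item variances by a wide margin.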

  3. 78 FR 32184 - HACCP Systems Validation

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-29

    ... DEPARTMENT OF AGRICULTURE Food Safety and Inspection Service 9 CFR part 417 [Docket No. FSIS-2009-0019] HACCP Systems Validation AGENCY: Food Safety and Inspection Service, USDA. ACTION: Notice of public meeting and request for comments. SUMMARY: The Food Safety and Inspection Service (FSIS) is...

  4. Validation of screening tools to assess appetite among geriatric patients.

    PubMed

    Hanisah, R; Suzana, S; Lee, F S

    2012-07-01

    Poor appetite is one of the main contributing factors of poor nutritional status among elderly individuals. Recognizing the importance of assessment of appetite, a cross sectional study was conducted to determine the validity of appetite screening tools namely, the Council on Nutrition Appetite questionnaire (CNAQ) and the simplified nutritional appetite questionnaire (SNAQ) against the appetite, hunger and sensory perception questionnaire (AHSPQ), measures of nutritional status and food intake among geriatric patients at the main general hospital in Malaysia. Nutritional status was assessed using the subjective global assessment (SGA) while food intake was measured using the dietary history questionnaire (DHQ). Anthropometric parameters included weight, height, body mass index (BMI), calf circumference (CC) and mid upper arm circumference (MUAC). A total of 145 subjects aged 60 to 86 years (68.3 ± 5.8 years) with 31.7% men and 68.3% women were recruited from outpatients (35 subjects) and inpatients (110 subjects) of Kuala Lumpur Hospital of Malaysia. As assessed by SGA, most subjects were classified as mild to moderately malnourished (50.4%), followed by normal (38.6%) and severely malnourished (11.0%). A total of 79.3% and 57.2% subjects were classified as having poor appetite according to CNAQ and SNAQ, respectively. CNAQ (80.9%) had a higher sensitivity than SNAQ (69.7%) when validated against nutritional status as assessed using SGA. However, the specificity of SNAQ (62.5%) was higher than CNAQ (23.2%). Positive predictive value for CNAQ and SNAQ were 62.6% and 74.7%, respectively. Cronbach's alpha for CNAQ and SNAQ were 0.546 and 0.578, respectively. History of weight loss over the past one year (Adjusted odds ratio 2.49) (p < 0.01) and thiamine intake less than the recommended nutrient intake (RNI) (Adjusted odds ratio 3.04) (p < 0.05) were risk factors for poor appetite among subjects. In conclusion, malnutrition and poor appetite were prevalent among the
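    Illustrative sketch (not part of the record): sensitivity, specificity and positive predictive value, as reported for CNAQ and SNAQ above, all derive from the 2x2 table of screening result against reference standard. The counts below are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Basic accuracy measures of a screening tool against a reference.

    tp/fn: reference-positive cases flagged / missed by the tool
    fp/tn: reference-negative cases flagged / correctly cleared by the tool
    """
    return {
        "sensitivity": tp / (tp + fn),  # flagged share of true cases
        "specificity": tn / (tn + fp),  # cleared share of non-cases
        "ppv": tp / (tp + fp),          # true cases among those flagged
    }


# Hypothetical counts: 40 reference-positive and 60 reference-negative subjects.
print(screening_metrics(tp=32, fp=15, fn=8, tn=45))
```

    A tool that flags most subjects, as CNAQ did here, tends to gain sensitivity at the cost of specificity, which matches the pattern in the reported figures.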

  5. Validation of an automated system for aliquoting of HIV-1 Env-pseudotyped virus stocks.

    PubMed

    Schultz, Anke; Germann, Anja; Fuss, Martina; Sarzotti-Kelsoe, Marcella; Ozaki, Daniel A; Montefiori, David C; Zimmermann, Heiko; von Briesen, Hagen

    2018-01-01

    The standardized assessments of HIV-specific immune responses are of main interest in the preclinical and clinical stage of HIV-1 vaccine development. In this regard, HIV-1 Env-pseudotyped viruses play a central role for the evaluation of neutralizing antibody profiles and are produced according to Good Clinical Laboratory Practice- (GCLP-) compliant manual and automated procedures. To further improve and complete the automated production cycle an automated system for aliquoting HIV-1 pseudovirus stocks has been implemented. The automation platform consists of a modified Tecan-based system including a robot platform for handling racks containing 48 cryovials, a Decapper, a tubing pump and a safety device consisting of ultrasound sensors for online liquid level detection of each individual cryovial. With the aim to aliquot the HIV-1 pseudoviruses in an automated manner under GCLP-compliant conditions a validation plan was developed where the acceptance criteria (accuracy, precision as well as the specificity and robustness) were defined and summarized. By passing the validation experiments described in this article the automated system for aliquoting has been successfully validated. This allows the standardized and operator independent distribution of small-scale and bulk amounts of HIV-1 pseudovirus stocks with a precise and reproducible outcome to support upcoming clinical vaccine trials.

  6. Validation of an automated system for aliquoting of HIV-1 Env-pseudotyped virus stocks

    PubMed Central

    Schultz, Anke; Germann, Anja; Fuss, Martina; Sarzotti-Kelsoe, Marcella; Ozaki, Daniel A.; Montefiori, David C.; Zimmermann, Heiko

    2018-01-01

    The standardized assessments of HIV-specific immune responses are of main interest in the preclinical and clinical stage of HIV-1 vaccine development. In this regard, HIV-1 Env-pseudotyped viruses play a central role for the evaluation of neutralizing antibody profiles and are produced according to Good Clinical Laboratory Practice- (GCLP-) compliant manual and automated procedures. To further improve and complete the automated production cycle an automated system for aliquoting HIV-1 pseudovirus stocks has been implemented. The automation platform consists of a modified Tecan-based system including a robot platform for handling racks containing 48 cryovials, a Decapper, a tubing pump and a safety device consisting of ultrasound sensors for online liquid level detection of each individual cryovial. With the aim to aliquot the HIV-1 pseudoviruses in an automated manner under GCLP-compliant conditions a validation plan was developed where the acceptance criteria—accuracy, precision as well as the specificity and robustness—were defined and summarized. By passing the validation experiments described in this article the automated system for aliquoting has been successfully validated. This allows the standardized and operator independent distribution of small-scale and bulk amounts of HIV-1 pseudovirus stocks with a precise and reproducible outcome to support upcoming clinical vaccine trials. PMID:29300769

  7. A Comparative Study of Adolescent Risk Assessment Instruments: Predictive and Incremental Validity

    ERIC Educational Resources Information Center

    Welsh, Jennifer L.; Schmidt, Fred; McKinnon, Lauren; Chattha, H. K.; Meyers, Joanna R.

    2008-01-01

    Promising new adolescent risk assessment tools are being incorporated into clinical practice but currently possess limited evidence of predictive validity regarding their individual and/or combined use in risk assessments. The current study compares three structured adolescent risk instruments, Youth Level of Service/Case Management Inventory…

  8. Initial Steps in Creating a Developmentally Valid Tool for Observing/Assessing Rope Jumping

    ERIC Educational Resources Information Center

    Roberton, Mary Ann; Thompson, Gregory; Langendorfer, Stephen J.

    2017-01-01

    Background: Valid motor development sequences show the various behaviors that children display as they progress toward competence in specific motor skills. Teachers can use these sequences to observe informally or formally assess their students. While longitudinal study is ultimately required to validate developmental sequences, there are earlier,…

  9. Quantitative safety assessment of air traffic control systems through system control capacity

    NASA Astrophysics Data System (ADS)

    Guo, Jingjing

    Quantitative Safety Assessments (QSA) are essential to safety benefit verification and regulations of developmental changes in safety critical systems like the Air Traffic Control (ATC) systems. Effectiveness of the assessments is particularly desirable today in the safe implementations of revolutionary ATC overhauls like NextGen and SESAR. QSA of ATC systems are however challenged by system complexity and lack of accident data. Extending from the idea "safety is a control problem" in the literature, this research proposes to assess system safety from the control perspective, through quantifying a system's "control capacity". A system's safety performance correlates to this "control capacity" in the control of "safety critical processes". To examine this idea in QSA of the ATC systems, a Control-capacity Based Safety Assessment Framework (CBSAF) is developed which includes two control capacity metrics and a procedural method. The two metrics are Probabilistic System Control-capacity (PSC) and Temporal System Control-capacity (TSC); each addresses an aspect of a system's control capacity. The procedural method consists of three general stages: I) identification of safety critical processes, II) development of system control models and III) evaluation of system control capacity. The CBSAF was tested in two case studies. The first one assesses an en-route collision avoidance scenario and compares three hypothetical configurations. The CBSAF was able to capture the uncoordinated behavior between two means of control, as was observed in a historic midair collision accident. The second case study compares CBSAF with an existing risk based QSA method in assessing the safety benefits of introducing a runway incursion alert system. Similar conclusions are reached between the two methods, while the CBSAF has the advantage of simplicity and provides a new control-based perspective and interpretation to the assessments. The case studies are intended to investigate the

  10. Validity evidence for the Fundamentals of Laparoscopic Surgery (FLS) program as an assessment tool: a systematic review.

    PubMed

    Zendejas, Benjamin; Ruparel, Raaj K; Cook, David A

    2016-02-01

    The Fundamentals of Laparoscopic Surgery (FLS) program uses five simulation stations (peg transfer, precision cutting, loop ligation, and suturing with extracorporeal and intracorporeal knot tying) to teach and assess laparoscopic surgery skills. We sought to summarize evidence regarding the validity of scores from the FLS assessment. We systematically searched for studies evaluating the FLS as an assessment tool (last search update February 26, 2013). We classified validity evidence using the currently standard validity framework (content, response process, internal structure, relations with other variables, and consequences). From a pool of 11,628 studies, we identified 23 studies reporting validity evidence for FLS scores. Studies involved residents (n = 19), practicing physicians (n = 17), and medical students (n = 8), in specialties of general (n = 17), gynecologic (n = 4), urologic (n = 1), and veterinary (n = 1) surgery. Evidence was most common in the form of relations with other variables (n = 22, most often expert-novice differences). Only three studies reported internal structure evidence (inter-rater or inter-station reliability), two studies reported content evidence (i.e., derivation of assessment elements), and three studies reported consequences evidence (definition of pass/fail thresholds). Evidence nearly always supported the validity of FLS total scores. However, the loop ligation task lacks discriminatory ability. Validity evidence confirms expected relations with other variables and acceptable inter-rater reliability, but other validity evidence is sparse. Given the high-stakes use of this assessment (required for board eligibility), we suggest that more validity evidence is required, especially to support its content (selection of tasks and scoring rubric) and the consequences (favorable and unfavorable impact) of assessment.

  11. Optimizing prevention of hospital-acquired venous thromboembolism (VTE): prospective validation of a VTE risk assessment model.

    PubMed

    Maynard, Gregory A; Morris, Timothy A; Jenkins, Ian H; Stone, Sarah; Lee, Joshua; Renvall, Marian; Fink, Ed; Schoenhaus, Robert

    2010-01-01

    Hospital-acquired (HA) venous thromboembolism (VTE) is a common source of morbidity/mortality. Prophylactic measures are underutilized. Available risk assessment models/protocols are not prospectively validated. Improve VTE prophylaxis, reduce HA VTE, and prospectively validate a VTE risk-assessment model. Observational design. Academic medical center. Adult inpatients on medical/surgical services. A simple VTE risk assessment linked to a menu of preferred VTE prophylaxis methods, embedded in order sets. Education, audit/feedback, and concurrent identification of nonadherence. Randomly sampled inpatient audits determined the percent of patients with "adequate" VTE prevention. HA VTE cases were identified concurrently via digital imaging system. Interobserver agreement for VTE risk level and judgment of adequate prophylaxis were calculated from 150 random audits. Interobserver agreement with 5 observers was high (kappa score for VTE risk level = 0.81, and for judgment of "adequate" prophylaxis = 0.90). The percent of patients on adequate prophylaxis improved each of the 3 years (58%, 78%, and 93%; P < 0.001) and reached 98% in the last 6 months of 2007; 361 cases of HA VTE occurred over 3 years. Significant reductions for the risk of HA VTE (risk ratio [RR] = 0.69; 95% confidence interval [CI] = 0.47-0.79) and preventable HA VTE (RR = 0.14; 95% CI = 0.06-0.31) occurred. We detected no increase in heparin-induced thrombocytopenia (HIT) or prophylaxis-related bleeding using administrative data/chart review. We prospectively validated a VTE risk-assessment/prevention protocol by demonstrating ease of use, good interobserver agreement, and effectiveness. Improved VTE prophylaxis resulted in a substantial reduction in HA VTE. (c) 2010 Society of Hospital Medicine.

  12. Challenges in verification and validation of autonomous systems for space exploration

    NASA Technical Reports Server (NTRS)

    Brat, Guillaume; Jonsson, Ari

    2005-01-01

    Space exploration applications offer a unique opportunity for the development and deployment of autonomous systems, due to limited communications, large distances, and great expense of direct operation. At the same time, the risk and cost of space missions leads to reluctance to taking on new, complex and difficult-to-understand technology. A key issue in addressing these concerns is the validation of autonomous systems. In recent years, higher-level autonomous systems have been applied in space applications. In this presentation, we will highlight those autonomous systems, and discuss issues in validating these systems. We will then look to future demands on validating autonomous systems for space, identify promising technologies and open issues.

  13. Rediscovery rate estimation for assessing the validation of significant findings in high-throughput studies.

    PubMed

    Ganna, Andrea; Lee, Donghwan; Ingelsson, Erik; Pawitan, Yudi

    2015-07-01

    It is common and advised practice in biomedical research to validate experimental or observational findings in a population different from the one where the findings were initially assessed. This practice increases the generalizability of the results and decreases the likelihood of reporting false-positive findings. Validation becomes critical when dealing with high-throughput experiments, where the large number of tests increases the chance to observe false-positive results. In this article, we review common approaches to determine statistical thresholds for validation and describe the factors influencing the proportion of significant findings from a 'training' sample that are replicated in a 'validation' sample. We refer to this proportion as rediscovery rate (RDR). In high-throughput studies, the RDR is a function of false-positive rate and power in both the training and validation samples. We illustrate the application of the RDR using simulated data and real data examples from metabolomics experiments. We further describe an online tool to calculate the RDR using t-statistics. We foresee two main applications. First, if the validation study has not yet been collected, the RDR can be used to decide the optimal combination between the proportion of findings taken to validation and the size of the validation study. Secondly, if a validation study has already been done, the RDR estimated using the training data can be compared with the observed RDR from the validation data; hence, the success of the validation study can be assessed. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
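    Illustrative sketch (not part of the record): the abstract describes the rediscovery rate (RDR) as a function of false-positive rate and power in the training and validation samples. One simple way to see that dependence is a two-group model in which a fraction pi0 of tested features are true nulls and the two samples are independent given each feature's true status. This model, the function name and the parameter values are assumptions made for illustration; it is not the authors' estimator or their online tool.

```python
def rediscovery_rate(pi0, alpha_train, power_train, alpha_valid, power_valid):
    """P(significant in validation | significant in training).

    Assumes a fraction `pi0` of features are true nulls and that the two
    samples are independent conditional on a feature being null or non-null.
    """
    # Probability of being significant in the training sample.
    sig_train = pi0 * alpha_train + (1 - pi0) * power_train
    # Probability of being significant in both samples.
    sig_both = (
        pi0 * alpha_train * alpha_valid
        + (1 - pi0) * power_train * power_valid
    )
    return sig_both / sig_train


# Hypothetical setting: 90% nulls, 5% false-positive rate, 80% power in both.
print(rediscovery_rate(0.9, 0.05, 0.8, 0.05, 0.8))
```

    In this toy model, raising the power of the validation sample or tightening the training threshold both push the RDR up, since a larger share of the findings carried forward are true signals.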

  14. Validity of two alternative systems for measuring vertical jump height.

    PubMed

    Leard, John S; Cirillo, Melissa A; Katsnelson, Eugene; Kimiatek, Deena A; Miller, Tim W; Trebincevic, Kenan; Garbalosa, Juan C

    2007-11-01

    Vertical jump height is frequently used by coaches, health care professionals, and strength and conditioning professionals to objectively measure function. The purpose of this study is to determine the concurrent validity of the jump and reach method (Vertec) and the contact mat method (Just Jump) in assessing vertical jump height when compared with the criterion reference 3-camera motion analysis system. Thirty-nine college students, 25 females and 14 males between the ages of 18 and 25 (mean age 20.65 years), were instructed to perform the countermovement jump. Reflective markers were placed at the base of the individual's sacrum for the 3-camera motion analysis system to measure vertical jump height. The subject was then instructed to stand on the Just Jump mat beneath the Vertec and perform the jump. Measurements were recorded from each of the 3 systems simultaneously for each jump. The Pearson r statistic between the video and the jump and reach (Vertec) was 0.906. The Pearson r between the video and contact mat (Just Jump) was 0.967. Both correlations were significant at the 0.01 level. Analysis of variance showed a significant difference among the 3 means F(2,235) = 5.51, p < 0.05. The post hoc analysis showed a significant difference between the criterion reference (M = 0.4369 m) and the Vertec (M = 0.3937 m, p = 0.005) but not between the criterion reference and the Just Jump system (M = 0.4420 m, p = 0.972). The Just Jump method of measuring vertical jump height is a valid measure when compared with the 3-camera system. The Vertec was found to have a high correlation with the criterion reference, but the mean differed significantly. This study indicates that a higher degree of confidence is warranted when comparing Just Jump results with a 3-camera system study.

  15. Reliability and validity of a brief method to assess nociceptive flexion reflex (NFR) threshold.

    PubMed

    Rhudy, Jamie L; France, Christopher R

    2011-07-01

    The nociceptive flexion reflex (NFR) is a physiological tool to study spinal nociception. However, NFR assessment can take several minutes and expose participants to repeated suprathreshold stimulations. The 4 studies reported here assessed the reliability and validity of a brief method to assess NFR threshold that uses a single ascending series of stimulations (Peak 1 NFR), by comparing it to a well-validated method that uses 3 ascending/descending staircases of stimulations (Staircase NFR). Correlations between the NFR definitions were high, were on par with test-retest correlations of Staircase NFR, and were not affected by participant sex or chronic pain status. Results also indicated the test-retest reliabilities for the 2 definitions were similar. Using larger stimulus increments (4 mAs) to assess Peak 1 NFR tended to result in higher NFR threshold estimates than using the Staircase NFR definition, whereas smaller stimulus increments (2 mAs) tended to result in lower NFR threshold estimates than the Staircase NFR definition. Neither NFR definition was correlated with anxiety, pain catastrophizing, or anxiety sensitivity. In sum, a single ascending series of electrical stimulations results in a reliable and valid estimate of NFR threshold. However, caution may be warranted when comparing NFR thresholds across studies that differ in the ascending stimulus increments. This brief method to assess NFR threshold is reliable and valid; therefore, it should be useful to clinical pain researchers interested in quickly assessing inter- and intra-individual differences in spinal nociceptive processes. Copyright © 2011 American Pain Society. Published by Elsevier Inc. All rights reserved.

  16. Validity Argument for Assessing L2 Pragmatics in Interaction Using Mixed Methods

    ERIC Educational Resources Information Center

    Youn, Soo Jung

    2015-01-01

    This study investigates the validity of assessing L2 pragmatics in interaction using mixed methods, focusing on the evaluation inference. Open role-plays that are meaningful and relevant to the stakeholders in an English for Academic Purposes context were developed for classroom assessment. For meaningful score interpretations and accurate…

  17. A Framework for Conceptualizing and Evaluating the Validity of Instructionally Relevant Assessments

    ERIC Educational Resources Information Center

    Pellegrino, James W.; DiBello, Louis V.; Goldman, Susan R.

    2016-01-01

    Assessments that function close to classroom teaching and learning can play a powerful role in fostering academic achievement. Unfortunately, however, relatively little attention has been given to discussion of the design and validation of such assessments. The present article presents a framework for conceptualizing and organizing the multiple…

  18. Validation to Spanish of the Caring Assessment Tool (CAT-V)

    PubMed Central

    Ayuso, Rosa María Fernández; Velázquez, Juan Manuel Morillo; Ayuso, David Fernández; de la Torre-Montero, Julio César

    2017-01-01

    Objective: to translate into Spanish and validate the Caring Assessment Scale tool, CAT-V, by Joanne Duffy, within the framework of Jean Watson; a secondary objective is to evaluate its psychometric properties. Several tools are designed to measure the patient’s perception of the care provided, including CAT-V, the subject of our interest, so that it can be used with Spanish-speaking patients. Methods: to meet the objectives, sequential translation and back-translation of the scale were performed through a standardized procedure. The final version of the scale was validated in a sample of 349 patients from four public and two private hospitals in Madrid, Spain. Results: the instrument was translated and validated with high internal consistency (Cronbach’s alpha .953). The subsequent factor analysis revealed a three-factor structure that did not coincide with the data from the US population. Conclusion: the translated CAT-V is considered a suitable instrument for evaluating patient care in Ibero-American health centers whose language is Spanish. PMID:29069268
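
    Internal consistency in this record is reported as Cronbach's alpha. As a rough sketch of how that coefficient is usually computed, assuming a complete respondents-by-items score matrix (the function name and the data below are hypothetical, not taken from the study):

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    items = list(zip(*scores))  # transpose: one tuple of scores per item
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical 4 respondents x 3 items with perfectly consistent answers.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(consistent)
print(round(alpha, 3))
```

    Perfectly consistent items give the maximum value of 1; real questionnaires fall below that, and a value such as the .953 reported here is conventionally read as high.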

  19. Brief report: The Brief Alcohol Social Density Assessment (BASDA): convergent, criterion-related, and incremental validity.

    PubMed

    MacKillop, James; Acker, John D; Bollinger, Jared; Clifton, Allan; Miller, Joshua D; Campbell, W Keith; Goodie, Adam S

    2013-09-01

    Alcohol misuse is substantially influenced by social factors, but systematic assessments of social network drinking are typically lengthy. The goal of the present study was to provide further validation of a brief measure of social network alcohol use, the Brief Alcohol Social Density Assessment (BASDA), in a sample of emerging adults. Specifically, the study sought to examine the BASDA's convergent, criterion, and incremental validity in relation to well-established measures of drinking motives and problematic drinking. Participants were 354 undergraduates who were assessed using the BASDA, the Alcohol Use Disorders Identification Test (AUDIT), and the Drinking Motives Questionnaire. Significant associations were observed between the BASDA index of alcohol-related social density and alcohol misuse, social motives, and conformity motives, supporting convergent validity. Criterion-related validity was supported by evidence that significantly greater alcohol involvement was present in the social networks of individuals scoring at or above an AUDIT score of 8, a validated criterion for hazardous drinking. Finally, the BASDA index was significantly associated with alcohol misuse above and beyond drinking motives in relation to AUDIT scores, supporting incremental validity. Taken together, these findings provide further support for the BASDA as an efficient measure of drinking in an individual's social network. Methodological considerations as well as recommendations for future investigations in this area are discussed.

  20. A Comprehensive, Multi-modal Evaluation of the Assessment System of an Undergraduate Research Methodology Course: Translating Theory into Practice.

    PubMed

    Mohammad Abdulghani, Hamza; G Ponnamperuma, Gominda; Ahmad, Farah; Amin, Zubair

    2014-03-01

    To evaluate the assessment system of the 'Research Methodology Course' using utility criteria (i.e. validity, reliability, acceptability, educational impact, and cost-effectiveness). This study demonstrates a comprehensive evaluation of an assessment system and suggests a framework for similar courses. Qualitative and quantitative methods were used to evaluate the course assessment components (50 MCQ, 3 Short Answer Questions (SAQ) and research project) using the utility criteria. Results of multiple evaluation methods for all the assessment components were collected and interpreted together to arrive at holistic judgments, rather than judgments based on individual methods or individual assessment. Face validity, evaluated using a self-administered questionnaire (response rate-88.7%), disclosed that the students perceived that there was an imbalance in the contents covered by the assessment. This was confirmed by the assessment blueprint. Construct validity was affected by the low correlation between MCQ and SAQ scores (r=0.326). There was a higher correlation between the project and MCQ (r=0.466)/SAQ (r=0.463) scores. Construct validity was also affected by the presence of recall type of MCQs (70%; 35/50), item construction flaws and non-functioning distractors. High discriminating indices (>0.35) were found in MCQs with moderate difficulty indices (0.3-0.7). Reliability of the MCQs was 0.75 which could be improved up to 0.8 by increasing the number of MCQs to at least 70. A positive educational impact was found in the form of the research project assessment driving students to present/publish their work in conferences/peer reviewed journals. Cost per student to complete the course was US$164.50. The multi-modal evaluation of an assessment system is feasible and provides thorough and diagnostic information. Utility of the assessment system could be further improved by modifying the psychometrically inappropriate assessment items.
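
    The abstract does not say how the projected gain from 0.75 to 0.8 was derived. The figures are, however, consistent with the Spearman-Brown prophecy formula, which is assumed in the sketch below; the function names are illustrative.

```python
from math import ceil


def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def items_needed(reliability, n_items, target):
    """Smallest item count predicted to reach `target` reliability."""
    factor = target * (1 - reliability) / (reliability * (1 - target))
    return ceil(n_items * factor)


# Figures taken from the abstract: 50 MCQs with reliability 0.75.
needed = items_needed(0.75, 50, 0.80)
predicted_at_70 = spearman_brown(0.75, 70 / 50)
print(needed, round(predicted_at_70, 3))
```

    Under this assumption roughly 67 items would be required, so the recommendation of "at least 70" MCQs sits just above the predicted minimum.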

  1. A Comprehensive, Multi-modal Evaluation of the Assessment System of an Undergraduate Research Methodology Course: Translating Theory into Practice

    PubMed Central

    Mohammad Abdulghani, Hamza; G. Ponnamperuma, Gominda; Ahmad, Farah; Amin, Zubair

    2014-01-01

    Objective: To evaluate assessment system of the 'Research Methodology Course' using utility criteria (i.e. validity, reliability, acceptability, educational impact, and cost-effectiveness). This study demonstrates comprehensive evaluation of assessment system and suggests a framework for similar courses. Methods: Qualitative and quantitative methods used for evaluation of the course assessment components (50 MCQ, 3 Short Answer Questions (SAQ) and research project) using the utility criteria. Results of multiple evaluation methods for all the assessment components were collected and interpreted together to arrive at holistic judgments, rather than judgments based on individual methods or individual assessment. Results: Face validity, evaluated using a self-administered questionnaire (response rate-88.7%) disclosed that the students perceived that there was an imbalance in the contents covered by the assessment. This was confirmed by the assessment blueprint. Construct validity was affected by the low correlation between MCQ and SAQ scores (r=0.326). There was a higher correlation between the project and MCQ (r=0.466)/SAQ (r=0.463) scores. Construct validity was also affected by the presence of recall type of MCQs (70%; 35/50), item construction flaws and non-functioning distractors. High discriminating indices (>0.35) were found in MCQs with moderate difficulty indices (0.3-0.7). Reliability of the MCQs was 0.75 which could be improved up to 0.8 by increasing the number of MCQs to at least 70. A positive educational impact was found in the form of the research project assessment driving students to present/publish their work in conferences/peer reviewed journals. Cost per student to complete the course was US$164.50. Conclusions: The multi-modal evaluation of an assessment system is feasible and provides thorough and diagnostic information. Utility of the assessment system could be further improved by modifying the psychometrically inappropriate assessment items

  2. Reliability and Validity of 3 Methods of Assessing Orthopedic Resident Skill in Shoulder Surgery.

    PubMed

    Bernard, Johnathan A; Dattilo, Jonathan R; Srikumaran, Uma; Zikria, Bashir A; Jain, Amit; LaPorte, Dawn M

    Traditional measures for evaluating resident surgical technical skills (e.g., case logs) assess operative volume but not level of surgical proficiency. Our goal was to compare the reliability and validity of 3 tools for measuring surgical skill among orthopedic residents when performing 3 open surgical approaches to the shoulder. A total of 23 residents at different stages of their surgical training were tested for technical skill pertaining to 3 shoulder surgical approaches using the following measures: Objective Structured Assessment of Technical Skills (OSATS) checklists, the Global Rating Scale (GRS), and a final pass/fail assessment determined by 3 upper extremity surgeons. Adverse events were recorded. The Cronbach α coefficient was used to assess reliability of the OSATS checklists and GRS scores. Interrater reliability was calculated with intraclass correlation coefficients. Correlations among OSATS checklist scores, GRS scores, and pass/fail assessment were calculated with Spearman ρ. Validity of OSATS checklists was determined using analysis of variance with postgraduate year (PGY) as a between-subjects factor. Significance was set at p < 0.05 for all tests. Criterion validity was shown between the OSATS checklists and GRS for the 3 open shoulder approaches. Checklist scores showed superior interrater reliability compared with GRS and subjective pass/fail measurements. GRS scores were positively correlated across training years. The incidence of adverse events was significantly higher among PGY-1 and PGY-2 residents compared with more experienced residents. OSATS checklists are a valid and reliable assessment of technical skills across 3 surgical shoulder approaches. However, checklist scores do not measure quality of technique. Documenting adverse events is necessary to assess quality of technique and ultimate pass/fail status. Multiple methods of assessing surgical skill should be considered when evaluating orthopedic resident surgical performance

  3. Assessing performance and validating finite element simulations using probabilistic knowledge

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dolin, Ronald M.; Rodriguez, E. A.

    Two probabilistic approaches for assessing performance are presented. The first approach assesses probability of failure by simultaneously modeling all likely events. The probability each event causes failure along with the event's likelihood of occurrence contribute to the overall probability of failure. The second assessment method is based on stochastic sampling using an influence diagram. Latin-hypercube sampling is used to stochastically assess events. The overall probability of failure is taken as the maximum probability of failure of all the events. The Likelihood of Occurrence simulation suggests failure does not occur while the Stochastic Sampling approach predicts failure. The Likelihood of Occurrence results are used to validate finite element predictions.
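
    The record names Latin-hypercube sampling without describing it. A generic sketch of the technique on the unit hypercube is given below; it is not the authors' implementation, and the function name, the sample count and the seed are assumptions chosen for illustration.

```python
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Draw n_samples points in [0, 1)^n_dims with one point per stratum per axis."""
    columns = []
    for _ in range(n_dims):
        # Split the axis into n_samples equal strata and draw once inside each.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(points)
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


samples = latin_hypercube(10, 3, random.Random(0))
for point in samples[:3]:
    print(tuple(round(v, 3) for v in point))
```

    Compared with plain random sampling, every stratum of every input is visited exactly once, which tends to cover the input space more evenly for the same number of model runs.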

  4. Neuro-QoL health-related quality of life measurement system: Validation in Parkinson's disease.

    PubMed

    Nowinski, Cindy J; Siderowf, Andrew; Simuni, Tanya; Wortman, Catherine; Moy, Claudia; Cella, David

    2016-05-01

    Neuro-QoL is a multidimensional patient-reported outcome measurement system assessing aspects of physical, mental, and social health identified by neurology patients and caregivers as important. One of the first neurology-specific patient-reported outcome measure systems created using modern test development methods, Neuro-QoL enables brief, yet precise, assessment and the ability to conduct both PD-specific and cross-disease comparisons. We present results of Neuro-QoL clinical validation using a sample of PD patients. A total of 120 PD patients recruited from academic medical centers were assessed at baseline, 1 week, and 6 months. Assessments included Neuro-QoL and general and PD-specific validity measures. Participants were 62% male and 95% white (average age = 66); H & Y stages were 1 (16%), 2 (61%), 3 (18%), and 4 (5%). Internal consistency and test-retest reliability of Neuro-QoL ranged from Cronbach's alphas = 0.81 to 0.94 with intraclass correlation coefficients = 0.66 to 0.80. Pearson's correlations between Neuro-QoL and legacy measures were generally moderate and in expected directions. UPDRS Part 2 was moderately correlated with Neuro-QoL Upper Extremity and Mobility, respectively (r's = -0.44; -0.59). Parkinson's Disease Questionnaire-39 and Neuro-QoL measures of similar constructs showed strong-to-moderate correlations (r's = 0.70-0.44). Neuro-QoL measures of fatigue, mobility, positive emotion, and emotional/behavioral control showed responsiveness to self-reported change. Neuro-QoL is valid for use in PD clinical research. Reliability for all but two measures is sufficient for group comparisons, with some evidence supporting responsiveness to change. Neuro-QoL possesses characteristics, such as brevity, flexibility in administration, and suitability for cross-disease comparisons, that may be advantageous to users in a variety of settings. © 2016 Movement Disorder Society. © 2016 International Parkinson and Movement Disorder

  5. Reliability and criterion validity of an observation protocol for working technique assessments in cash register work.

    PubMed

    Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina

    2016-06-01

    We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observations by one observer, such as often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend the protocol to be used for educational purposes only.
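
    The acceptability rule quoted above (proportional agreement >0.7 and kappa >0.4) can be made concrete with a short sketch. It assumes categorical ratings from two observers and uses Cohen's kappa; the ratings below are hypothetical and the function names are illustrative.

```python
from collections import Counter


def proportional_agreement(a, b):
    """Share of items on which the two raters gave the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Chance-corrected agreement; undefined when expected agreement is 1."""
    n = len(a)
    po = proportional_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    pe = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (po - pe) / (1 - pe)


def acceptable(a, b):
    """Thresholds as stated in the abstract."""
    return proportional_agreement(a, b) > 0.7 and cohens_kappa(a, b) > 0.4


# Hypothetical yes/no ratings of 10 videos by two observers.
rater_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
rater_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
po = proportional_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
print(round(po, 2), round(kappa, 2), acceptable(rater_1, rater_2))
```

    Requiring both statistics guards against cases where raw agreement is high only because one category dominates.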

  6. Environmental Validation of Legionella Control in a VHA Facility Water System.

    PubMed

    Jinadatha, Chetan; Stock, Eileen M; Miller, Steve E; McCoy, William F

    2018-03-01

    OBJECTIVES We conducted this study to determine what sample volume, concentration, and limit of detection (LOD) are adequate for environmental validation of Legionella control. We also sought to determine whether time required to obtain culture results can be reduced compared to spread-plate culture method. We also assessed whether polymerase chain reaction (PCR) and in-field total heterotrophic aerobic bacteria (THAB) counts are reliable indicators of Legionella in water samples from buildings. DESIGN Comparative Legionella screening and diagnostics study for environmental validation of a healthcare building water system. SETTING Veterans Health Administration (VHA) facility water system in central Texas. METHODS We analyzed 50 water samples (26 hot, 24 cold) from 40 sinks and 10 showers using spread-plate cultures (International Standards Organization [ISO] 11731) on samples shipped overnight to the analytical lab. In-field, on-site cultures were obtained using the PVT (Phigenics Validation Test) culture dipslide-format sampler. A PCR assay for genus-level Legionella was performed on every sample. RESULTS No practical differences regardless of sample volume filtered were observed. Larger sample volumes yielded more detections of Legionella. No statistically significant differences at the 1 colony-forming unit (CFU)/mL or 10 CFU/mL LOD were observed. Approximately 75% less time was required when cultures were started in the field. The PCR results provided an early warning, which was confirmed by spread-plate cultures. The THAB results did not correlate with Legionella status. CONCLUSIONS For environmental validation at this facility, we confirmed that (1) 100 mL sample volumes were adequate, (2) 10× concentrations were adequate, (3) 10 CFU/mL LOD was adequate, (4) in-field cultures reliably reduced time to get results by 75%, (5) PCR provided a reliable early warning, and (6) THAB was not predictive of Legionella results. 
Infect Control Hosp Epidemiol 2018;39:259-266.

  7. Verification and Validation for Flight-Critical Systems (VVFCS)

    NASA Technical Reports Server (NTRS)

    Graves, Sharon S.; Jacobsen, Robert A.

    2010-01-01

    On March 31, 2009 a Request for Information (RFI) was issued by NASA's Aviation Safety Program to gather input on the subject of Verification and Validation (V & V) of Flight-Critical Systems. The responses were provided to NASA on or before April 24, 2009. The RFI asked for comments in three topic areas: Modeling and Validation of New Concepts for Vehicles and Operations; Verification of Complex Integrated and Distributed Systems; and Software Safety Assurance. There were a total of 34 responses to the RFI, representing a cross-section of academic (26%), small & large industry (47%) and government agency (27%).

  8. Statistically Validated Networks in Bipartite Complex Systems

    PubMed Central

    Tumminello, Michele; Miccichè, Salvatore; Lillo, Fabrizio; Piilo, Jyrki; Mantegna, Rosario N.

    2011-01-01

    Many complex systems present an intrinsic bipartite structure where elements of one set link to elements of the second set. In these complex systems, such as the system of actors and movies, elements of one set are qualitatively different than elements of the other set. The properties of these complex systems are typically investigated by constructing and analyzing a projected network on one of the two sets (for example the actor network or the movie network). Complex systems are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set, and this heterogeneity makes it very difficult to discriminate links of the projected network that are just reflecting system's heterogeneity from links relevant to unveil the properties of the system. Here we introduce an unsupervised method to statistically validate each link of a projected network against a null hypothesis that takes into account system heterogeneity. We apply the method to a biological, an economic and a social complex system. The method we propose is able to detect network structures which are very informative about the organization and specialization of the investigated systems, and identifies those relationships between elements of the projected network that cannot be explained simply by system heterogeneity. We also show that our method applies to bipartite systems in which different relationships might have different qualitative nature, generating statistically validated networks in which such difference is preserved. PMID:21483858
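
    The abstract does not state the form of the null hypothesis. One common way to test a projected link while accounting for heterogeneous degrees is a hypergeometric co-occurrence test with a multiple-comparison correction; the sketch below assumes that choice and a Bonferroni threshold, and all names and numbers in it are illustrative.

```python
from math import comb


def hypergeom_pvalue(n_total, n_a, n_b, n_ab):
    """P(X >= n_ab): chance that elements A and B share at least n_ab neighbours.

    n_total: size of the opposite set; n_a, n_b: degrees of A and B.
    """
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, x) * comb(n_total - n_a, n_b - x)
        for x in range(n_ab, upper + 1)
    )
    return favourable / comb(n_total, n_b)


def is_validated(p_value, n_tests, alpha=0.01):
    """Bonferroni-corrected decision for one link among n_tests tested links."""
    return p_value < alpha / n_tests


# Hypothetical example: two actors with 3 films each out of 10, sharing all 3.
p = hypergeom_pvalue(10, 3, 3, 3)
print(p, is_validated(p, n_tests=1), is_validated(p, n_tests=45))
```

    The same co-occurrence count can pass as a single test yet fail once the threshold is corrected for the number of links examined, which is the point of validating each link statistically.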

  9. Development and Validation of a Multimedia-Based Assessment of Scientific Inquiry Abilities

    ERIC Educational Resources Information Center

    Kuo, Che-Yu; Wu, Hsin-Kai; Jen, Tsung-Hau; Hsu, Ying-Shao

    2015-01-01

    The potential of computer-based assessments for capturing complex learning outcomes has been discussed; however, relatively little is understood about how to leverage such potential for summative and accountability purposes. The aim of this study is to develop and validate a multimedia-based assessment of scientific inquiry abilities (MASIA) to…

  10. Design, validation, and use of an evaluation instrument for monitoring systemic reform

    NASA Astrophysics Data System (ADS)

    Scantlebury, Kathryn; Boone, William; Butler Kahle, Jane; Fraser, Barry J.

    2001-08-01

    Over the past decade, state and national policymakers have promoted systemic reform as a way to achieve high-quality science education for all students. However, few instruments are available to measure changes in key dimensions relevant to systemic reform such as teaching practices, student attitudes, or home and peer support. Furthermore, Rasch methods of analysis are needed to permit valid comparison of different cohorts of students during different years of a reform effort. This article describes the design, development, validation, and use of an instrument that measures student attitudes and several environment dimensions (standards-based teaching, home support, and peer support) using a three-step process that incorporated expert opinion, factor analysis, and item response theory. The instrument was validated with over 8,000 science and mathematics students, taught by more than 1,000 teachers in over 200 schools as part of a comprehensive assessment of the effectiveness of Ohio's systemic reform initiative. When the new four-factor, 20-item questionnaire was used to explore the relative influence of the class, home, and peer environment on student achievement and attitudes, findings were remarkably consistent across 3 years and different units and methods of analysis. All three environments accounted for unique variance in student attitudes, but only the environment of the class accounted for unique variance in student achievement. However, the class environment (standards-based teaching practices) was the strongest independent predictor of both achievement and attitude, and appreciable amounts of the total variance in attitudes were common to the three environments.

  11. Convergent and Discriminant Validity of the Microcomputer Evaluation Screening and Assessment (MESA) Interest Survey.

    ERIC Educational Resources Information Center

    Janikowski, Timothy P.; And Others

    1990-01-01

    Examined construct validity of Microcomputer Evaluation Screening and Assessment (MESA) Interest Survey. Administered MESA and United States Employment Service (USES) Interest Inventory to 74 volunteer rehabilitation clients. Evidence supported convergent and discriminant validity of MESA. Found fewer significant intercorrelations among MESA…

  12. Relating Knowledge about Reading to Teaching Practice: An Exploratory Validity Study of a Teacher Knowledge Assessment

    ERIC Educational Resources Information Center

    Phelps, Geoffrey; Johnson, David; Carlisle, Joanne

    2009-01-01

    The research reported in this paper is focused directly on assessing the validity of the "Teaching Knowledge about Reading and Reading Practices" (TKRRP) assessment. Following the recommendations of the Standards for Educational and Psychological Testing (APA/AERA, 1999), the authors see validation as a process of constructing an…

  13. 45 CFR 95.626 - Independent Verification and Validation.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 45 Public Welfare 1 2013-10-01 2013-10-01 false Independent Verification and Validation. 95.626... (FFP) Specific Conditions for Ffp § 95.626 Independent Verification and Validation. (a) An assessment for independent verification and validation (IV&V) analysis of a State's system development effort may...

  14. 45 CFR 95.626 - Independent Verification and Validation.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 45 Public Welfare 1 2014-10-01 2014-10-01 false Independent Verification and Validation. 95.626... (FFP) Specific Conditions for Ffp § 95.626 Independent Verification and Validation. (a) An assessment for independent verification and validation (IV&V) analysis of a State's system development effort may...

  15. 45 CFR 95.626 - Independent Verification and Validation.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 45 Public Welfare 1 2011-10-01 2011-10-01 false Independent Verification and Validation. 95.626... (FFP) Specific Conditions for Ffp § 95.626 Independent Verification and Validation. (a) An assessment for independent verification and validation (IV&V) analysis of a State's system development effort may...

  16. 45 CFR 95.626 - Independent Verification and Validation.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 45 Public Welfare 1 2012-10-01 2012-10-01 false Independent Verification and Validation. 95.626... (FFP) Specific Conditions for Ffp § 95.626 Independent Verification and Validation. (a) An assessment for independent verification and validation (IV&V) analysis of a State's system development effort may...

  17. Development and validation of a risk assessment tool for gastric cancer in a general Japanese population.

    PubMed

    Iida, Masahiro; Ikeda, Fumie; Hata, Jun; Hirakawa, Yoichiro; Ohara, Tomoyuki; Mukai, Naoko; Yoshida, Daigo; Yonemoto, Koji; Esaki, Motohiro; Kitazono, Takanari; Kiyohara, Yutaka; Ninomiya, Toshiharu

    2018-05-01

    There have been very few reports of risk score models for the development of gastric cancer. The aim of this study was to develop and validate a risk assessment tool for discerning future gastric cancer risk in Japanese. A total of 2444 subjects aged 40 years or over were followed up for 14 years from 1988 (derivation cohort), and 3204 subjects of the same age group were followed up for 5 years from 2002 (validation cohort). The weighting (risk score) of each risk factor for predicting future gastric cancer in the risk assessment tool was determined based on the coefficients of a Cox proportional hazards model in the derivation cohort. The goodness of fit of the established risk assessment tool was assessed using the c-statistic and the Hosmer-Lemeshow test in the validation cohort. During the follow-up, gastric cancer developed in 90 subjects in the derivation cohort and 35 subjects in the validation cohort. In the derivation cohort, the risk prediction model for gastric cancer was established using significant risk factors: age, sex, the combination of Helicobacter pylori antibody and pepsinogen status, hemoglobin A1c level, and smoking status. The incidence of gastric cancer increased significantly as the sum of risk scores increased (P trend < 0.001). The risk assessment tool was validated internally and showed good discrimination (c-statistic = 0.76) and calibration (Hosmer-Lemeshow test P = 0.43) in the validation cohort. We developed a risk assessment tool for gastric cancer that provides a useful guide for stratifying an individual's risk of future gastric cancer.
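
    Discrimination is summarised above by the c-statistic. A simplified sketch for a binary outcome follows: it counts, over all case/non-case pairs, how often the case has the higher risk score. It ignores follow-up time and censoring, which a survival-model c-statistic would handle, and the scores below are hypothetical.

```python
def c_statistic(scores, outcomes):
    """Share of (case, non-case) pairs ranked correctly; ties count as half."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical risk scores; outcome 1 = developed the disease.
risk_scores = [5, 3, 4, 1, 2, 3]
developed = [1, 1, 0, 0, 0, 0]
c = c_statistic(risk_scores, developed)
print(c)
```

    A value of 0.5 corresponds to chance ranking and 1.0 to perfect separation, so the reported 0.76 indicates that the score usually ranks future cases above non-cases.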

  18. Reliability and Validity of the Footprint Assessment Method Using Photoshop CS5 Software in Young People with Down Syndrome.

    PubMed

    Gutiérrez-Vilahú, Lourdes; Massó-Ortigosa, Núria; Rey-Abella, Ferran; Costa-Tutusaus, Lluís; Guerra-Balic, Myriam

    2016-05-01

    People with Down syndrome present skeletal abnormalities in their feet that can be analyzed by commonly used gold standard indices (the Hernández-Corvo index, the Chippaux-Smirak index, the Staheli arch index, and the Clarke angle) based on footprint measurements. The use of Photoshop CS5 software (Adobe Systems Software Ireland Ltd, Dublin, Ireland) to measure footprints has been validated in the general population. The present study aimed to assess the reliability and validity of this footprint assessment technique in the population with Down syndrome. Using optical podography and photography, 44 footprints from 22 patients with Down syndrome (11 men [mean ± SD age, 23.82 ± 3.12 years] and 11 women [mean ± SD age, 24.82 ± 6.81 years]) were recorded in a static bipedal standing position. A blinded observer performed the measurements using a validated manual method three times during the 4-month study, with 2 months between measurements. Test-retest was used to check the reliability of the Photoshop CS5 software measurements. Validity and reliability were obtained by intraclass correlation coefficient (ICC). The reliability test for all of the indices showed very good values for the Photoshop CS5 method (ICC, 0.982-0.995). Validity testing also found no differences between the techniques (ICC, 0.988-0.999). The Photoshop CS5 software method is reliable and valid for the study of footprints in young people with Down syndrome.

  19. Predictive Validity of Measures of the Pathfinder Scaling Algorithm on Programming Performance: Alternative Assessment Strategy for Programming Education

    ERIC Educational Resources Information Center

    Lau, Wilfred W. F.; Yuen, Allan H. K.

    2009-01-01

    Recent years have seen a shift in focus from assessment of learning to assessment for learning and the emergence of alternative assessment methods. However, the reliability and validity of these methods as assessment tools are still questionable. In this article, we investigated the predictive validity of measures of the Pathfinder Scaling…

  20. Feasibility and validity of animal-based indicators for on-farm welfare assessment of thermal stress in dairy goats

    NASA Astrophysics Data System (ADS)

    Battini, Monica; Barbieri, Sara; Fioni, Luna; Mattiello, Silvana

    2016-02-01

    This investigation tested the feasibility and validity of indicators of cold and heat stress in dairy goats for on-farm welfare assessment protocols. The study was performed on two intensive dairy farms in Italy. Two different 3-point scale (0-2) scoring systems were applied to assess cold and heat stress. Cold and heat stress scores were visually assessed from outside the pen in the morning, afternoon and evening in January-February, April-May and July 2013 for a total of nine sessions of observations per farm. Temperature (°C), relative humidity (%) and wind speed (km/h) were recorded and Thermal Heat Index (THI) was calculated. The sessions were allocated to three climatic seasons, depending on THI ranges: cold (<50), neutral (50-65) and hot (>65). Score 2 was rarely assessed; therefore, scores 1 and 2 were aggregated for statistical analysis. The number of goats suffering from cold stress was significantly higher in the cold season than in the neutral (P < 0.01) and hot (P < 0.001) seasons. Signs of heat stress were recorded only in the hot season (P < 0.001). The visual assessment from outside the pen confirms the on-farm feasibility of both indicators: no constraint was found and the time required was less than 10 min. Our results show that cold and heat stress scores are valid indicators to detect thermal stress in intensively managed dairy goats. The use of a binary scoring system (presence/absence), merging scores 1 and 2, may be a further refinement to improve the feasibility. This study also allows the prediction of optimal ranges of THI for dairy goat breeds in intensive husbandry systems, setting a comfort zone between 55 and 70.
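
    The season allocation and the binary scoring described above can be sketched as follows. The THI cut-offs (<50, 50-65, >65) and the merging of scores 1 and 2 come from the abstract; the function names are hypothetical, and the THI formula itself is not given in the abstract, so THI is taken here as an already computed input.

```python
def classify_season(thi):
    """Allocate an observation session to a climatic season from its THI value."""
    if thi < 50:
        return "cold"
    if thi <= 65:
        return "neutral"
    return "hot"


def to_binary_score(score):
    """Collapse the 3-point (0-2) stress score into absence (0) or presence (1)."""
    if score not in (0, 1, 2):
        raise ValueError("score must be 0, 1 or 2")
    return 1 if score >= 1 else 0


# Hypothetical sessions: (THI, stress score on the 0-2 scale).
sessions = [(42.0, 1), (58.5, 0), (71.3, 2)]
summary = [(classify_season(thi), to_binary_score(s)) for thi, s in sessions]
```

    Treating the boundary values 50 and 65 as neutral follows the ranges as printed in the abstract.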

  1. Validation of On-board Cloud Cover Assessment Using EO-1

    NASA Technical Reports Server (NTRS)

    Mandl, Dan; Miller, Jerry; Griffin, Michael; Burke, Hsiao-hua

    2003-01-01

    The purpose of this NASA Earth Science Technology Office funded effort was to flight validate an on-board cloud detection algorithm and to determine the performance that can be achieved with a Mongoose V flight computer. This validation was performed on the EO-1 satellite, which is operational, by uploading new flight code to perform the cloud detection. The algorithm was developed by MIT/Lincoln Lab and is based on the use of the Hyperion hyperspectral instrument using selected spectral bands from 0.4 to 2.5 microns. The Technology Readiness Level (TRL) of this technology at the beginning of the task was level 5 and was TRL 6 upon completion. In the final validation, an 8 second (0.75 Gbytes) Hyperion image was processed on-board and assessed for percentage cloud cover within 30 minutes. It was expected to take many hours and perhaps a day considering that the Mongoose V is only a 6-8 MIP machine in performance. To accomplish this test, the image taken had to have level 0 and level 1 processing performed on-board before the cloud algorithm was applied. For almost all of the ground test cases and all of the flight cases, the cloud assessment was within 5% of the correct value and in most cases within 1-2%.

  2. Organizational Systems Questionnaire (OSQ) Validity Study

    ERIC Educational Resources Information Center

    Billings, James C.; Kimball, Thomas G.; Shumway, Sterling T.; Korinek, Alan W.

    2007-01-01

    Marriage and family therapists (MFTs), who are trained in systems theory and consult with complex and difficult systems (e.g., couples and families), are uniquely suited to both assess and intervene in broader organizational systems. However, MFTs are in need of more systemically designed assessment tools to guide and inform their interventions…

  3. [Development and validation of an instrument for initial nursing assessment].

    PubMed

    Fernández-Sola, Cayetano; Granero-Molina, José; Mollinedo-Mallea, Judith; de Gonzales, María Hilda Peredo; Aguilera-Manrique, Gabriel; Ponce, Mara Luna

    2012-12-01

    The objective of this study, conducted in Bolivia from April to July of 2008, is the design and validation of an initial nursing assessment instrument to be used in clinical and educational environments in Santa Cruz (Bolivia). Twelve Bolivian nurses participated; both document analysis as well as consensus techniques were used to determine the categories and criteria to be assessed. Categories included in the nursing assessment instrument are a physical assessment and the eleven Gordon's Functional Health Patterns. The nursing assessment instrument stands out as being concise, easy to complete and utilizing a nursing approach. It does not include items for advanced nursing assessment. However, it incorporates items regarding lifestyle and the patient's autonomy. The nursing assessment instrument contributes to improving the quality of clinical records, supports the nursing diagnosis and implementation of the nursing process, promotes the nurse's role and helps to standardize practice.

  4. The Predictive Validity of Interim Assessment Scores Based on the Full-Information Bifactor Model for the Prediction of End-of-Grade Test Performance

    ERIC Educational Resources Information Center

    Immekus, Jason C.; Atitya, Ben

    2016-01-01

    Interim tests are a central component of district-wide assessment systems, yet their technical quality to guide decisions (e.g., instructional) has been repeatedly questioned. In response, the study purpose was to investigate the validity of a series of English Language Arts (ELA) interim assessments in terms of dimensionality and prediction of…

  5. Validation of a wireless modular monitoring system for structures

    NASA Astrophysics Data System (ADS)

    Lynch, Jerome P.; Law, Kincho H.; Kiremidjian, Anne S.; Carryer, John E.; Kenny, Thomas W.; Partridge, Aaron; Sundararajan, Arvind

    2002-06-01

    A wireless sensing unit for use in a Wireless Modular Monitoring System (WiMMS) has been designed and constructed. Drawing upon advanced technological developments in the areas of wireless communications, low-power microprocessors and micro-electro mechanical system (MEMS) sensing transducers, the wireless sensing unit represents a high-performance yet low-cost solution to monitoring the short-term and long-term performance of structures. A sophisticated reduced instruction set computer (RISC) microcontroller is placed at the core of the unit to accommodate on-board computations, measurement filtering and data interrogation algorithms. The functionality of the wireless sensing unit is validated through various experiments involving multiple sensing transducers interfaced to the sensing unit. In particular, MEMS-based accelerometers are used as the primary sensing transducer in this study's validation experiments. A five degree of freedom scaled test structure mounted upon a shaking table is employed for system validation.

  6. Local Assessment: Using Genre Analysis to Validate Directed Self-Placement

    ERIC Educational Resources Information Center

    Gere, Anne Ruggles; Aull, Laura; Escudero, Moises Damian Perales; Lancaster, Zak; Lei, Elizabeth Vander

    2013-01-01

    Grounded in the principle that writing assessment should be locally developed and controlled, this article describes a study that contextualizes and validates the decisions that students make in the modified Directed Self-Placement (DSP) process used at the University of Michigan. The authors present results of a detailed text analysis of…

  7. Donabedian's structure-process-outcome quality of care model: Validation in an integrated trauma system.

    PubMed

    Moore, Lynne; Lavoie, André; Bourgeois, Gilles; Lapointe, Jean

    2015-06-01

    According to Donabedian's health care quality model, improvements in the structure of care should lead to improvements in clinical processes that should in turn improve patient outcome. This model has been widely adopted by the trauma community but has not yet been validated in a trauma system. The objective of this study was to assess the performance of an integrated trauma system in terms of structure, process, and outcome and evaluate the correlation between quality domains. Quality of care was evaluated for patients treated in a Canadian provincial trauma system (2005-2010; 57 centers, n = 63,971) using quality indicators (QIs) developed and validated previously. Structural performance was measured by transposing on-site accreditation visit reports onto an evaluation grid according to American College of Surgeons criteria. The composite process QI was calculated as the average sum of proportions of conformity to 15 process QIs derived from literature review and expert opinion. Outcome performance was measured using risk-adjusted rates of mortality, complications, and readmission as well as hospital length of stay (LOS). Correlation was assessed with Pearson's correlation coefficients. Statistically significant correlations were observed between structure and process QIs (r = 0.33), and process and outcome QIs (r = -0.33 for readmission, r = -0.27 for LOS). Significant positive correlations were also observed between outcome QIs (r = 0.37 for mortality-readmission; r = 0.39 for mortality-LOS and readmission-LOS; r = 0.45 for mortality-complications; r = 0.34 for readmission-complications; r = 0.63 for complications-LOS). Significant correlations between quality domains observed in this study suggest that Donabedian's structure-process-outcome model is a valid model for evaluating trauma care. Trauma centers that perform well in terms of structure also tend to perform well in terms of clinical processes, which in turn has a favorable influence on patient outcomes

  8. Reliability and Validity of the Korean Cancer Pain Assessment Tool (KCPAT)

    PubMed Central

    Kim, Jeong A; Lee, Juneyoung; Park, Jeanno; Lee, Myung Ah; Yeom, Chang Hwan; Jang, Se Kwon; Yoon, Duck Mi; Kim, Jun Suk

    2005-01-01

    The Korean Cancer Pain Assessment Tool (KCPAT), which was developed in 2003, consists of questions concerning the location of pain, the nature of pain, the present pain intensity, the symptoms associated with the pain, and psychosocial/spiritual pain assessments. This study was carried out to evaluate the reliability and validity of the KCPAT. A stratified, proportional-quota, clustered, systematic sampling procedure was used. The study population (903 cancer patients) was 1% of the target population (90,252 cancer patients). A total of 314 (34.8%) questionnaires were collected. The results showed that the average pain score (5-point Likert scale) according to the cancer type and the at-present average pain score (VAS, 0-10) were correlated (r = 0.56, p < 0.0001) and showed moderate agreement (kappa = 0.364). The mean satisfaction score was 3.8 (range 1-5). The average time to complete the questionnaire was 8.9 min. In conclusion, the KCPAT is a reliable and valid instrument for assessing cancer pain in Koreans. PMID:16224166

  9. Validity and reliability of a video questionnaire to assess physical function in older adults.

    PubMed

    Balachandran, Anoop; N Verduin, Chelsea; Potiaumpai, Melanie; Ni, Meng; Signorile, Joseph F

    2016-08-01

    Self-report questionnaires are widely used to assess physical function in older adults. However, they often lack a clear frame of reference and hence interpreting and rating task difficulty levels can be problematic for the responder. Consequently, the usefulness of traditional self-report questionnaires for assessing higher-level functioning is limited. Video-based questionnaires can overcome some of these limitations by offering a clear and objective visual reference for the performance level against which the subject is to compare his or her perceived capacity. Hence the purpose of the study was to develop and validate a novel, video-based questionnaire to assess physical function in older adults independently living in the community. A total of 61 community-living adults, 60 years or older, were recruited. To examine validity, 35 of the subjects completed the video questionnaire and two types of physical performance tests: a test of instrumental activity of daily living (IADL) included in the Short Physical Functional Performance battery (PFP-10), and a composite of 3 performance tests (30-s chair stand, single-leg balance and usual gait speed). To ascertain reliability, two-week test-retest reliability was assessed in the remaining 26 subjects who did not participate in validity testing. The video questionnaire showed a moderate correlation with the IADLs (Spearman rho = 0.64, p < 0.001; 95% CI (0.4, 0.8)), and a lower correlation with the composite score of physical performance tests (Spearman rho = 0.49, p < 0.01; 95% CI (0.18, 0.7)). The test-retest assessment yielded an intra-class correlation (ICC) of 0.87 (p < 0.001; 95% CI (0.70, 0.94)) and a Cronbach's alpha of 0.89, demonstrating good reliability and internal consistency. Our results show that the video questionnaire developed to evaluate physical function in community-living older adults is a valid and reliable assessment tool; however, further validation is needed for definitive conclusions. Copyright © 2016
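
    The internal-consistency statistic quoted above (Cronbach's alpha) can be computed from item-level responses as in the sketch below. It uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the example data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item questionnaire answered by three respondents.
example_alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

    Items that move together across respondents push alpha toward 1; unrelated items pull it toward 0.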

  10. Assessing Students' Understanding of Macroevolution: Concerns regarding the validity of the MUM

    NASA Astrophysics Data System (ADS)

    Novick, Laura R.; Catley, Kefyn M.

    2012-11-01

    In a recent article, Nadelson and Southerland (2010. Development and preliminary evaluation of the Measure of Understanding of Macroevolution: Introducing the MUM. The Journal of Experimental Education, 78, 151-190) reported on their development of a multiple-choice concept inventory intended to assess college students' understanding of macroevolutionary concepts, the Measure of Understanding Macroevolution (MUM). Given that the only existing evolution inventories assess understanding of natural selection, a microevolutionary concept, a valid assessment of students' understanding of macroevolution would be a welcome and necessary addition to the field of science education. Although the conceptual framework underlying Nadelson and Southerland's test is promising, we believe the test has serious shortcomings with respect to validity evidence for the construct being tested. We argue and provide evidence that these problems are serious enough that the MUM should not be used in its current form to measure students' understanding of macroevolution.

  11. UAS-Systems Integration, Validation, and Diagnostics Simulation Capability

    NASA Technical Reports Server (NTRS)

    Buttrill, Catherine W.; Verstynen, Harry A.

    2014-01-01

    As part of the Phase 1 efforts of NASA's UAS-in-the-NAS Project a task was initiated to explore the merits of developing a system simulation capability for UAS to address airworthiness certification requirements. The core of the capability would be a software representation of an unmanned vehicle, including all of the relevant avionics and flight control system components. The specific system elements could be replaced with hardware representations to provide Hardware-in-the-Loop (HWITL) test and evaluation capability. The UAS Systems Integration and Validation Laboratory (UAS-SIVL) was created to provide a UAS-systems integration, validation, and diagnostics hardware-in-the-loop simulation capability. This paper discusses how SIVL provides a robust and flexible simulation framework that permits the study of failure modes, effects, propagation paths, criticality, and mitigation strategies to help develop safety, reliability, and design data that can assist with the development of certification standards, means of compliance, and design best practices for civil UAS.

  12. Validation Methods Research for Fault-Tolerant Avionics and Control Systems: Working Group Meeting, 2

    NASA Technical Reports Server (NTRS)

    Gault, J. W. (Editor); Trivedi, K. S. (Editor); Clary, J. B. (Editor)

    1980-01-01

    The validation process comprises the activities required to ensure the agreement of system realization with system specification. A preliminary validation methodology for fault-tolerant systems is documented. A general framework for a validation methodology is presented along with a set of specific tasks intended for the validation of two specimen systems, SIFT and FTMP. Two major areas of research are identified: first, those activities required to support the ongoing development of the validation process itself, and second, those activities required to support the design, development, and understanding of fault-tolerant systems.

  13. Fit for purpose and modern validity theory in clinical outcomes assessment.

    PubMed

    Edwards, Michael C; Slagle, Ashley; Rubright, Jonathan D; Wirth, R J

    2018-07-01

    The US Food and Drug Administration (FDA), as part of its regulatory mission, is charged with determining whether a clinical outcome assessment (COA) is "fit for purpose" when used in clinical trials to support drug approval and product labeling. In this paper, we will provide a review (and some commentary) on the current state of affairs in COA development/evaluation/use with a focus on one aspect: How do you know you are measuring the right thing? In the psychometric literature, this concept is referred to broadly as validity and has itself evolved over many years of research and application. After a brief introduction, the first section will review current ideas about "fit for purpose" and how it has been viewed by FDA. This section will also describe some of the unique challenges to COA development/evaluation/use in the clinical trials space. Following this, we provide an overview of modern validity theory as it is currently understood in the psychometric tradition. This overview will focus primarily on the perspective of validity theorists such as Messick and Kane whose work forms the backbone for the bulk of high-stakes assessment in areas such as education, psychology, and health outcomes. We situate the concept of fit for purpose within the broader context of validity. By comparing and contrasting the approaches and the situations where they have traditionally been applied, we identify areas of conceptual overlap as well as areas where more discussion and research are needed.

  14. Validity and reliability of Patient-Reported Outcomes Measurement Information System (PROMIS) Instruments in Osteoarthritis

    PubMed Central

    Broderick, Joan E.; Schneider, Stefan; Junghaenel, Doerte U.; Schwartz, Joseph E.; Stone, Arthur A.

    2013-01-01

    Objective: Evaluation of known group validity, ecological validity, and test-retest reliability of four domain instruments from the Patient-Reported Outcomes Measurement Information System (PROMIS) in osteoarthritis (OA) patients. Methods: Recruitment of an osteoarthritis sample and a comparison general population (GP) through an Internet survey panel. Pain intensity, pain interference, physical functioning, and fatigue were assessed for 4 consecutive weeks with PROMIS short forms on a daily basis and compared with same-domain Computer Adaptive Test (CAT) instruments that use a 7-day recall. Known group validity (comparison of OA and GP), ecological validity (comparison of aggregated daily measures with CATs), and test-retest reliability were evaluated. Results: The recruited samples matched (age, sex, race, ethnicity) the demographic characteristics of the U.S. sample for arthritis and the 2009 Census for the GP. Compliance with repeated measurements was excellent: > 95%. Known group validity for CATs was demonstrated with large effect sizes (pain intensity: 1.42, pain interference: 1.25, and fatigue: .85). Ecological validity was also established through high correlations between aggregated daily measures and weekly CATs (≥ .86). Test-retest validity (7-day) was very good (≥ .80). Conclusion: PROMIS CAT instruments demonstrated known group and ecological validity in a comparison of osteoarthritis patients with a general population sample. Adequate test-retest reliability was also observed. These data provide encouraging initial data on the utility of these PROMIS instruments for clinical and research outcomes in osteoarthritis patients. PMID:23592494

  15. Establishment and Validation of GV-SAPS II Scoring System for Non-Diabetic Critically Ill Patients

    PubMed Central

    Liu, Wen-Yue; Lin, Shi-Gang; Zhu, Gui-Qi; Poucke, Sven Van; Braddock, Martin; Zhang, Zhongheng; Mao, Zhi; Shen, Fei-Xia

    2016-01-01

    Background and Aims: Recently, glucose variability (GV) has been reported as an independent risk factor for mortality in non-diabetic critically ill patients. However, GV is not currently incorporated in any severity scoring system for critically ill patients. The aim of this study was to establish and validate a modified Simplified Acute Physiology Score II scoring system (SAPS II), integrated with GV parameters and named GV-SAPS II, specifically for non-diabetic critically ill patients to predict short-term and long-term mortality. Methods: Training and validation cohorts were extracted from the Multiparameter Intelligent Monitoring in Intensive Care database III version 1.3 (MIMIC-III v1.3). The GV-SAPS II score was constructed by Cox proportional hazard regression analysis and compared with the original SAPS II, Sepsis-related Organ Failure Assessment Score (SOFA) and Elixhauser scoring systems using the area under the receiver operating characteristic curve (auROC). Results: 4,895 and 5,048 eligible individuals were included in the training and validation cohorts, respectively. The GV-SAPS II score was established with four independent risk factors, including hyperglycemia, hypoglycemia, standard deviation of blood glucose levels (GluSD), and SAPS II score. In the validation cohort, the auROC values of the new scoring system were 0.824 (95% CI: 0.813–0.834, P < 0.001) and 0.738 (95% CI: 0.725–0.750, P < 0.001) for 30 days and 9 months, respectively, which were significantly higher than those of the other models used in our study (all P < 0.001). Moreover, Kaplan-Meier plots demonstrated significantly worse outcomes in higher GV-SAPS II score groups for both the 30-day and 9-month mortality endpoints (all P < 0.001). Conclusions: We established and validated a modified prognostic scoring system that integrated glucose variability for non-diabetic critically ill patients, named GV-SAPS II. It demonstrated a superior prognostic capability and may be an optimal scoring system

  16. Establishment and Validation of GV-SAPS II Scoring System for Non-Diabetic Critically Ill Patients.

    PubMed

    Liu, Wen-Yue; Lin, Shi-Gang; Zhu, Gui-Qi; Poucke, Sven Van; Braddock, Martin; Zhang, Zhongheng; Mao, Zhi; Shen, Fei-Xia; Zheng, Ming-Hua

    2016-01-01

    Recently, glucose variability (GV) has been reported as an independent risk factor for mortality in non-diabetic critically ill patients. However, GV is not currently incorporated in any severity scoring system for critically ill patients. The aim of this study was to establish and validate a modified Simplified Acute Physiology Score II scoring system (SAPS II), integrated with GV parameters and named GV-SAPS II, specifically for non-diabetic critically ill patients to predict short-term and long-term mortality. Training and validation cohorts were extracted from the Multiparameter Intelligent Monitoring in Intensive Care database III version 1.3 (MIMIC-III v1.3). The GV-SAPS II score was constructed by Cox proportional hazard regression analysis and compared with the original SAPS II, Sepsis-related Organ Failure Assessment Score (SOFA) and Elixhauser scoring systems using the area under the receiver operating characteristic curve (auROC). 4,895 and 5,048 eligible individuals were included in the training and validation cohorts, respectively. The GV-SAPS II score was established with four independent risk factors, including hyperglycemia, hypoglycemia, standard deviation of blood glucose levels (GluSD), and SAPS II score. In the validation cohort, the auROC values of the new scoring system were 0.824 (95% CI: 0.813-0.834, P < 0.001) and 0.738 (95% CI: 0.725-0.750, P < 0.001) for 30 days and 9 months, respectively, which were significantly higher than those of the other models used in our study (all P < 0.001). Moreover, Kaplan-Meier plots demonstrated significantly worse outcomes in higher GV-SAPS II score groups for both the 30-day and 9-month mortality endpoints (all P < 0.001). We established and validated a modified prognostic scoring system that integrated glucose variability for non-diabetic critically ill patients, named GV-SAPS II. It demonstrated a superior prognostic capability and may be an optimal scoring system for prognostic evaluation in this patient group.

  17. Establishing the Validity of the Personality Assessment Inventory Drug and Alcohol Scales in a Corrections Sample

    ERIC Educational Resources Information Center

    Patry, Marc W.; Magaletta, Philip R.; Diamond, Pamela M.; Weinman, Beth A.

    2011-01-01

    Although not originally designed for implementation in correctional settings, researchers and clinicians have begun to use the Personality Assessment Inventory (PAI) to assess offenders. A relatively small number of studies have made attempts to validate the alcohol and drug abuse scales of the PAI, and only a very few studies have validated those…

  18. A Calculus of Occupational Skill Attainment: Building More Validity into a Valid Assessment System

    ERIC Educational Resources Information Center

    Munyofu, Paul; Kohr, Richard

    2009-01-01

    This study investigated several aspects of occupational skill assessment as implemented in one state: (1) What is the extent to which student achievement on the cognitive component was related to their achievement on the psychomotor component of the technical skill assessments? (2) How efficiently was their overall composite attainment calculated?…

  19. Brief report: improving the validity of assessments of adolescents' feelings of privacy invasion.

    PubMed

    Laird, Robert D; Marrero, Matthew D; Melching, Jessica; Kuhn, Emily S

    2013-02-01

    Studies of privacy invasion have relied on measures that combine items assessing adolescents' feelings of privacy invasion with items assessing parents' monitoring behaviors. Removing items assessing parents' monitoring behaviors may improve the validity of assessments of privacy invasion. Data were collected from 163 adolescents (M age 13 years, 5 months; 47% female; 50% European American, non-Hispanic, 46% African American) and their mothers. A model specifying separate factors for privacy invasion and monitoring behavior fit adolescent-reported and parent-reported data significantly better than a single factor model. Although privacy invasion and monitoring behavior were positively associated, privacy invasion and monitoring behavior correlations were significantly different from one another across all ten variables reported by adolescents and across eight of the nine variables reported by mothers. The pattern of results strongly supports a recommendation for researchers to exclude items assessing monitoring behaviors to provide a more valid assessment of privacy invasion. Copyright © 2012 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.

  20. Validity of Montreal Cognitive Assessment in non-English speaking patients with Parkinson's disease.

    PubMed

    Krishnan, Syam; Justus, Sunitha; Meluveettil, Radhamani; Menon, Ramshekhar N; Sarma, Sankara P; Kishore, Asha

    2015-01-01

    The Montreal Cognitive Assessment is a brief and easy screening tool for accurately testing cognitive dysfunction in Parkinson's disease. We tested its validity for use in non-English (Malayalam) speaking patients with Parkinson's disease. We developed a Malayalam (a south-Indian language) version of the Montreal Cognitive Assessment and applied it to 70 patients with Parkinson's disease and 60 age- and education-matched healthy controls. Metric properties were assessed, and the scores were compared with the performance in validated Malayalam versions of the Mini Mental Status Examination and Addenbrooke's Cognitive Examination. The Montreal Cognitive Assessment-Malayalam showed good internal consistency and test-retest reliability, and its scores correlated with Mini Mental Status Examination (patients: R = 0.70; P < 0.001; healthy controls: R = 0.26; P = 0.04) and Addenbrooke's Cognitive Examination (patients: R = 0.8; P < 0.001; healthy controls: R = 0.52; P < 0.001) scores. This study establishes the reliability of cross-cultural adaptation of the Montreal Cognitive Assessment for assessing cognition in Malayalam-speaking Parkinson's disease patients for early screening and potential future interventions for cognitive dysfunction.

  1. Solar-Diesel Hybrid Power System Optimization and Experimental Validation

    NASA Astrophysics Data System (ADS)

    Jacobus, Headley Stewart

    As of 2008 1.46 billion people, or 22 percent of the World's population, were without electricity. Many of these people live in remote areas where decentralized generation is the only method of electrification. Most mini-grids are powered by diesel generators, but new hybrid power systems are becoming a reliable method to incorporate renewable energy while also reducing total system cost. This thesis quantifies the measurable Operational Costs for an experimental hybrid power system in Sierra Leone. Two software programs, Hybrid2 and HOMER, are used during the system design and subsequent analysis. Experimental data from the installed system is used to validate the two programs and to quantify the savings created by each component within the hybrid system. This thesis bridges the gap between design optimization studies that frequently lack subsequent validation and experimental hybrid system performance studies.

  2. Assessment of sedentary behaviors and transport-related activities by questionnaire: a validation study.

    PubMed

    Mensah, Keitly; Maire, Aurélia; Oppert, Jean-Michel; Dugas, Julien; Charreire, Hélène; Weber, Christiane; Simon, Chantal; Nazare, Julie-Anne

    2016-08-09

    Comprehensive assessment of sedentary behavior (SB) and physical activity (PA), including transport-related activities (TRA), is required to design innovative PA promotion strategies. There are few validated instruments that simultaneously assess the different components of human movement according to their context of practice (e.g. work, transport, leisure). We examined test-retest reliability and validity of the Sedentary, Transportation and Activity Questionnaire (STAQ), a newly developed questionnaire dedicated to assessing context-specific SB, TRA and PA. Ninety-six subjects (51 women) kept a contextualized activity logbook and wore a hip accelerometer (Actigraph GT3X+(TM)) for a 7-day or 14-day period, at the end of which they completed the STAQ. Activity energy expenditure was measured in a subgroup of 45 subjects using the doubly labeled water (DLW) method. Test-retest reliability was assessed using intraclass correlation coefficients (ICC) in a subgroup of 32 subjects who filled in the questionnaire twice, one month apart. Accelerometry was annotated using the logbook to obtain total and context-specific objective estimates of SB. Spearman correlations, Bland-Altman plots and ICC were used to analyze validity with logbook, accelerometry and DLW data as validity criteria. Test-retest reliability was fair for total sitting time (ICC = 0.52), good to excellent for work sitting time (ICC = 0.71), transport-related walking (ICC = 0.61) and car use (ICC = 0.67), and leisure screen-related SB (ICC = 0.64-0.79), but poor for total sitting time during leisure and transport-related contexts. For validity, compared to accelerometry, significant correlations were found for STAQ estimates of total (r = 0.54) and context-specific sitting times with stronger correlations for work sitting time (r = 0.88), and screen times (TV/DVD viewing: r = 0.46; other screens: r = 0.42) than for transport (r = 0.35) or leisure-related sitting-times (r

  3. The development and validation of the Youth Actuarial Care Needs Assessment Tool for Non-Offenders (Y-ACNAT-NO).

    PubMed

    Assink, Mark; van der Put, Claudia E; Oort, Frans J; Stams, Geert Jan J M

    2015-03-04

    In The Netherlands, police officers not only come into contact with juvenile offenders, but also with a large number of juveniles who were involved in a criminal offense, but not in the role of a suspect (i.e., juvenile non-offenders). Until now, no valid and reliable instrument was available that can be used by Dutch police officers for estimating the risk for future care needs of juvenile non-offenders. In the present study, the Youth Actuarial Care Needs Assessment Tool for Non-Offenders (Y-ACNAT-NO) was developed for predicting the risk for future care needs that consisted of (1) a future supervision order as imposed by a juvenile court judge and (2) future worrisome incidents involving child abuse, domestic violence/strife, and/or sexual offensive behavior at the juvenile's living address (i.e., problems in the child-rearing environment). Police records of 3,200 juveniles were retrieved from the Dutch police registration system after which the sample was randomly split in a construction (n = 1,549) and validation sample (n = 1,651). The Y-ACNAT-NO was developed by performing an Exhaustive CHAID analysis using the construction sample. The predictive validity of the instrument was examined in the validation sample by calculating several performance indicators that assess discrimination and calibration. The CHAID output yielded an instrument that consisted of six variables and eleven different risk groups. The risk for future care needs ranged from 0.06 in the lowest risk group to 0.83 in the highest risk group. The AUC value in the validation sample was .764 (95% CI [.743, .784]) and Sander's calibration score indicated an average assessment error of 3.74% in risk estimates per risk category. The Y-ACNAT-NO is the first instrument that can be used by Dutch police officers for estimating the risk for future care needs of juvenile non-offenders. The predictive validity of the Y-ACNAT-NO in terms of discrimination and calibration was sufficient to justify

  4. Validity, reliability and support for implementation of independence-scaled procedural assessment in laparoscopic surgery.

    PubMed

    Kramp, Kelvin H; van Det, Marc J; Veeger, Nic J G M; Pierie, Jean-Pierre E N

    2016-06-01

    There is no widely used method to evaluate procedure-specific laparoscopic skills. The first aim of this study was to develop a procedure-based assessment method. The second aim was to compare its validity, reliability and feasibility with currently available global rating scales (GRSs). An independence-scaled procedural assessment was created by linking the procedural key steps of the laparoscopic cholecystectomy to an independence scale. Subtitled and blinded videos of a novice, an intermediate and an almost competent trainee, were evaluated with GRSs (OSATS and GOALS) and the independence-scaled procedural assessment by seven surgeons, three senior trainees and six scrub nurses. Participants received a short introduction to the GRSs and independence-scaled procedural assessment before assessment. The validity was estimated with the Friedman and Wilcoxon test and the reliability with the intra-class correlation coefficient (ICC). A questionnaire was used to evaluate user opinion. Independence-scaled procedural assessment and GRS scores improved significantly with surgical experience (OSATS p = 0.001, GOALS p < 0.001, independence-scaled procedural assessment p < 0.001). The ICCs of the OSATS, GOALS and independence-scaled procedural assessment were 0.78, 0.74 and 0.84, respectively, among surgeons. The ICCs increased when the ratings of scrub nurses were added to those of the surgeons. The independence-scaled procedural assessment was not considered more of an administrative burden than the GRSs (p = 0.692). A procedural assessment created by combining procedural key steps to an independence scale is a valid, reliable and acceptable assessment instrument in surgery. In contrast to the GRSs, the reliability of the independence-scaled procedural assessment exceeded the threshold of 0.8, indicating that it can also be used for summative assessment. It furthermore seems that scrub nurses can assess the operative competence of surgical trainees.

  5. Development and validation of a professionalism assessment scale for medical students

    PubMed Central

    Klemenc-Ketis, Zalika; Vrecko, Helena

    2014-01-01

    Objectives: To develop and validate a scale for the assessment of professionalism in medical students based on students' perceptions of and attitudes towards professionalism in medicine. Methods: This was a mixed methods study with undergraduate medical students. Two focus groups were carried out with 12 students, followed by a transcript analysis (grounded theory method with open coding). Then, a 3-round Delphi with 20 family medicine experts was carried out. A psychometric assessment of the scale was performed with a group of 449 students. The items of the Professionalism Assessment Scale (PAS) could be answered on a five-point Likert scale. Results: After the focus groups, the first version of the PAS consisted of 56 items and after the Delphi study, 30 items remained. The final sample for quantitative study consisted of 122 students (27.2% response rate). There were 95 (77.9%) female students in the sample. The mean age of the sample was 22.1 ± 2.1 years. After the principal component analysis, we removed 8 items and produced the final version of the PAS (22 items). The Cronbach's alpha of the scale was 0.88. Factor analysis revealed three factors: empathy and humanism, professional relationships and development, and responsibility. Conclusions: The new Professionalism Assessment Scale proved to be valid and reliable. It can be used for the assessment of professionalism in undergraduate medical students. PMID:25382090
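
    As a reading aid for the Cronbach's alpha reported in the record above, the following is a minimal sketch of how that coefficient is commonly computed from item-level scores. It is not the authors' analysis code; the function name `cronbach_alpha` and the five rows of Likert answers are invented for illustration only.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: one list of item scores per respondent, all of the same length.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*rows))                      # one tuple per item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(r) for r in rows])         # variance of total scores
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


# Hypothetical answers (1-5 Likert) from five made-up respondents on four items.
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(responses), 3))
```

    Values close to 1 indicate that the items vary together; the toy data here are deliberately consistent, so the printed value is high.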

  6. The development and validity of the Salford Gait Tool: an observation-based clinical gait assessment tool.

    PubMed

    Toro, Brigitte; Nester, Christopher J; Farren, Pauline C

    2007-03-01

    To develop the construct, content, and criterion validity of the Salford Gait Tool (SF-GT) and to evaluate agreement between gait observations using the SF-GT and kinematic gait data. Tool development and comparative evaluation. University in the United Kingdom. For designing construct and content validity, convenience samples of 10 children with hemiplegic, diplegic, and quadriplegic cerebral palsy (CP) and 152 physical therapy students and 4 physical therapists were recruited. For developing criterion validity, kinematic gait data of 13 gait clusters containing 56 children with hemiplegic, diplegic, and quadriplegic CP and 11 neurologically intact children was used. For clinical evaluation, a convenience sample of 23 pediatric physical therapists participated. We developed a sagittal plane observational gait assessment tool through a series of design, test, and redesign iterations. The tool's grading system was calibrated using kinematic gait data of 13 gait clusters and was evaluated by comparing the agreement of gait observations using the SF-GT with kinematic gait data. Criterion standard kinematic gait data. There was 58% mean agreement based on grading categories and 80% mean agreement based on degree estimations evaluated with the least significant difference method. The new SF-GT has good concurrent criterion validity.

  7. Movie for the Assessment of Social Cognition (MASC): Spanish Validation

    ERIC Educational Resources Information Center

    Lahera, G.; Boada, L.; Pousa, E.; Mirapeix, I.; Morón-Nozaleda, G.; Marinas, L.; Gisbert, L.; Pamiàs, M.; Parellada, M.

    2014-01-01

    We present the Spanish validation of the "Movie for the Assessment of Social Cognition" instrument (MASC-SP). We recruited 22 adolescents and young adults with Asperger syndrome and 26 participants with typical development. The MASC-SP and three other social cognition instruments (Ekman Pictures of Facial Affect test, Reading the Mind in…

  8. A Greenhouse-Gas Information System: Monitoring and Validating Emissions Reporting and Mitigation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jonietz, Karl K.; Dimotakis, Paul E.; Rotman, Douglas A.

    2011-09-26

    This study and report focus on attributes of a greenhouse-gas information system (GHGIS) needed to support MRV&V needs. These needs set the function of such a system apart from scientific/research monitoring of GHGs and carbon-cycle systems, and include (not exclusively): the need for a GHGIS that is operational, as required for decision-support; the need for a system that meets specifications derived from imposed requirements; the need for rigorous calibration, verification, and validation (CV&V) standards, processes, and records for all measurement and modeling/data-inversion data; the need to develop and adopt an uncertainty-quantification (UQ) regimen for all measurement and modeling data; and the requirement that GHGIS products can be subjected to third-party questioning and scientific scrutiny. This report examines and assesses presently available capabilities that could contribute to a future GHGIS. These capabilities include sensors and measurement technologies; data analysis and data uncertainty quantification (UQ) practices and methods; and model-based data-inversion practices, methods, and their associated UQ. The report further examines the need for traceable calibration, verification, and validation processes and attached metadata; differences between present science-/research-oriented needs and those that would be required for an operational GHGIS; the development, operation, and maintenance of a GHGIS missions-operations center (GMOC); and the complex systems engineering and integration that would be required to develop, operate, and evolve a future GHGIS.

  9. Assessing personal initiative among vocational training students: development and validation of a new measure.

    PubMed

    Balluerka, Nekane; Gorostiaga, Arantxa; Ulacia, Imanol

    2014-11-14

    Personal initiative characterizes people who are proactive, persistent and self-starting when facing the difficulties that arise in achieving goals. Despite its importance in the educational field there is a scarcity of measures to assess students' personal initiative. Thus, the aim of the present study was to develop a questionnaire to assess this variable in the academic environment and to validate it for adolescents and young adults. The sample comprised 244 vocational training students. The questionnaire showed a factor structure including three factors (Proactivity-Prosocial behavior, Persistence and Self-Starting) with acceptable indices of internal consistency (ranging between α = .57 and α = .73) and good convergent validity with respect to the Self-Reported Initiative scale. Evidence of external validity was also obtained based on the relationships between personal initiative and variables such as self-efficacy, enterprising attitude, responsibility and control aspirations, conscientiousness, and academic achievement. The results indicate that this new measure is very useful for assessing personal initiative among vocational training students.

  10. Assessing Knowledge Sharing Among Academics: A Validation of the Knowledge Sharing Behavior Scale (KSBS).

    PubMed

    Ramayah, T; Yeap, Jasmine A L; Ignatius, Joshua

    2014-04-01

    There is a belief that academics tend to hold on tightly to their knowledge and intellectual resources. However, not much effort has been put into the creation of a valid and reliable instrument to measure knowledge sharing behavior among the academics. To apply and validate the Knowledge Sharing Behavior Scale (KSBS) as a measure of knowledge sharing behavior within the academic community. Respondents (N = 447) were academics from arts and science streams in 10 local, public universities in Malaysia. Data were collected using the 28-item KSBS that assessed four dimensions of knowledge sharing behavior namely written contributions, organizational communications, personal interactions, and communities of practice. The exploratory factor analysis showed that the items loaded on the dimension constructs that they were supposed to represent, thus proving construct validity. A within-factor analysis revealed that each set of items representing their intended dimension loaded on only one construct, therefore establishing convergent validity. All four dimensions were not perfectly correlated with each other or organizational citizenship behavior, thereby proving discriminant validity. However, all four dimensions correlated with organizational commitment, thus confirming predictive validity. Furthermore, all four factors correlated with both tacit and explicit sharing, which confirmed their concurrent validity. All measures also possessed sufficient reliability (α > .70). The KSBS is a valid and reliable instrument that can be used to formally assess the types of knowledge artifacts residing among academics and the degree of knowledge sharing in relation to those artifacts. © The Author(s) 2014.

  11. Validation of the "Security Needs Assessment Profile" for measuring the profiles of security needs of Chinese forensic psychiatric inpatients.

    PubMed

    Siu, B W M; Au-Yeung, C C Y; Chan, A W L; Chan, L S Y; Yuen, K K; Leung, H W; Yan, C K; Ng, K K; Lai, A C H; Davies, S; Collins, M

    Mapping forensic psychiatric services with the security needs of patients is a salient step in service planning, audit and review. A valid and reliable instrument for measuring the security needs of Chinese forensic psychiatric inpatients was not yet available. This study aimed to develop and validate the Chinese version of the Security Needs Assessment Profile for measuring the profiles of security needs of Chinese forensic psychiatric inpatients. The Security Needs Assessment Profile by Davies was translated into Chinese. Its face validity, content validity, construct validity and internal consistency reliability were assessed by measuring the security needs of 98 Chinese forensic psychiatric inpatients. Principal factor analysis for construct validity provided a six-factor security needs model explaining 68.7% of the variance. Based on the Cronbach's alpha coefficient, the internal consistency reliability was rated as acceptable for procedural security (0.73), and fair for both physical security (0.62) and relational security (0.58). A significant sex difference (p = 0.002) in total security score was found. The Chinese version of the Security Needs Assessment Profile is a valid and reliable instrument for assessing the security needs of Chinese forensic psychiatric inpatients. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Developing a model for hospital inherent safety assessment: Conceptualization and validation.

    PubMed

    Yari, Saeed; Akbari, Hesam; Gholami Fesharaki, Mohammad; Khosravizadeh, Omid; Ghasemi, Mohammad; Barsam, Yalda; Akbari, Hamed

    2018-01-01

    Paying attention to the safety of hospitals, as the most crucial institutions for providing medical and health services, wherein a bundle of facilities, equipment, and human resources exist, is of significant importance. The present research aims at developing a model for assessing hospitals' safety based on principles of inherent safety design. Face validity (30 experts), content validity (20 experts), construct validity (268 examples), convergent validity, and divergent validity have been employed to validate the prepared questionnaire; and item analysis, the Cronbach's alpha test, the ICC test (to measure reliability of the test), and the composite reliability coefficient have been used to measure primary reliability. The relationship between variables and factors has been confirmed at the 0.05 significance level by conducting confirmatory factor analysis (CFA) and the structural equation modeling (SEM) technique with the use of Smart-PLS. R-square and factor loading values, which were higher than 0.67 and 0.300 respectively, indicated a strong fit. Moderation (0.970), simplification (0.959), substitution (0.943), and minimization (0.5008) had, in that order, the greatest weights in determining the inherent safety of hospitals. Moderation, simplification, and substitution have more weight on inherent safety than the other dimensions, while minimization has the least weight, which could be due to its definition as minimizing the risk.

  13. Spanish validation of the Negative Symptom Assessment-16 (NSA-16) in patients with schizophrenia.

    PubMed

    Garcia-Alvarez, Leticia; Garcia-Portilla, María Paz; Saiz, Pilar Alejandra; Fonseca-Pedrero, Eduardo; Bobes-Bascaran, María Teresa; Gomar, Jesús; Muñiz, José; Bobes, Julio

    2018-04-05

    Negative symptoms are prevalent in schizophrenia and associated with a poorer outcome. Validated newer psychometric instruments could contribute to better assessment and improved treatment of negative symptoms. The Negative Symptom Assessment-16 (NSA-16) has been shown to have strong psychometric properties, but there is a need for validation in non-English languages. This study aimed to examine the psychometric properties of a Spanish version of the NSA-16 (Sp-NSA-16). Observational, cross-sectional validation study in a sample of 123 outpatients with schizophrenia. NSA-16, PANSS, HDRS, CGI-SCH and PSP. The results indicate appropriate psychometric properties, high internal consistency (Cronbach's alpha=0.86), convergent validity (PANSS negative scale, PANSS Marder Negative Factor and CGI-negative symptoms r values between 0.81 and 0.94) and divergent validity (PANSS positive scale and the HDRS r values between 0.10 and 0.34). In addition, the NSA-16 also exhibited discriminant validity (ROC curve=0.97, 95% CI=0.94 to 1.00; 94.3% sensitivity and 83.3% specificity). The Sp-NSA-16 is reliable and valid for measuring negative symptoms in patients with schizophrenia. This provides Spanish clinicians with a new tool for clinical practice and research. However, it is necessary to provide further information about its inter-rater reliability. Copyright © 2018 SEP y SEPB. Publicado por Elsevier España, S.L.U. All rights reserved.
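
    As a reading aid for the sensitivity and specificity figures quoted in the record above, the following is a minimal sketch of how the two quantities are defined from a 2x2 classification table. The function name and the counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: cases correctly flagged        fn: cases missed
    tn: non-cases correctly cleared    fp: non-cases wrongly flagged
    """
    sensitivity = tp / (tp + fn)   # share of true cases that were detected
    specificity = tn / (tn + fp)   # share of non-cases that were cleared
    return sensitivity, specificity


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=40, fp=10)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

    Both values depend on the cut-off score chosen for the instrument, which is why validation studies usually report them together with the area under the ROC curve.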

  14. Reliability and validity of the Japanese Migraine Disability Assessment (MIDAS) Questionnaire.

    PubMed

    Iigaya, Miho; Sakai, Fumihiko; Kolodner, Kenneth B; Lipton, Richard B; Stewart, Walter F

    2003-04-01

    This study was designed to assess the test-retest reliability, internal consistency, and validity of a Japanese translation of the Migraine Disability Assessment (MIDAS) Questionnaire in a sample of Japanese patients with headache. Previous studies have demonstrated that the English-language version of the MIDAS Questionnaire is a reliable and valid instrument for the assessment of migraine-related disability. Any translations of the MIDAS Questionnaire must also be assessed for reliability and validity. Study participants were recruited from the patient population attending either the Neurology Department of Kitasato University or an affiliated clinic. Participants were eligible for study entry if they had 6 or more primary headaches per year. For reliability testing, participants completed the MIDAS Questionnaire on 2 occasions, exactly 2 weeks apart. To assess validity, patients were also invited to participate in a 90-day daily diary study. Composite measures from the 90-day diaries were compared to equivalent MIDAS measures (ie, 5 questions on headache-related disability and 1 question each on average pain intensity and headache frequency in the last 3 months) and to the total MIDAS score obtained from a third MIDAS Questionnaire completed at the end of this 90-day period. One hundred one patients between the ages of 21 and 77 years were recruited (81 women and 20 men). Ninety-nine patients (80 women and 19 men) participated in the diary study. At baseline, 46.5% of patients were MIDAS grade I or II (minimal, mild, or infrequent disability), 22.2% were MIDAS grade III (moderate disability), and 31.3% were MIDAS grade IV (severe disability). Test-retest Spearman correlations for the 5 disability questions and the questions on average pain intensity and headache frequency ranged from 0.59 to 0.80 (P<.0001). The test-retest Spearman correlation coefficient for the total MIDAS score was 0.83 (P<.0001). The degree to which individual MIDAS questions correlated with

  15. Predictive validity and correlates of self-assessed resilience among U.S. Army soldiers.

    PubMed

    Campbell-Sills, Laura; Kessler, Ronald C; Ursano, Robert J; Sun, Xiaoying; Taylor, Charles T; Heeringa, Steven G; Nock, Matthew K; Sampson, Nancy A; Jain, Sonia; Stein, Murray B

    2018-02-01

    Self-assessment of resilience could prove valuable to military and other organizations whose personnel confront foreseen stressors. We evaluated the validity of self-assessed resilience among U.S. Army soldiers, including whether predeployment perceived resilience predicted postdeployment emotional disorder. Resilience was assessed via self-administered questionnaire among new soldiers reporting for basic training (N = 35,807) and experienced soldiers preparing to deploy to Afghanistan (N = 8,558). Concurrent validity of self-assessed resilience was evaluated among recruits by estimating its association with past-month emotional disorder. Predictive validity was examined among 3,526 experienced soldiers with no lifetime emotional disorder predeployment. Predictive models estimated associations of predeployment resilience with incidence of emotional disorder through 9 months postdeployment and with marked improvement in coping at 3 months postdeployment. Weights-adjusted regression models incorporated stringent controls for risk factors. Soldiers characterized themselves as very resilient on average [M = 14.34, SD = 4.20 (recruits); M = 14.75, SD = 4.31 (experienced soldiers); theoretical range = 0-20]. Demographic characteristics exhibited only modest associations with resilience, while severity of childhood maltreatment was negatively associated with resilience in both samples. Among recruits, resilience was inversely associated with past-month emotional disorder [adjusted odds ratio (AOR) = 0.65, 95% CI = 0.62-0.68, P < .0005 (per standard score increase)]. Among deployed soldiers, greater predeployment resilience was associated with decreased incidence of emotional disorder (AOR = 0.91; 95% CI = 0.84-0.98; P = .016) and increased odds of improved coping (AOR = 1.36; 95% CI = 1.24-1.49; P < .0005) postdeployment. Findings supported validity of self-assessed resilience among soldiers, although its predictive effect on incidence of

  16. The Italian version of the Mouth Handicap in Systemic Sclerosis scale (MHISS) is valid, reliable and useful in assessing oral health-related quality of life (OHRQoL) in systemic sclerosis (SSc) patients.

    PubMed

    Maddali Bongi, S; Del Rosso, A; Miniati, I; Galluccio, F; Landi, G; Tai, G; Matucci-Cerinic, M

    2012-09-01

    In systemic sclerosis (SSc), mouth and face involvement leads to problems in oral health-related quality of life (OHRQoL). The Mouth Handicap in Systemic Sclerosis scale (MHISS) is a 12-item questionnaire specifically quantifying mouth disability in SSc, organized in 3 subscales. Our aim was to validate the Italian version of MHISS, by assessing its test-retest reliability and internal and external consistency in Italian SSc patients. Forty SSc patients (7 dSSc, 33 lSSc; age and disease duration: 57.27 ± 11.41, 9.4 ± 4.4 years; 22 with sicca syndrome) were evaluated with MHISS. MHISS was translated following a forward-backward translation procedure, with independent translations and counter-translation. Test-retest reliability was evaluated, comparing the results of two administrations, with intraclass correlation coefficient (ICC). Internal consistency was assessed by Cronbach's α and external consistency by comparison with mouth opening. MHISS has a good test-retest reliability (ICC: 0.93) and internal consistency (Cronbach's α: 0.99). A good external consistency was confirmed by correlation with mouth opening (rho: -0.3869, p: 0.0137). Total MHISS score was 17.65 ± 5.20, with scores of subscale 1 (reduced mouth opening) of 6.60 ± 2.85 and scores of subscales 2 (sicca syndrome) and 3 (aesthetic concerns) of 7.82 ± 2.59 and 3.22 ± 1.14. Total and subscale 2 scores are higher in dSSc than in lSSc. This result may be due to the higher presence of sicca syndrome in dSSc than in lSSc (p = 0.0109). Our results support validity and reliability in Italian SSc patients of MHISS, specifically measuring SSc OHRQoL.

  17. Validity of assessing child feeding with virtual reality.

    PubMed

    Persky, Susan; Goldring, Megan R; Turner, Sara A; Cohen, Rachel W; Kistler, William D

    2018-04-01

    Assessment of parents' child feeding behavior is challenging, and there is need for additional methodological approaches. Virtual reality technology allows for the creation of behavioral measures, and its implementation overcomes several limitations of existing methods. This report evaluates the validity and usability of the Virtual Reality (VR) Buffet among a sample of 52 parents of children aged 3-7. Participants served a meal of pasta and apple juice in both a virtual setting and real-world setting (counterbalanced and separated by a distractor task). They then created another meal for their child, this time choosing from the full set of food options in the VR Buffet. Finally, participants completed a food estimation task followed by a questionnaire, which assessed their perceptions of the VR Buffet. Results revealed that the amount of virtual pasta served by parents correlated significantly with the amount of real pasta they served, rs = 0.613, p < .0001, as did served amounts of virtual and real apple juice, rs = 0.822, p < .0001. Furthermore, parents' perception of the calorie content of chosen foods was significantly correlated with observed calorie content (rs = 0.438, p = .002), and parents agreed that they would feed the meal they created to their child (M = 4.43, SD = 0.82 on a 1-5 scale). The data presented here demonstrate that parent behavior in the VR Buffet is highly related to real-world behavior, and that the tool is well-rated by parents. Given the data presented and the potential benefits of the abundant behavioral data the VR Buffet can provide, we conclude that it is a valid and needed addition to the array of tools for assessing feeding behavior. Published by Elsevier Ltd.
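
    As a reading aid for the Spearman rank correlations (rs) quoted in the record above, the following is a minimal sketch of the statistic: each variable is converted to ranks (ties share their average rank) and the Pearson correlation of the ranks is taken. The helper names and the gram amounts are hypothetical and do not come from the study.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current group of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical served amounts in grams for six made-up participants.
virtual_pasta = [120, 200, 150, 90, 260, 180]
real_pasta = [110, 230, 140, 100, 240, 150]
print(round(spearman(virtual_pasta, real_pasta), 3))
```

    Because only ranks enter the calculation, the coefficient is insensitive to the measurement scale and to monotone transformations of either variable.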

  18. Training and Validation of Standardized Patients for Unannounced Assessment of Physicians' Management of Depression

    ERIC Educational Resources Information Center

    Shirazi, Mandana; Sadeghi, Majid; Emami, A.; Kashani, A. Sabouri; Parikh, Sagar; Alaeddini, F.; Arbabi, Mohammad; Wahlstrom, Rolf

    2011-01-01

    Objective: Standardized patients (SPs) have been developed to measure practitioner performance in actual practice settings, but results have not been fully validated for psychiatric disorders. This study describes the process of creating reliable and valid SPs for unannounced assessment of general-practitioners' management of depression disorders…

  19. Helicopter simulation validation using flight data

    NASA Technical Reports Server (NTRS)

    Key, D. L.; Hansen, R. S.; Cleveland, W. B.; Abbott, W. Y.

    1982-01-01

    A joint NASA/Army effort to perform a systematic ground-based piloted simulation validation assessment is described. The best available mathematical model for the subject helicopter (UH-60A Black Hawk) was programmed for real-time operation. Flight data were obtained to validate the math model, and to develop models for the pilot control strategy while performing mission-type tasks. The validated math model is to be combined with motion and visual systems to perform ground based simulation. Comparisons of the control strategy obtained in flight with that obtained on the simulator are to be used as the basis for assessing the fidelity of the results obtained in the simulator.

  20. Reliability and validity of procedure-based assessments in otolaryngology training.

    PubMed

    Awad, Zaid; Hayden, Lindsay; Robson, Andrew K; Muthuswamy, Keerthini; Tolley, Neil S

    2015-06-01

    To investigate the reliability and construct validity of procedure-based assessment (PBA) in assessing performance and progress in otolaryngology training. Retrospective database analysis using a national electronic database. We analyzed PBAs of otolaryngology trainees in North London from core trainees (CTs) to specialty trainees (STs). The tool contains six multi-item domains: consent, planning, preparation, exposure/closure, technique, and postoperative care, rated as "satisfactory" or "development required," in addition to an overall performance rating (pS) of 1 to 4. Individual domain score, overall calculated score (cS), and number of "development-required" items were calculated for each PBA. Receiver operating characteristic analysis helped determine sensitivity and specificity. There were 3,152 otolaryngology PBAs from 46 otolaryngology trainees analyzed. PBA reliability was high (Cronbach's α 0.899), and sensitivity approached 99%. cS correlated positively with pS and level in training (rs: +0.681 and +0.324, respectively). ST had higher cS and pS than CT (93% ± 0.6 and 3.2 ± 0.03 vs. 71% ± 3.1 and 2.3 ± 0.08, respectively; P < .001). cS and pS increased from CT1 to ST8 showing construct validity (rs: +0.348 and +0.354, respectively; P < .001). The technical skill domain had the highest utilization (98% of PBAs) and was the best predictor of cS and pS (rs: +0.96 and +0.66, respectively). PBA is reliable and valid for assessing otolaryngology trainees' performance and progress at all levels. It is highly sensitive in identifying competent trainees. The tool is used in a formative and feedback capacity. The technical domain is the best predictor and should be given close attention. Level of evidence: NA. © 2014 The American Laryngological, Rhinological and Otological Society, Inc.

  1. The Irvine, Beatties, and Bresnahan (IBB) Forelimb Recovery Scale: An Assessment of Reliability and Validity

    PubMed Central

    Irvine, Karen-Amanda; Ferguson, Adam R.; Mitchell, Kathleen D.; Beattie, Stephanie B.; Lin, Amity; Stuck, Ellen D.; Huie, J. Russell; Nielson, Jessica L.; Talbott, Jason F.; Inoue, Tomoo; Beattie, Michael S.; Bresnahan, Jacqueline C.

    2014-01-01

    The IBB scale is a recently developed forelimb scale for the assessment of fine control of the forelimb and digits after cervical spinal cord injury [SCI; (1)]. The present paper describes the assessment of inter-rater reliability and face, concurrent and construct validity of this scale following SCI. It demonstrates that the IBB is a reliable and valid scale that is sensitive to severity of SCI and to recovery over time. In addition, the IBB correlates with other outcome measures and is highly predictive of biological measures of tissue pathology. Multivariate analysis using principal component analysis (PCA) demonstrates that the IBB is highly predictive of the syndromic outcome after SCI (2), and is among the best predictors of bio-behavioral function, based on strong construct validity. Altogether, the data suggest that the IBB, especially in concert with other measures, is a reliable and valid tool for assessing neurological deficits in fine motor control of the distal forelimb, and represents a powerful addition to multivariate outcome batteries aimed at documenting recovery of function after cervical SCI in rats. PMID:25071704

  2. Validity and Reliability of Field-Based Measures for Assessing Movement Skill Competency in Lifelong Physical Activities: A Systematic Review.

    PubMed

    Hulteen, Ryan M; Lander, Natalie J; Morgan, Philip J; Barnett, Lisa M; Robertson, Samuel J; Lubans, David R

    2015-10-01

    It has been suggested that young people should develop competence in a variety of 'lifelong physical activities' to ensure that they can be active across the lifespan. The primary aim of this systematic review is to report the methodological properties, validity, reliability, and test duration of field-based measures that assess movement skill competency in lifelong physical activities. A secondary aim was to clearly define those characteristics unique to lifelong physical activities. A search of four electronic databases (Scopus, SPORTDiscus, ProQuest, and PubMed) was conducted between June 2014 and April 2015 with no date restrictions. Studies addressing the validity and/or reliability of lifelong physical activity tests were reviewed. Included articles were required to assess lifelong physical activities using process-oriented measures, as well as report either one type of validity or reliability. Assessment criteria for methodological quality were adapted from a checklist used in a previous review of sport skill outcome assessments. Movement skill assessments for eight different lifelong physical activities (badminton, cycling, dance, golf, racquetball, resistance training, swimming, and tennis) in 17 studies were identified for inclusion. Methodological quality, validity, reliability, and test duration (time to assess a single participant), for each article were assessed. Moderate to excellent reliability results were found in 16 of 17 studies, with 71% reporting inter-rater reliability and 41% reporting intra-rater reliability. Only four studies in this review reported test-retest reliability. Ten studies reported validity results; content validity was cited in 41% of these studies. Construct validity was reported in 24% of studies, while criterion validity was only reported in 12% of studies. Numerous assessments for lifelong physical activities may exist, yet only assessments for eight lifelong physical activities were included in this review

  3. Development, Sensibility, and Validity of a Systemic Autoimmune Rheumatic Disease Case Ascertainment Tool.

    PubMed

    Armstrong, Susan M; Wither, Joan E; Borowoy, Alan M; Landolt-Marticorena, Carolina; Davis, Aileen M; Johnson, Sindhu R

    2017-01-01

    Case ascertainment through self-report is a convenient but often inaccurate method to collect information. The purposes of this study were to develop, assess the sensibility, and validate a tool to identify cases of systemic autoimmune rheumatic diseases (SARD) in the outpatient setting. The SARD tool was administered to subjects sampled from specialty clinics. Determinants of sensibility - comprehensibility, feasibility, validity, and acceptability - were evaluated using a numeric rating scale from 1-7. Comprehensibility was evaluated using the Flesch Reading Ease and the Flesch-Kincaid Grade Level. Self-reported diagnoses were validated against medical records using Cohen's κ statistic. There were 141 participants [systemic lupus erythematosus (SLE), systemic sclerosis (SSc), rheumatoid arthritis, Sjögren syndrome (SS), inflammatory myositis (polymyositis/dermatomyositis; PM/DM), and controls] who completed the questionnaire. The Flesch Reading Ease score was 77.1 and the Flesch-Kincaid Grade Level was 4.4. Respondents endorsed (mean ± SD) comprehensibility (6.12 ± 0.92), feasibility (5.94 ± 0.81), validity (5.35 ± 1.10), and acceptability (3.10 ± 2.03). The SARD tool had a sensitivity of 0.91 (95% CI 0.88-0.94) and a specificity of 0.99 (95% CI 0.96-1.00). The agreement between the SARD tool and medical record was κ = 0.82 (95% CI 0.77-0.88). Subgroup analysis by SARD found κ coefficients for SLE to be κ = 0.88 (95% CI 0.79-0.97), SSc κ = 1.0 (95% CI 1.0-1.0), PM/DM κ = 0.72 (95% CI 0.49-0.95), and SS κ = 0.85 (95% CI 0.71-0.99). The screening questions had sensitivity ranging from 0.96 to 1.0 and specificity ranging from 0.88 to 1.0. This SARD case ascertainment tool has demonstrable sensibility and validity. The use of both screening and confirmatory questions confers added accuracy.
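
    The abstract above reports sensitivity, specificity, and Cohen's κ for a tool checked against medical records. As a minimal sketch of how these three statistics follow from a 2x2 agreement table, the Python below uses the standard textbook formulas. The function name and the counts are hypothetical illustrations, not the study's data or code.

```python
def diagnostic_agreement(tp, fp, fn, tn):
    """Return (sensitivity, specificity, Cohen's kappa) for a 2x2 table.

    tp: tool positive, reference positive
    fp: tool positive, reference negative
    fn: tool negative, reference positive
    tn: tool negative, reference negative
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed agreement: proportion of cases where tool and reference agree.
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts for illustration only (not taken from the study).
sens, spec, kappa = diagnostic_agreement(tp=40, fp=5, fn=10, tn=45)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} kappa={kappa:.2f}")
```

    Because κ subtracts the agreement expected by chance, it can be well below the raw agreement rate when one category dominates the sample.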

  4. Assessing the validity of sales self-efficacy: a cautionary tale.

    PubMed

    Gupta, Nina; Ganster, Daniel C; Kepes, Sven

    2013-07-01

    We developed a focused, context-specific measure of sales self-efficacy and assessed its incremental validity against the broad Big 5 personality traits with department store salespersons, using (a) both a concurrent and a predictive design and (b) both objective sales measures and supervisory ratings of performance. We found that in the concurrent study, sales self-efficacy predicted objective and subjective measures of job performance more than did the Big 5 measures. Significant differences between the predictability of subjective and objective measures of performance were not observed. Predictive validity coefficients were generally lower than concurrent validity coefficients. The results suggest that there are different dynamics operating in concurrent and predictive designs and between broad and contextualized measures; they highlight the importance of distinguishing between these designs and measures in meta-analyses. The results also point to the value of focused, context-specific personality predictors in selection research. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  5. Frameworks to assess health systems governance: a systematic review

    PubMed Central

    Smith, Helen; van den Broek, Nynke

    2017-01-01

    Governance of the health system is a relatively new concept and there are gaps in understanding what health system governance is and how it could be assessed. We conducted a systematic review of the literature to describe the concept of governance and the theories underpinning as applied to health systems; and to identify which frameworks are available and have been applied to assess health systems governance. Frameworks were reviewed to understand how the principles of governance might be operationalized at different levels of a health system. Electronic databases and web portals of international institutions concerned with governance were searched for publications in English for the period January 1994 to February 2016. Sixteen frameworks developed to assess governance in the health system were identified and are described. Of these, six frameworks were developed based on theories from new institutional economics; three are primarily informed by political science and public management disciplines; three arise from the development literature and four use multidisciplinary approaches. Only five of the identified frameworks have been applied. These used the principal–agent theory, theory of common pool resources, North’s institutional analysis and the cybernetics theory. Governance is a practice, dependent on arrangements set at political or national level, but which needs to be operationalized by individuals at lower levels in the health system; multi-level frameworks acknowledge this. Three frameworks were used to assess governance at all levels of the health system. Health system governance is complex and difficult to assess; the concept of governance originates from different disciplines and is multidimensional. There is a need to validate and apply existing frameworks and share lessons learnt regarding which frameworks work well in which settings. A comprehensive assessment of governance could enable policy makers to prioritize solutions for

  6. Validation and clinical significance of the Childhood Myositis Assessment Scale for assessment of muscle function in the juvenile idiopathic inflammatory myopathies.

    PubMed

    Huber, Adam M; Feldman, Brian M; Rennebohm, Robert M; Hicks, Jeanne E; Lindsley, Carol B; Perez, Maria D; Zemel, Lawrence S; Wallace, Carol A; Ballinger, Susan H; Passo, Murray H; Reed, Ann M; Summers, Ronald M; White, Patience H; Katona, Ildy M; Miller, Frederick W; Lachenbruch, Peter A; Rider, Lisa G

    2004-05-01

    To examine the measurement characteristics of the Childhood Myositis Assessment Scale (CMAS) in children with juvenile idiopathic inflammatory myopathy (juvenile IIM), and to obtain preliminary data on the clinical significance of CMAS scores. One hundred eight children with juvenile IIM were evaluated on 2 occasions, 7-9 months apart, using various measures of physical function, strength, and disease activity. Interrater reliability, construct validity, and responsiveness of the CMAS were examined. The minimum clinically important difference (MID) and CMAS scores corresponding to various degrees of physical disability were estimated. The intraclass correlation coefficient for 26 patients assessed by 2 examiners was 0.89, indicating very good interrater reliability. The CMAS score correlated highly with the Childhood Health Assessment Questionnaire (C-HAQ) score and with findings on manual muscle testing (MMT) (r(s) = -0.73 and 0.73, respectively) and moderately with physician-assessed global disease activity and skin activity, parent-assessed global disease severity, and muscle magnetic resonance imaging (r(s) = -0.44 to -0.61), thereby demonstrating good construct validity. The standardized response mean was 0.81 (95% confidence interval 0.53, 1.09) in patients with at least 0.8 cm improvement on a 10-cm visual analog scale for physician-assessed global disease activity, indicating strong responsiveness. In bivariate regression models predicting physician-assessed global disease activity, MMT remained significant in models containing the CMAS (P = 0.03) while the C-HAQ did not (P = 0.4). Estimates of the MID ranged from 1.5 to 3.0 points on a 0-52-point scale. CMAS scores corresponding to no, mild, mild-to-moderate, and moderate physical disability, respectively, were 48, 45, 39, and 30. The CMAS exhibits good reliability, construct validity, and responsiveness, and is therefore a valid instrument for the assessment of physical function, muscle strength, and

  7. Validation of the questionnaire on hand function assessment in leprosy.

    PubMed

    Ferreira, Telma Leonel; Alvarez, Rosicler Rocha Aiza; Virmond, Marcos da Cunha Lopes

    2012-06-01

    To validate the psychometric properties of the questionnaire on hand function assessment in leprosy. The study was conducted with a convenience sample of 101 consecutive patients in Brasília (Central-Western Brazil), from June 2008 to July 2009. The individuals were adults affected by leprosy, with impairment of the ulnar, median and radial nerves. Interobserver and intraobserver reproducibility was analyzed through successive interviews, and construct validity was analyzed through association between age, clinical form of leprosy, duration of nerve injury, grip and pinch strength measured with a dynamometer, sensibility test performed with Semmes-Weinstein monofilaments and manual ability assessment using the Jebsen test of hand function. The weighted kappa coefficient was calculated and a Bland-Altman plot was constructed to assess the reproducibility of the instrument. For internal consistency, Cronbach's alpha coefficient was used. Pearson's correlation coefficient was calculated and a multiple regression model was used. The weighted kappa values for interobserver and intraobserver assessments ranged from 0.86 to 0.97 and from 0.85 to 0.97, respectively. The value of Cronbach's alpha coefficient was 0.967. Pearson's correlation coefficient showed an association (p < 0.001) among duration of nerve injury, grip and pinch strength, cutaneous sensibility and mean score in the Jebsen Test. The mean score of the questionnaire on hand function assessment in leprosy was associated with operational classification of leprosy, duration of nerve injury, grip strength, cutaneous sensibility and manual ability (p < 0.0001 for the model as a whole). The questionnaire on hand function assessment in leprosy presents almost perfect interobserver and intraobserver reproducibility, high internal consistency and correlation with operational classification of leprosy, duration of nerve injury, grip strength, cutaneous sensibility in the hands and manual ability.

  8. Cross-cultural adaptation and validation of the neonatal/infant Braden Q risk assessment scale.

    PubMed

    de Lima, Edson Luiz; de Brito, Maria José Azevedo; de Souza, Diba Maria Sebba Tosta; Salomé, Geraldo Magela; Ferreira, Lydia Masako

    2016-02-01

    To translate into Brazilian Portuguese and cross-culturally adapt the Neonatal/Infant Braden Q Risk Assessment Scale (Neonatal/Infant Braden Q Scale), and test the psychometric properties, reproducibility and validity of the instrument. There is a lack of studies on the development of pressure ulcers in children, especially in neonates. Thirty professionals participated in the cross-cultural adaptation of the Brazilian-Portuguese version of the scale. Fifty neonates of both sexes were assessed between July 2013 and June 2014. Reliability and reproducibility were tested in 20 neonates and construct validity was measured by correlating the Neonatal/Infant Braden Q Scale with the Braden Q Risk Assessment Scale (Braden Q Scale). Discriminant validity was assessed by comparing the scores of neonates with and without ulcers. The scale showed inter-rater reliability (ICC = 0.98; P < 0.001) and intra-rater reliability (ICC = 0.79; P < 0.001). A strong correlation was found between the Neonatal/Infant Braden Q Scale and Braden Q Scale (r = 0.96; P < 0.001). The cross-culturally adapted Brazilian version of the Neonatal/Infant Braden Q Scale is a reliable instrument, showing face, content and construct validity. Copyright © 2015 Tissue Viability Society. Published by Elsevier Ltd. All rights reserved.

  9. Validity and reliability of the Turkish version of the pressure ulcer prevention knowledge assessment instrument.

    PubMed

    Tulek, Zeliha; Polat, Cansu; Ozkan, Ilknur; Theofanidis, Dimitris; Togrol, Rifat Erdem

    2016-11-01

    Sound knowledge of pressure ulcers is important to enable good prevention. There are limited instruments assessing pressure ulcer knowledge. The Pressure Ulcer Prevention Knowledge Assessment Instrument is among the scales whose psychometric properties have been studied rigorously, and it reflects the latest evidence. This study aimed to evaluate the validity and reliability of the Turkish version of the Pressure Ulcer Prevention Knowledge Assessment Instrument (PUPKAI-T), an instrument that assesses knowledge of pressure ulcer prevention by using multiple-choice questions. Linguistic validity was verified through front-to-back translation. Psychometric properties of the instrument were studied on a sample of 150 nurses working in a tertiary hospital in Istanbul, Turkey. The content validity index of the translated instrument was 0.94, intra-class correlation coefficients were between 0.37 and 0.80, item difficulty indices were between 0.21 and 0.88, discrimination indices were 0.20-0.78, and the Kuder-Richardson coefficient for internal consistency was 0.803. The PUPKAI-T was found to be a valid and reliable tool to evaluate nurses' knowledge on pressure ulcer prevention. The PUPKAI-T may be a useful tool for determining educational needs of nurses on pressure ulcer prevention. Copyright © 2016 Tissue Viability Society. Published by Elsevier Ltd. All rights reserved.
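
    The abstract above cites item difficulty indices and a Kuder-Richardson coefficient for a multiple-choice instrument. The sketch below shows one common way to compute item difficulty (proportion answering correctly) and KR-20 from a matrix of 0/1 item scores. It assumes the population-variance convention for total scores (texts differ on this), and both the function names and the response matrix are hypothetical, not the study's data.

```python
from statistics import pvariance


def item_difficulty(responses):
    """Proportion of respondents answering each item correctly (0/1 scores)."""
    n = len(responses)
    k = len(responses[0])
    return [sum(row[j] for row in responses) / n for j in range(k)]


def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomously scored items.

    Assumption: total-score variance uses the population formula;
    some references use the sample variance instead.
    """
    k = len(responses[0])
    p = item_difficulty(responses)
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: rows are respondents, columns are items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(item_difficulty(responses), kr20(responses))
```

    KR-20 is the special case of Cronbach's alpha for items scored 0 or 1, which is why it suits a multiple-choice knowledge test.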

  10. Evaluation of the Validity and Response Burden of Patient Self-Report Measures of the Pain Assessment Screening Tool and Outcomes Registry (PASTOR).

    PubMed

    Cook, Karon F; Kallen, Michael A; Buckenmaier, Chester; Flynn, Diane M; Hanling, Steven R; Collins, Teresa S; Joltes, Kristin; Kwon, Kyung; Medina-Torne, Sheila; Nahavandi, Parisa; Suen, Joshua; Gershon, Richard

    2017-07-01

    In 2009, the Army Pain Management Task Force was chartered. On the basis of their findings, the Department of Defense recommended a comprehensive pain management strategy that included development of a standardized pain assessment system that would collect patient-reported outcomes data to inform the patient-provider clinical encounter. The result was the Pain Assessment Screening Tool and Outcomes Registry (PASTOR). The purpose of this study was to assess the validity and response burden of the patient-reported outcome measures in PASTOR. Data for analyses were collected from 681 individuals who completed PASTOR at baseline and follow-up as part of their routine clinical care. The survey tool included self-report measures of pain severity and pain interference (measured using the National Institutes of Health Patient-Reported Outcome Measurement Information System [PROMIS] and the Defense and Veterans Pain Rating scale). PROMIS measures of pain correlates also were administered. Validation analyses included estimation of score associations among measures, comparison of scores of known groups, responsiveness, ceiling and floor effects, and response burden. Results of psychometric testing provided substantial evidence for the validity of PASTOR self-report measures in this population. Expected associations among scores largely supported the concurrent validity of the measures. Scores effectively distinguished among respondents on the basis of their self-reported impressions of general health. PROMIS measures were administered using computer adaptive testing and each, on average, required less than 1 minute to administer. Statistical and graphical analyses demonstrated the responsiveness of PASTOR measures over time. Reprint & Copyright © 2017 Association of Military Surgeons of the U.S.

  11. Reliability and validity of the instrument used in BRFSS to assess physical activity.

    PubMed

    Yore, Michelle M; Ham, Sandra A; Ainsworth, Barbara E; Kruger, Judy; Reis, Jared P; Kohl, Harold W; Macera, Caroline A

    2007-08-01

    State-level statistics of adherence to the physical activity objectives in Healthy People 2010 are derived from the Behavioral Risk Factor Surveillance System (BRFSS) data. BRFSS physical activity questions were updated in 2001 to include domains of leisure time, household, and transportation-related activity of moderate and vigorous intensity, and walking questions. This article reports the reliability and validity of these questions. The BRFSS Physical Activity Study (BPAS) was conducted from September 2000 to May 2001 in Columbia, SC. Sixty participants were followed for 22 d; they answered the physical activity questions three times via telephone, wore a pedometer and accelerometer, and completed a daily physical activity log for 1 wk. Measures for moderate, vigorous, recommended (i.e., met the criteria for moderate or vigorous), and strengthening activities were created according to Healthy People 2010 operational definitions. Reliability and validity were assessed using Cohen's kappa (κ) and Pearson correlation coefficients. Seventy-three percent of participants met the recommended activity criteria compared with 45% in the total U.S. population. Test-retest reliability (κ) was 0.35-0.53 for moderate activity, 0.80-0.86 for vigorous activity, 0.67-0.84 for recommended activity, and 0.85-0.92 for strengthening. Validity (κ) of the survey (using the accelerometer as the standard) was 0.17-0.22 for recommended activity. Validity (κ) of the survey (using the physical activity log as the standard) was 0.40-0.52 for recommended activity. The validity and reliability of the BRFSS physical activity questions suggest that this instrument can classify groups of adults into the levels of recommended and vigorous activity as defined by Healthy People 2010. Repeated administration of these questions over time will help to identify trends in physical activity.

  12. Montreal-Toulouse Language Assessment Battery: evidence of criterion validity from patients with aphasia.

    PubMed

    Pagliarin, Karina Carlesso; Ortiz, Karin Zazo; Barreto, Simone dos Santos; Pimenta Parente, Maria Alice de Mattos; Nespoulous, Jean-Luc; Joanette, Yves; Fonseca, Rochele Paz

    2015-10-15

    The Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) provides a general description of language processing and related components in adults with brain injury. The present study aimed at verifying the criterion-related validity of the Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) by assessing its ability to discriminate between individuals with unilateral brain damage with and without aphasia. The investigation was carried out in a Brazilian community-based sample of 104 adults, divided into four groups: 26 participants with left hemisphere damage (LHD) with aphasia, 25 participants with right hemisphere damage (RHD), 28 with LHD non-aphasic, and 25 healthy adults. There were significant differences between patients with aphasia and the other groups on most total and subtotal scores on MTL-BR tasks. The results showed strong criterion-related validity evidence for the MTL-BR Battery, and provided important information regarding hemispheric specialization and interhemispheric cooperation. Future research is required to search for additional evidence of sensitivity, specificity and validity of the MTL-BR in samples with different types of aphasia and degrees of language impairment. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Using Dynamic Risk and Protective Factors to Predict Inpatient Aggression: Reliability and Validity of START Assessments

    PubMed Central

    Desmarais, Sarah L.; Nicholls, Tonia L.; Wilson, Catherine M.; Brink, Johann

    2012-01-01

    The Short-Term Assessment of Risk and Treatability (START) is a relatively new structured professional judgment guide for the assessment and management of short-term risks associated with mental, substance use, and personality disorders. The scheme may be distinguished from other violence risk instruments because of its inclusion of 20 dynamic factors that are rated in terms of both vulnerability and strength. This study examined the reliability and validity of START assessments in predicting inpatient aggression. Research assistants completed START assessments for 120 male forensic psychiatric patients through review of hospital files. They additionally completed Historical-Clinical-Risk Management – 20 (HCR-20) and the Hare Psychopathy Checklist: Screening Version (PCL:SV) assessments. Outcome data was coded from hospital files for a 12-month follow-up period using the Overt Aggression Scale (OAS). START assessments evidenced excellent interrater reliability and demonstrated both predictive and incremental validity over the HCR-20 Historical subscale scores and PCL:SV total scores. Overall, results support the reliability and validity of START assessments, and use of the structured professional judgment approach more broadly, as well as the value of using dynamic risk and protective factors to assess violence risk. PMID:22250595

  14. Content validation using an expert panel: assessment process for assistive technology adopted by farmers with disabilities.

    PubMed

    Mathew, S N; Field, W E; French, B F

    2011-07-01

    This article reports the use of an expert panel to perform content validation of an experimental assessment process for the safety of assistive technology (AT) adopted by farmers with disabilities. The validation process was conducted by a panel of six experts experienced in the subject matter, i.e., design, use, and assessment of AT for farmers with disabilities. The exercise included an evaluation session and two focus group sessions. The evaluation session consisted of using the assessment process under consideration by the panel to evaluate a set of nine ATs fabricated by a farmer on his farm site. The expert panel also participated in the focus group sessions conducted immediately before and after the evaluation session. The resulting data were analyzed using discursive analysis, and the results were incorporated into the final assessment process. The method and the results are presented with recommendations for the use of expert panels in research projects and validation of assessment tools.

  15. Validation of an Evaluation Model for Learning Management Systems

    ERIC Educational Resources Information Center

    Kim, S. W.; Lee, M. G.

    2008-01-01

    This study aims to validate a model for evaluating learning management systems (LMS) used in e-learning fields. A survey of 163 e-learning experts, regarding 81 validation items developed through literature review, was used to ascertain the importance of the criteria. A concise list of explanatory constructs, including two principal factors, was…

  16. The Transition Readiness Assessment Questionnaire (TRAQ): its factor structure, reliability, and validity.

    PubMed

    Wood, David L; Sawicki, Gregory S; Miller, M David; Smotherman, Carmen; Lukens-Bull, Katryne; Livingood, William C; Ferris, Maria; Kraemer, Dale F

    2014-01-01

    National consensus statements recommend that providers regularly assess the transition readiness skills of adolescents and young adults (AYA). In 2010 we developed a 29-item version of the Transition Readiness Assessment Questionnaire (TRAQ). We reevaluated item performance and factor structure, and reassessed the TRAQ's reliability and validity. We surveyed youth from 3 academic clinics in Jacksonville, Florida; Chapel Hill, North Carolina; and Boston, Massachusetts. Participants were AYA with special health care needs aged 14 to 21 years. From a convenience sample of 306 patients, we conducted item reduction strategies and exploratory factor analysis (EFA). On a second convenience sample of 221 patients, we conducted confirmatory factor analysis (CFA). Internal reliability was assessed by Cronbach's alpha, and criterion validity analyses were conducted using the Wilcoxon rank sum test and mixed linear models. The item reduction and EFA resulted in a 20-item scale with 5 identified subscales. The CFA conducted on a second sample provided a good fit to the data. The overall scale has high reliability (Cronbach's alpha = .94) and good reliability for 4 of the 5 subscales (Cronbach's alpha ranging from .90 to .77 in the pooled sample). Each of the 5 subscale scores was significantly higher for adolescents aged 18 years and older versus those younger than 18 (P < .0001) in both univariate and multivariate analyses. The 20-item, 5-factor structure for the TRAQ is supported by EFA and CFA on independent samples and has good internal reliability and criterion validity. Additional work is needed to expand or revise the TRAQ subscales and test their predictive validity. Copyright © 2014 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.

  17. Laboratory validation of MEMS-based sensors for post-earthquake damage assessment image

    NASA Astrophysics Data System (ADS)

    Pozzi, Matteo; Zonta, Daniele; Santana, Juan; Colin, Mikael; Saillen, Nicolas; Torfs, Tom; Amditis, Angelos; Bimpas, Matthaios; Stratakos, Yorgos; Ulieru, Dumitru; Bairaktaris, Dimitirs; Frondistou-Yannas, Stamatia; Kalidromitis, Vasilis

    2011-04-01

    The evaluation of seismic damage is today almost exclusively based on visual inspection, as building owners are generally reluctant to install permanent sensing systems, due to their high installation, management and maintenance costs. To overcome this limitation, the EU-funded MEMSCON project aims to produce small size sensing nodes for measurement of strain and acceleration, integrating Micro-Electro-Mechanical Systems (MEMS) based sensors and Radio Frequency Identification (RFID) tags in a single package that will be attached to reinforced concrete buildings. To reduce the impact of installation and management, data will be transmitted to a remote base station using a wireless interface. During the project, sensor prototypes were produced by assembling pre-existing components and by developing ex-novo miniature devices with ultra-low power consumption and sensing performance beyond that offered by sensors available on the market. The paper outlines the device operating principles, production scheme and working at both unit and network levels. It also reports on validation campaigns conducted in the laboratory to assess system performance. Accelerometer sensors were tested on a reduced scale metal frame mounted on a shaking table, back to back with reference devices, while strain sensors were embedded in both reduced and full-scale reinforced concrete specimens undergoing increasing deformation cycles up to extensive damage and collapse. The paper assesses the economical sustainability and performance of the sensors developed for the project and discusses their applicability to long-term seismic monitoring.

  18. The local lymph node assay and the assessment of relative potency: status of validation.

    PubMed

    Basketter, David A; Gerberick, Frank; Kimber, Ian

    2007-08-01

    For the prediction of skin sensitization potential, the local lymph node assay (LLNA) is a fully validated alternative to guinea-pig tests. More recently, information from LLNA dose-response analyses has been used to assess the relative potency of skin sensitizing chemicals. These data are then deployed for risk assessment and risk management. In this commentary, the utility and validity of these relative potency measurements are reviewed. It is concluded that the LLNA does provide a valuable assessment of relative sensitizing potency in the form of the estimated concentration of a chemical required to produce a threefold stimulation of draining lymph node cell proliferation compared with concurrent controls (EC3 value) and that all reasonable validation requirements have been addressed successfully. EC3 measurements are reproducible in both intra- and interlaboratory evaluations and are stable over time. It has been shown also, by several independent groups, that EC3 values correlate closely with data on relative human skin sensitization potency. Consequently, the recommendation made here is that LLNA EC3 measurements should now be regarded as a validated method for the determination of the relative potency of skin sensitizing chemicals, a conclusion that has already been reached by a number of independent expert groups.

  19. Validity and reliability of the Turkish Migraine Disability Assessment (MIDAS) questionnaire.

    PubMed

    Ertaş, Mustafa; Siva, Aksel; Dalkara, Turgay; Uzuner, Nevzat; Dora, Babür; Inan, Levent; Idiman, Fethi; Sarica, Yakup; Selçuki, Deniz; Sirin, Hadiye; Oğuzhanoğlu, Atilla; Irkeç, Ceyla; Ozmenoğlu, Mehmet; Ozbenli, Taner; Oztürk, Musa; Saip, Sabahattin; Neyal, Münife; Zarifoğlu, Mehmet

    2004-09-01

    The aim of this study is to assess the comprehensibility, internal consistency, patient-physician reliability, test-retest reliability, and validity of the Turkish version of the Migraine Disability Assessment (MIDAS) questionnaire in patients with headache. The MIDAS questionnaire was developed by Stewart et al and has been shown to be reliable and valid for determining the degree of disability caused by migraine. This study was designed as a national multicenter study to demonstrate the reliability and validity of the Turkish version of the MIDAS questionnaire. Patients presenting to 17 neurology clinics in Turkey were evaluated at the baseline (visit 1), week 4 (visit 2), and week 12 (visit 3) visits in terms of disease severity and comprehensibility, internal consistency, test-retest reliability, and validity of MIDAS. Since the severity of the disease was found to change significantly at visit 2 compared to visit 1, test-retest reliability was assessed using the MIDAS scores of a subgroup of patients whose disease severity remained unchanged (up to ±3 days difference in the number of days with headache between visits 1 and 2). A total of 306 patients (86.2% female, mean age: 35.0 ± 9.8 years) were enrolled into the study. A total of 65.7%, 77.5%, and 82.0% of patients reported that "they had fully understood the MIDAS questionnaire" in visits 1, 2, and 3, respectively. A highly positive correlation was found between physician-applied and patient-applied total MIDAS scores in all three visits (Spearman correlation coefficients were R = 0.87, 0.83, and 0.90, respectively, P < .001). Internal consistency of MIDAS was assessed using Cronbach's alpha and was found to be at acceptable (>0.7) or excellent (>0.8) levels in patient-applied and physician-applied MIDAS scores, respectively. Total MIDAS score showed good test-retest reliability (R = 0.68). Both the number of days with headache and the total MIDAS scores were positively correlated at all visits with correlation coefficients between 0.47 and

  20. Validation of self assessment patient knowledge questionnaire for heart failure patients.

    PubMed

    Lainscak, Mitja; Keber, Irena

    2005-12-01

    Several studies showed insufficient knowledge and poor compliance with non-pharmacological management in heart failure patients. Only a limited number of validated tools are available to assess their knowledge. The aim of the study was to test our 10-item Patient knowledge questionnaire. The Patient knowledge questionnaire was administered to 42 heart failure patients from the Heart failure clinic and to 40 heart failure patients receiving usual care. Construct validity (Pearson correlation coefficient), internal consistency (Cronbach alpha), reproducibility (Wilcoxon signed rank test), and reliability (chi-square test and Student's t-test for independent samples) were assessed. The overall score of the Patient knowledge questionnaire had the strongest correlation with the question about regular weighing (r=0.69) and the weakest with the question about presence of heart disease (r=0.33). There was a strong correlation between the question about fluid retention and the questions assessing regular weighing (r=0.86), weight of one litre of water (r=0.86), and salt restriction (r=0.57). The Cronbach alpha was 0.74 and could be improved by exclusion of questions about clear explanation (Cronbach alpha 0.75), importance of fruit, soup, and vegetables (Cronbach alpha 0.75), and self adjustment of diuretic (Cronbach alpha 0.81). During reproducibility testing, 91% to 98% of questions were answered identically. Patients from the Heart failure clinic scored significantly better than patients receiving usual care (7.9 (1.3) vs. 5.7 (2.2), p<0.001). The Patient knowledge questionnaire is a valid and reliable tool to measure knowledge of heart failure patients.
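
    The abstract above reports a Cronbach alpha and notes how it would change if individual questions were excluded. The following is a minimal sketch of that "alpha if item deleted" check, using the standard alpha formula with sample variances. The function names and the score matrix are hypothetical and are not the study's data or analysis code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item left out in turn (needs >= 3 items)."""
    k = len(scores[0])
    result = []
    for drop in range(k):
        reduced = [[v for j, v in enumerate(row) if j != drop] for row in scores]
        result.append(cronbach_alpha(reduced))
    return result


# Hypothetical scores: items 1 and 2 move together, item 3 does not.
scores = [
    [1, 1, 4],
    [2, 2, 1],
    [3, 3, 3],
    [4, 4, 2],
]
print(cronbach_alpha(scores))
print(alpha_if_item_deleted(scores))
```

    In this toy matrix, dropping the third item raises alpha, which mirrors the kind of reasoning behind the abstract's remark that excluding certain questions could improve internal consistency.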