validation test results: Topics by Science.gov

Sample records for validation test results

10 CFR 26.139 - Reporting initial validity and drug test results.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 10 Energy 1 2014-01-01 2014-01-01 false Reporting initial validity and drug test results. 26.139... § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall... permitted under § 26.75(h), positive test results from initial drug tests at the licensee testing facility...
10 CFR 26.139 - Reporting initial validity and drug test results.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 10 Energy 1 2012-01-01 2012-01-01 false Reporting initial validity and drug test results. 26.139... § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall... permitted under § 26.75(h), positive test results from initial drug tests at the licensee testing facility...
10 CFR 26.139 - Reporting initial validity and drug test results.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 10 Energy 1 2013-01-01 2013-01-01 false Reporting initial validity and drug test results. 26.139 Section 26.139 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall...
10 CFR 26.139 - Reporting initial validity and drug test results.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 10 Energy 1 2011-01-01 2011-01-01 false Reporting initial validity and drug test results. 26.139 Section 26.139 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall...
10 CFR 26.139 - Reporting initial validity and drug test results.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 10 Energy 1 2010-01-01 2010-01-01 false Reporting initial validity and drug test results. 26.139 Section 26.139 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall...
Evaluation of tools used to measure calcium and/or dairy consumption in adults.

PubMed

Magarey, Anthea; Baulderstone, Lauren; Yaxley, Alison; Markow, Kylie; Miller, Michelle

2015-05-01

To identify and critique tools for the assessment of Ca and/or dairy intake in adults, in order to ascertain the most accurate and reliable tools available. A systematic review of the literature was conducted using defined inclusion and exclusion criteria. Articles reporting on originally developed tools or testing the reliability or validity of existing tools that measure Ca and/or dairy intake in adults were included. Author-defined criteria for reporting reliability and validity properties were applied. Studies conducted in Western countries. Adults. Thirty papers, utilising thirty-six tools assessing intake of dairy, Ca or both, were identified. Reliability testing was conducted on only two dairy and five Ca tools, with results indicating that only one dairy and two Ca tools were reliable. Validity testing was conducted for all but four Ca-only tools. There was high reliance in validity testing on lower-order tests such as correlation and failure to differentiate between statistical and clinically meaningful differences. Results of the validity testing suggest one dairy and five Ca tools are valid. Thus one tool was considered both reliable and valid for the assessment of dairy intake and only two tools proved reliable and valid for the assessment of Ca intake. While several tools are reliable and valid, their application across adult populations is limited by the populations in which they were tested. These results indicate a need for tools that assess Ca and/or dairy intake in adults to be rigorously tested for reliability and validity.
Student mathematical imagination instruments: construction, cultural adaptation and validity

NASA Astrophysics Data System (ADS)

Dwijayanti, I.; Budayasa, I. K.; Siswono, T. Y. E.

2018-03-01

Imagination has an important role as the center of sensorimotor activity of the students. The purpose of this research is to construct an instrument for measuring students’ mathematical imagination in understanding the concept of algebraic expression. The researchers assess validity using questionnaire and test techniques and analyze the data using a descriptive method. The stages performed include: 1) constructing the embodiment of the imagination; 2) determining the learning style questionnaire; 3) constructing the instruments; 4) translating into Indonesian and adapting the learning style questionnaire content to student culture; 5) performing content validation. The results state that the constructed instrument is valid by content validation and empirical validation, so that it can be used with revisions. Content validation involves Indonesian linguists, English linguists and mathematics material experts. Empirical validation is done through a legibility test (10 students), which shows that in general the language used can be understood. In addition, a questionnaire test (86 students) was analyzed using the point-biserial correlation technique, resulting in 16 valid items, with a reliability test using KR-20 meeting the medium reliability criterion. The trial of the test instrument (32 students) found that all items are valid, with a reliability of 0.62 using KR-21.
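
The KR-20 and KR-21 coefficients named in this abstract can be illustrated with a minimal Python sketch. It assumes dichotomous (0/1) item scores and population variances; the function names and the toy response matrix are hypothetical and are not taken from the study.

```python
from statistics import mean, pvariance


def kr20(responses):
    """Kuder-Richardson 20 for rows of 0/1 item scores (one row per student)."""
    n = len(responses)
    k = len(responses[0])
    # Proportion of students answering each item correctly.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(pi * (1 - pi) for pi in p) / total_var)


def kr21(responses):
    """Kuder-Richardson 21, which treats all items as equally difficult."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    m = mean(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * pvariance(totals)))


# Hypothetical data: 4 students, 3 items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(toy), 3), round(kr21(toy), 3))
```

Because KR-21 uses only the mean and variance of total scores, it never exceeds KR-20 on the same data, which is one reason the two coefficients are not directly interchangeable.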
Process Skill Assessment Instrument: Innovation to measure student’s learning result holistically

NASA Astrophysics Data System (ADS)

Azizah, K. N.; Ibrahim, M.; Widodo, W.

2018-01-01

Science process skills (SPS) are very important skills for students. However, it is undeniable that SPS are not a main concern in primary school learning. This research aimed to develop a valid, practical, and effective assessment instrument to measure students’ SPS. The assessment instruments comprise a worksheet and a test. This development research used a one-group pre-test post-test design. Data were obtained with validation, observation, and test methods to investigate the validity, practicality, and effectiveness of the instruments. Results showed that the assessment instruments fall in the very valid category, the reliability is categorized as reliable, student SPS activities have a high percentage, and there is significant improvement in students’ SPS scores. It can be concluded that the SPS assessment instruments are valid, practical, and effective for measuring students’ SPS results.
The predictive value of the sacral base pressure test in detecting specific types of sacroiliac dysfunction

PubMed Central

Mitchell, Travis D.; Urli, Kristina E.; Breitenbach, Jacques; Yelverton, Chris

2007-01-01

Objective: This study aimed to evaluate the validity of the sacral base pressure test in diagnosing sacroiliac joint dysfunction. It also determined the predictive powers of the test in determining which type of sacroiliac joint dysfunction was present. Methods: This was a double-blind experimental study with 62 participants. The results from the sacral base pressure test were compared against a cluster of previously validated tests of sacroiliac joint dysfunction to determine its validity and predictive powers. The external rotation of the feet, occurring during the sacral base pressure test, was measured using a digital inclinometer. Results: There was no statistically significant difference in the results of the sacral base pressure test between the types of sacroiliac joint dysfunction. In terms of the results of validity, the sacral base pressure test was useful in identifying positive values of sacroiliac joint dysfunction. It was fairly helpful in correctly diagnosing patients with negative test results; however, it had only a “slight” agreement with the diagnosis for κ interpretation. Conclusions: In this study, the sacral base pressure test was not a valid test for determining the presence of sacroiliac joint dysfunction or the type of dysfunction present. Further research comparing the agreement of the sacral base pressure test or other sacroiliac joint dysfunction tests with a criterion standard of diagnosis is necessary. PMID:19674694
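
The predictive values and κ agreement discussed in this abstract can be computed from a 2x2 table of index test against reference diagnosis. The sketch below is an illustration only: the function name is hypothetical and the counts are made up, not the study's data.

```python
def agreement_stats(tp, fp, fn, tn):
    """Diagnostic accuracy and Cohen's kappa from a 2x2 table.

    tp/fp: index test positive, reference positive/negative.
    fn/tn: index test negative, reference positive/negative.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column margins.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Made-up counts for illustration.
stats = agreement_stats(tp=20, fp=5, fn=10, tn=15)
print({name: round(value, 3) for name, value in stats.items()})
```

A test can show a reasonable positive predictive value and still have low κ, because κ discounts the agreement expected by chance. On the widely used Landis and Koch scale, "slight" corresponds to κ between 0.00 and 0.20; the abstract does not say which scale the authors applied.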
Construct Validity of Neuropsychological Tests in Schizophrenia.

ERIC Educational Resources Information Center

Allen, Daniel N.; Aldarondo, Felito; Goldstein, Gerald; Huegel, Stephen G.; Gilbertson, Mark; van Kammen, Daniel P.

1998-01-01

The construct validity of neuropsychological tests in patients with schizophrenia was studied with 39 patients who were evaluated with a battery of six tests assessing attention, memory, and abstract reasoning abilities. Results support the construct validity of the neuropsychological tests in patients with schizophrenia. (SLD)
The CPT Reading Comprehension Test: A Validity Study.

ERIC Educational Resources Information Center

Napoli, Anthony R.; Raymond, Lanette A.; Coffey, Cheryl A.; Bosco, Diane M.

1998-01-01

Describes a study done at Suffolk County Community College (New York) that assessed the validity of the College Board's Computerized Placement Test in Reading Comprehension (CPT-R) by comparing test results of 1,154 freshmen with the results of the Degree of Power Reading Test. Results confirmed the CPT-R's reliability in identifying basic…
Validity and Reliability Testing of an e-learning Questionnaire for Chemistry Instruction

NASA Astrophysics Data System (ADS)

Guspatni, G.; Kurniawati, Y.

2018-04-01

The aim of this paper is to examine validity and reliability of a questionnaire used to evaluate e-learning implementation in chemistry instruction. 48 questionnaires were filled in by students who had studied chemistry through e-learning system. The questionnaire consisted of 20 indicators evaluating students’ perception on using e-learning. Parametric testing was done as data were assumed to follow normal distribution. Item validity of the questionnaire was examined through item-total correlation using Pearson’s formula while its reliability was assessed with Cronbach’s alpha formula. Moreover, convergent validity was assessed to see whether indicators building a factor had theoretically the same underlying construct. The result of validity testing revealed 19 valid indicators while the result of reliability testing revealed Cronbach’s alpha value of .886. The result of factor analysis showed that questionnaire consisted of five factors, and each of them had indicators building the same construct. This article shows the importance of factor analysis to get a construct valid questionnaire before it is used as research instrument.
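
The two statistics this abstract relies on, Pearson item-total correlation and Cronbach’s alpha, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the item-total correlation is left uncorrected (each item is included in its own total), population variances are used, and the function names and toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_total_correlations(scores):
    """Correlate each item column with the respondents' total scores."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    return [pearson([row[i] for row in scores], totals) for i in range(k)]


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 items that move together.
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(item_total_correlations(toy), cronbach_alpha(toy))
```

In practice an item is usually judged valid when its item-total correlation exceeds a critical value for the sample size; the threshold used in the study is not given in the abstract.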
Validation of the breast evaluation questionnaire for breast hypertrophy and breast reduction.

PubMed

Lewin, Richard; Elander, Anna; Lundberg, Jonas; Hansson, Emma; Thorarinsson, Andri; Claudelin, Malin; Bladh, Helena; Lidén, Mattias

2018-06-13

There is a lack of published, validated questionnaires for evaluating psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. To validate the breast evaluation questionnaire (BEQ), originally developed for the assessment of breast augmentation patients, for the assessment of psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. Validation study. Subjects: Women with macromastia. Methods: The validation of the BEQ, adapted to breast reduction, was performed in several steps. Content validity, reliability, construct validity and responsiveness were assessed. The original version was adjusted according to the results for content validity and resulted in item reduction and a modified BEQ (mBEQ) that was then assessed for reliability, construct validity and responsiveness. Internal and external validation was performed for the modified BEQ. Convergent validity was tested against Breast-Q (reduction) and discriminant validity was tested against the SF-36. Known-groups validation revealed significant differences between the normal population and patients undergoing breast reduction surgery. The BEQ showed good reliability by test-retest analysis and high responsiveness. The modified BEQ may be a reliable, valid and responsive instrument for assessing women who undergo breast reduction.
Validation of NASA Thermal Ice Protection Computer Codes. Part 1; Program Overview

NASA Technical Reports Server (NTRS)

Miller, Dean; Bond, Thomas; Sheldon, David; Wright, William; Langhals, Tammy; Al-Khalil, Kamel; Broughton, Howard

1996-01-01

The Icing Technology Branch at NASA Lewis has been involved in an effort to validate two thermal ice protection codes developed at the NASA Lewis Research Center. LEWICE/Thermal (electrothermal deicing & anti-icing), and ANTICE (hot-gas & electrothermal anti-icing). The Thermal Code Validation effort was designated as a priority during a 1994 'peer review' of the NASA Lewis Icing program, and was implemented as a cooperative effort with industry. During April 1996, the first of a series of experimental validation tests was conducted in the NASA Lewis Icing Research Tunnel(IRT). The purpose of the April 96 test was to validate the electrothermal predictive capabilities of both LEWICE/Thermal, and ANTICE. A heavily instrumented test article was designed and fabricated for this test, with the capability of simulating electrothermal de-icing and anti-icing modes of operation. Thermal measurements were then obtained over a range of test conditions, for comparison with analytical predictions. This paper will present an overview of the test, including a detailed description of: (1) the validation process; (2) test article design; (3) test matrix development; and (4) test procedures. Selected experimental results will be presented for de-icing and anti-icing modes of operation. Finally, the status of the validation effort at this point will be summarized. Detailed comparisons between analytical predictions and experimental results are contained in the following two papers: 'Validation of NASA Thermal Ice Protection Computer Codes: Part 2- The Validation of LEWICE/Thermal' and 'Validation of NASA Thermal Ice Protection Computer Codes: Part 3-The Validation of ANTICE'
Validation Test Report For The CRWMS Analysis and Logistics Visually Interactive Model Calvin Version 3.0, 10074-Vtr-3.0-00

DOE Office of Scientific and Technical Information (OSTI.GOV)

S. Gillespie

2000-07-27

This report describes the tests performed to validate the CRWMS "Analysis and Logistics Visually Interactive" Model (CALVIN) Version 3.0 (V3.0) computer code (STN: 10074-3.0-00). To validate the code, a series of test cases was developed in the CALVIN V3.0 Validation Test Plan (CRWMS M&O 1999a) that exercises the principal calculation models and options of CALVIN V3.0. Twenty-five test cases were developed: 18 logistics test cases and 7 cost test cases. These cases test the features of CALVIN in a sequential manner, so that the validation of each test case is used to demonstrate the accuracy of the input to subsequent calculations. Where necessary, the test cases utilize reduced-size data tables to make the hand calculations used to verify the results more tractable, while still adequately testing the code's capabilities. Acceptance criteria were established for the logistics and cost test cases in the Validation Test Plan (CRWMS M&O 1999a). The logistics test cases were developed to test the following CALVIN calculation models: Spent nuclear fuel (SNF) and reactivity calculations; Options for altering reactor life; Adjustment of commercial SNF (CSNF) acceptance rates for fiscal year calculations and mid-year acceptance start; Fuel selection, transportation cask loading, and shipping to the Monitored Geologic Repository (MGR); Transportation cask shipping to and storage at an Interim Storage Facility (ISF); Reactor pool allocation options; and Disposal options at the MGR. Two types of cost test cases were developed: cases to validate the detailed transportation costs, and cases to validate the costs associated with the Civilian Radioactive Waste Management System (CRWMS) Management and Operating Contractor (M&O) and Regional Servicing Contractors (RSCs). For each test case, values calculated using Microsoft Excel 97 worksheets were compared to CALVIN V3.0 scenarios with the same input data and assumptions.
All of the test case results compare with the CALVIN V3.0 results within the bounds of the acceptance criteria. Therefore, it is concluded that the CALVIN V3.0 calculation models and options tested in this report are validated.
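
The comparison pattern described in this report, independently calculated values checked against code output within acceptance criteria, can be sketched as follows. The abstract does not state what form the acceptance criteria took, so the relative tolerance, the function name, and the case names and values below are all assumptions made for illustration.

```python
def failing_cases(cases, rel_tol):
    """Return the names of cases whose code value misses the hand value.

    cases maps a case name to (hand_value, code_value); a case passes when
    the absolute difference is within rel_tol times the hand value.
    """
    failures = []
    for name, (hand_value, code_value) in cases.items():
        if abs(code_value - hand_value) > rel_tol * abs(hand_value):
            failures.append(name)
    return failures


# Hypothetical cases and a hypothetical 1% acceptance criterion.
cases = {
    "logistics_01": (1000.0, 1000.4),
    "cost_01": (250.0, 262.5),
}
print(failing_cases(cases, rel_tol=0.01))
```

Measuring the deviation relative to the hand-calculated value keeps the independent calculation as the reference, which matches the role the spreadsheet values play in the report.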
Development and validation of a knowledge test for health professionals regarding lifestyle modification.

PubMed

Talip, Whadi-ah; Steyn, Nelia P; Visser, Marianne; Charlton, Karen E; Temple, Norman

2003-09-01

We wanted to develop and validate a test that assesses the knowledge and practices of health professionals (HPs) with regard to the role of nutrition, physical activity, and smoking cessation (lifestyle modification) in chronic diseases of lifestyle. A descriptive cross-sectional validation study was carried out. The validation design consisted of two phases, namely 1) test planning and development and 2) test evaluation. The study sample consisted of five groups of HPs: dietitians, dietetic interns, general practitioners, medical students, and nurses. The overall response rate was 58%, resulting in a sample size of 186 participants. A test was designed to evaluate the knowledge and practices of HPs. The test was first evaluated by an expert group to ensure content, construct, and face validity. Thereafter, the questionnaire was tested on five groups of HPs to test for criterion validity. Internal consistency was evaluated by Cronbach's alpha. An expert panel ensured content, construct, and face validity of the test. Groups with the most training and exposure to nutrition (dietitians and dietetic interns) had the highest group mean score, ranging from 61% to 88%, whereas those with limited nutrition training (general practitioners, medical students, and nurses) had significantly lower scores, ranging from 26% to 80%. This result demonstrated criterion validity. Internal consistency of the overall test demonstrated a Cronbach's alpha of 0.99. Most HPs identified the mass media as their main source of information on lifestyle modification. These HPs also identified lack of time, lack of patient compliance, and lack of knowledge as barriers that prevent them from providing counseling on lifestyle modification. The results of this study showed that this test instrument identifies groups of health professionals with adequate training (knowledge) in lifestyle modification and those who require further training (knowledge).
Measuring social alienation in adolescence: translation and validation of the Jessor and Jessor Social Alienation Scale.

PubMed

Safipour, Jalal; Tessma, Mesfin Kassaye; Higginbottom, Gina; Emami, Azita

2010-12-01

The objective of the study is to translate and examine the reliability and validity of the Jessor and Jessor Social Alienation Scale for use in a Swedish context. The study involved four phases of testing: (1) Translation and back-translation; (2) a pilot test to evaluate the translation; (3) reliability testing; and (4) a validity test. Main participants of this study were 446 students (Age = 15-19, SD = 1.01, Mean = 17). Results from the reliability test showed high internal consistency and stability. Face, content and construct validity were demonstrated using experts and confirmatory factor analysis. The results of testing the Swedish version of the alienation scale revealed an acceptable level of reliability and validity, and is appropriate for use in the Swedish context. © 2010 The Authors. Scandinavian Journal of Psychology © 2010 The Scandinavian Psychological Associations.
Item Development and Validity Testing for a Self- and Proxy Report: The Safe Driving Behavior Measure

PubMed Central

Classen, Sherrilene; Winter, Sandra M.; Velozo, Craig A.; Bédard, Michel; Lanford, Desiree N.; Brumback, Babette; Lutz, Barbara J.

2010-01-01

OBJECTIVE We report on item development and validity testing of a self-report older adult safe driving behaviors measure (SDBM). METHOD On the basis of theoretical frameworks (Precede–Proceed Model of Health Promotion, Haddon’s matrix, and Michon’s model), existing driving measures, and previous research and guided by measurement theory, we developed items capturing safe driving behavior. Item development was further informed by focus groups. We established face validity using peer reviewers and content validity using expert raters. RESULTS Peer review indicated acceptable face validity. Initial expert rater review yielded a scale content validity index (CVI) rating of 0.78, with 44 of 60 items rated ≥0.75. Sixteen unacceptable items (≤0.5) required major revision or deletion. The next CVI scale average was 0.84, indicating acceptable content validity. CONCLUSION The SDBM has relevance as a self-report to rate older drivers. Future pilot testing of the SDBM comparing results with on-road testing will define criterion validity. PMID:20437917
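
A content validity index of the kind reported here is commonly computed as the share of expert raters who judge an item relevant, averaged over items for the scale-level value. The sketch below assumes a 4-point relevance scale on which ratings of 3 or 4 count as relevant; the abstract does not describe the rating scale, and the function names and ratings are hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Share of expert ratings at or above the relevance cut-off."""
    return sum(r >= relevant_min for r in ratings) / len(ratings)


def scale_cvi(rating_matrix, relevant_min=3):
    """Return (scale-level average CVI, list of item-level CVIs)."""
    items = [item_cvi(ratings, relevant_min) for ratings in rating_matrix]
    return sum(items) / len(items), items


# Hypothetical ratings: 3 items, each rated by 4 experts on a 1-4 scale.
ratings = [[4, 4, 3, 4], [4, 3, 2, 4], [2, 1, 3, 2]]
average, per_item = scale_cvi(ratings)
# Flag items at or below 0.5, the cut-off the abstract calls unacceptable.
flagged = [i for i, value in enumerate(per_item) if value <= 0.5]
print(round(average, 2), per_item, flagged)
```

Revising or deleting the flagged items and rating again raises the scale average, which is the pattern the abstract reports between its first and second review rounds.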
Content validity and reliability of test of gross motor development in Chilean children

PubMed Central

Cano-Cappellacci, Marcelo; Leyton, Fernanda Aleitte; Carreño, Joshua Durán

2016-01-01

Objective: To validate a Spanish version of the Test of Gross Motor Development (TGMD-2) for the Chilean population. Methods: Descriptive, cross-sectional, non-experimental validity and reliability study. Four translators, three experts and 92 Chilean children aged five to 10 years, students at a primary school in Santiago, Chile, participated. The Committee of Experts carried out translation, back-translation and revision processes to determine the translinguistic equivalence and content validity of the test, using the content validity index in 2013. In addition, a pilot implementation was carried out to determine the reliability of the test in Spanish, using the intraclass correlation coefficient and the Bland-Altman method. We evaluated whether the results differed significantly when the bat was replaced with a racket, using the t-test. Results: We obtained a content validity index higher than 0.80 for language clarity and relevance of the TGMD-2 for children. There were significant differences in the object control subtest when comparing the results with bat and racket. The intraclass correlation coefficient for inter-rater, intra-rater and test-retest reliability was greater than 0.80 in all cases. Conclusions: The TGMD-2 has appropriate content validity to be applied in the Chilean population. The reliability of this test is within the appropriate parameters and its use could be recommended in this population after the establishment of normative data, setting a further precedent for the validation in other Latin American countries. PMID:26815160
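
The Bland-Altman method mentioned in this abstract summarizes agreement between two sets of paired measurements by the mean difference (bias) and the 95% limits of agreement. A minimal sketch follows, assuming the usual bias ± 1.96 standard deviations of the differences; the function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    # 95% limits of agreement use the sample SD of the differences.
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical test and retest scores for four children.
test_scores = [10, 12, 14, 16]
retest_scores = [9, 12, 15, 16]
print(bland_altman(test_scores, retest_scores))
```

Unlike a correlation coefficient, the limits of agreement are expressed in the units of the score itself, so they show how far a retest can be expected to differ from the first administration.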
Construction and Evaluation of Reliability and Validity of Reasoning Ability Test

ERIC Educational Resources Information Center

Bhat, Mehraj A.

2014-01-01

This paper is based on the construction and evaluation of reliability and validity of reasoning ability test at secondary school students. In this paper an attempt was made to evaluate validity, reliability and to determine the appropriate standards to interpret the results of reasoning ability test. The test includes 45 items to measure six types…

An exploratory study into the effect of time-restricted internet access on face-validity, construct validity and reliability of postgraduate knowledge progress testing

PubMed Central

2013-01-01

Background Yearly formative knowledge testing (also known as progress testing) was shown to have a limited construct-validity and reliability in postgraduate medical education. One way to improve construct-validity and reliability is to improve the authenticity of a test. As easily accessible internet has become inseparably linked to daily clinical practice, we hypothesized that allowing internet access for a limited amount of time during the progress test would improve the perception of authenticity (face-validity) of the test, which would in turn improve the construct-validity and reliability of postgraduate progress testing. Methods Postgraduate trainees taking the yearly knowledge progress test were asked to participate in a study where they could access the internet for 30 minutes at the end of a traditional pen and paper test. Before and after the test they were asked to complete a short questionnaire regarding the face-validity of the test. Results Mean test scores increased significantly for all training years. Trainees indicated that the face-validity of the test improved with internet access and that they would like to continue to have internet access during future testing. Internet access did not improve the construct-validity or reliability of the test. Conclusion Improving the face-validity of postgraduate progress testing, by adding the possibility to search the internet for a limited amount of time, positively influences test performance and face-validity. However, it did not change the reliability or the construct-validity of the test. PMID:24195696
Identification student’s misconception of heat and temperature using three-tier diagnostic test

NASA Astrophysics Data System (ADS)

Suliyanah; Putri, H. N. P. A.; Rohmawati, L.

2018-03-01

The objective of this research is to develop a Three-Tier Diagnostic Test (TTDT) to identify students' misconceptions of heat and temperature. Stages of development include: analysis, planning, design, development, evaluation and revision. The results of this study show that (1) the quality of the three-tier type diagnostic test instrument developed has been expressed well with the following details: (a) Internal validity of 88.19%, which belongs to the valid category. (b) External validity of empirical construct validity test using Pearson Product Moment obtained 0.43 is classified and result of empirical construct validity test obtained false positives 6.1% and false negatives 5.9% then the instrument was valid. (c) Test reliability using Cronbach’s Alpha of 0.98, which means acceptable. (d) The 80% difficulty level test is quite difficult. (2) Student misconceptions on the temperature of heat and displacement materials based on the II test the highest (84%), the lowest (21%), and the non-misconceptions (7%). (3) The highest cause of misconception among students is associative thinking (22%) and the lowest is caused by incomplete or incomplete reasoning (11%). The Three-Tier Diagnostic Test (TTDT) could identify students' misconceptions of heat and temperature.
Designing the Nuclear Energy Attitude Scale.

ERIC Educational Resources Information Center

Calhoun, Lawrence; And Others

1988-01-01

Presents a refined method for designing a valid and reliable Likert-type scale to test attitudes toward the generation of electricity from nuclear energy. Discusses various tests of validity that were used on the nuclear energy scale. Reports results of administration and concludes that the test is both reliable and valid. (CW)
The Teenage Nonviolence Test: Concurrent and Discriminant Validity.

ERIC Educational Resources Information Center

Konen, Kristopher; Mayton, Daniel M., II; Delva, Zenita; Sonnen, Melinda; Dahl, William; Montgomery, Richard

This study was designed to document the validity of the Teenage Nonviolence Test (TNT). In this study the concurrent validity of the TNT in various ways, the validity of the TNT using known groups, and the discriminant validity of the TNT by evaluating its relationships with other psychological constructs were assessed. The results showed that the…
Reliability, Validity, and Cross-Cultural Adaptation of the Turkish Version of the Bournemouth Questionnaire.

PubMed

Gunaydin, Gurkan; Citaker, Seyit; Meray, Jale; Cobanoglu, Gamze; Gunaydin, Ozge Ece; Hazar Kanik, Zeynep

2016-11-01

Validation of a self-report questionnaire. The purpose of this study was to investigate the adaptation, validity, and reliability of the Turkish version of the Bournemouth Questionnaire. Low back pain is one of the most frequent disorders leading to activity limitation. This pain affects most people in their lives. The most important point in evaluating a patient's functional abilities and deciding on a successful therapy procedure is to manage the assessment questionnaires precisely. One hundred ten patients with chronic low back pain were included in the present study. To assess reliability, test-retest and internal consistency analyses were applied. The results of test-retest analysis were assessed using the Intraclass Correlation Coefficient method (95% confidence interval). For internal consistency, the Cronbach alpha value was calculated. Validity of the questionnaire was assessed in terms of construct validity. For construct validity, factor analysis and convergent validity were tested. For convergent validity, total points of the Bournemouth Questionnaire were compared with the total points of the Quebec Back Pain Disability Scale and the Roland Morris Disability Questionnaire using Pearson correlation coefficient analysis. The Cronbach alpha value was found to be 0.914, showing that this questionnaire has high internal consistency. The results of test-retest analysis varied between 0.851 and 0.927, which shows that test-retest results are highly correlated. Factor analysis indicated that this questionnaire had one factor. The Pearson correlation coefficient of the Bournemouth Questionnaire with the Roland Morris Disability Questionnaire was 0.703, and with the Quebec Back Pain Disability Scale it was 0.659. These results showed that the Bournemouth Questionnaire is very well correlated with the Roland Morris Disability Questionnaire and the Quebec Back Pain Disability Scale. The Turkish version of the Bournemouth Questionnaire is valid and reliable. 3.
Test Use and Abuse

ERIC Educational Resources Information Center

Tienken, Christopher H.

2015-01-01

The ubiquitous use of standardized test results to make varied judgments about educators, students, and schools within the public school system raises concerns of validity. If the test results have not been validated for making multiple determinations, then the decisions made about educators, students, schools, and school districts based on the…
Performance validation of the ANSER control laws for the F-18 HARV

NASA Technical Reports Server (NTRS)

Messina, Michael D.

1995-01-01

The ANSER control laws were implemented in Ada by NASA Dryden for flight test on the High Alpha Research Vehicle (HARV). The Ada implementation was tested in the hardware-in-the-loop (HIL) simulation, and results were compared to those obtained with the NASA Langley batch Fortran implementation of the control laws which are considered the 'truth model.' This report documents the performance validation test results between these implementations. This report contains the ANSER performance validation test plan, HIL versus batch time-history comparisons, simulation scripts used to generate checkcases, and detailed analysis of discrepancies discovered during testing.
Validation of the Information/Communications Technology Literacy Test

DTIC Science & Technology

2016-10-01

nested set. Table 11 presents the results of incremental validity analyses for job knowledge/performance criteria by MOS. Figure 7 presents much...Systems Operator-Analyst (25B) and Nodal Network Systems Operator-Maintainer (25N) MOS. This report documents technical procedures and results of the...research effort. Results suggest that the ICTL test has potential as a valid and highly efficient predictor of valued outcomes in Signal school MOS. Not
The Importance of Symptom Validity Testing in Adolescents and Young Adults Undergoing Assessments for Learning or Attention Difficulties

ERIC Educational Resources Information Center

Harrison, Allyson G.; Green, Paul; Flaro, Lloyd

2012-01-01

It is almost self-evident that test results will be unreliable and misleading if those undergoing assessments do not make a full effort on testing. Nevertheless, objective tests of effort have not typically been used with young adults to determine whether test results are valid or not. Because of the potential economic and/or recreational benefits…
The Chinese version of the Outcome Expectations for Exercise scale: validation study.

PubMed

Lee, Ling-Ling; Chiu, Yu-Yun; Ho, Chin-Chih; Wu, Shu-Chen; Watson, Roger

2011-06-01

Estimates of the reliability and validity of the English nine-item Outcome Expectations for Exercise (OEE) scale have been tested and found to be valid for use in various settings, particularly among older people, with good internal consistency and validity. Data on the use of the OEE scale among older Chinese people living in the community and how cultural differences might affect the administration of the OEE scale are limited. This study aimed to test the validity and reliability of the Chinese version of the Outcome Expectations for Exercise scale among older people. A cross-sectional validation study was designed to test the Chinese version of the OEE scale (OEE-C). Reliability was examined by testing both the internal consistency for the overall scale and the squared multiple correlation coefficient for the single item measure. The validity of the scale was tested on the basis of both a traditional psychometric test and a confirmatory factor analysis using structural equation modelling. The Mokken Scaling Procedure (MSP) was used to investigate if there were any hierarchical, cumulative sets of items in the measure. The OEE-C scale was tested in a group of older people in Taiwan (n=108, mean age=77.1). There was acceptable internal consistency (alpha=.85) and model fit in the scale. Evidence of the validity of the measure was demonstrated by the tests for criterion-related validity and construct validity. There was a statistically significant correlation between exercise outcome expectations and exercise self-efficacy (r=.34, p<.01). An analysis of the Mokken Scaling Procedure found that nine items of the scale were all retained in the analysis and the resulting scale was reliable and statistically significant (p=.0008). The results obtained in the present study provided acceptable levels of reliability and validity evidence for the Chinese Outcome Expectations for Exercise scale when used with older people in Taiwan. Future testing of the OEE-C scale needs to be carried out to see whether these results are generalisable to older Chinese people living in urban areas. Copyright © 2010 Elsevier Ltd. All rights reserved.
Performance Evaluation of a Data Validation System

NASA Technical Reports Server (NTRS)

Wong, Edmond (Technical Monitor); Sowers, T. Shane; Santi, L. Michael; Bickford, Randall L.

2005-01-01

Online data validation is a performance-enhancing component of modern control and health management systems. It is essential that performance of the data validation system be verified prior to its use in a control and health management system. A new Data Qualification and Validation (DQV) Test-bed application was developed to provide a systematic test environment for this performance verification. The DQV Test-bed was used to evaluate a model-based data validation package known as the Data Quality Validation Studio (DQVS). DQVS was employed as the primary data validation component of a rocket engine health management (EHM) system developed under NASA's NGLT (Next Generation Launch Technology) program. In this paper, the DQVS and DQV Test-bed software applications are described, and the DQV Test-bed verification procedure for this EHM system application is presented. Test-bed results are summarized and implications for EHM system performance improvements are discussed.
Veggie and the VEG-01 Hardware Validation Test

NASA Technical Reports Server (NTRS)

Massa, Gioia; Wheeler, Ray; Smith, Trent

2015-01-01

This presentation gives a brief overview of KSC plant science hardware for space and then details the Veggie hardware and the VEG-01 hardware validation test. The test results and future plans are discussed.
49 CFR 40.160 - What does the MRO do when a valid test result cannot be produced and a negative result is required?

Code of Federal Regulations, 2013 CFR

2013-10-01

... 49 Transportation 1 2013-10-01 2013-10-01 false What does the MRO do when a valid test result cannot be produced and a negative result is required? 40.160 Section 40.160 Transportation Office of the Secretary of Transportation PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Medical Review Officers and the Verification...
Correlation Results for a Mass Loaded Vehicle Panel Test Article Finite Element Models and Modal Survey Tests

NASA Technical Reports Server (NTRS)

Maasha, Rumaasha; Towner, Robert L.

2012-01-01

High-fidelity Finite Element Models (FEMs) were developed to support a recent test program at Marshall Space Flight Center (MSFC). The FEMs correspond to test articles used for a series of acoustic tests. Modal survey tests were used to validate the FEMs for five acoustic tests (a bare panel and four different mass-loaded panel configurations). An additional modal survey test was performed on the empty test fixture (orthogrid panel mounting fixture, between the reverb and anechoic chambers). Modal survey tests were used to test-validate the dynamic characteristics of FEMs used for acoustic test excitation. Modal survey testing and subsequent model correlation has validated the natural frequencies and mode shapes of the FEMs. The modal survey test results provide a basis for the analysis models used for acoustic loading response test and analysis comparisons.
Derivation and Applicability of Asymptotic Results for Multiple Subtests Person-Fit Statistics

PubMed Central

Albers, Casper J.; Meijer, Rob R.; Tendeiro, Jorge N.

2016-01-01

In high-stakes testing, it is important to check the validity of individual test scores. Although a test may, in general, result in valid test scores for most test takers, for some test takers, test scores may not provide a good description of a test taker’s proficiency level. Person-fit statistics have been proposed to check the validity of individual test scores. In this study, the theoretical asymptotic sampling distribution of two person-fit statistics that can be used for tests that consist of multiple subtests is first discussed. Second, a simulation study was conducted to investigate the applicability of this asymptotic theory for tests of finite length, in which the correlation between subtests and number of items in the subtests was varied. The authors showed that these distributions provide reasonable approximations, even for tests consisting of subtests of only 10 items each. These results have practical value because researchers do not have to rely on extensive simulation studies to simulate sampling distributions. PMID:29881053
Dynamic testing in schizophrenia: does training change the construct validity of a test?

PubMed

Wiedl, Karl H; Schöttke, Henning; Green, Michael F; Nuechterlein, Keith H

2004-01-01

Dynamic testing typically involves specific interventions for a test to assess the extent to which test performance can be modified, beyond level of baseline (static) performance. This study used a dynamic version of the Wisconsin Card Sorting Test (WCST) that is based on cognitive remediation techniques within a test-training-test procedure. From results of previous studies with schizophrenia patients, we concluded that the dynamic and static versions of the WCST should have different construct validity. This hypothesis was tested by examining the patterns of correlations with measures of executive functioning, secondary verbal memory, and verbal intelligence. Results demonstrated a specific construct validity of WCST dynamic (i.e., posttest) scores as an index of problem solving (Tower of Hanoi) and secondary verbal memory and learning (Auditory Verbal Learning Test), whereas the impact of general verbal capacity and selective attention (Verbal IQ, Stroop Test) was reduced. It is concluded that the construct validity of the test changes with dynamic administration and that this difference helps to explain why the dynamic version of the WCST predicts functional outcome better than the static version.
An Efficient Data Partitioning to Improve Classification Performance While Keeping Parameters Interpretable.

PubMed

Korjus, Kristjan; Hebart, Martin N; Vicente, Raul

2016-01-01

Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier's generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term "Cross-validation and cross-testing" improving this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do.
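The standard workflow that this abstract takes as its baseline (cross-validation to choose parameters, then a single evaluation on a held-out test set) can be sketched as follows. This is a minimal illustration only, not the authors' proposed "cross-validation and cross-testing" scheme. The 1-D k-nearest-neighbour classifier, the candidate values of k, the 25% test fraction, and the synthetic data are all assumptions introduced here.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D features)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    """Fraction of test points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def cv_accuracy(data, k, folds=5):
    """Mean validation accuracy over `folds` cross-validation splits."""
    scores = []
    for f in range(folds):
        val = data[f::folds]                                  # every folds-th point
        train = [p for i, p in enumerate(data) if i % folds != f]
        scores.append(accuracy(train, val, k))
    return sum(scores) / folds

def select_and_test(data, candidates=(1, 3, 5, 7), test_fraction=0.25, seed=0):
    """Pick k by cross-validation on the development data, then test once."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, dev = shuffled[:n_test], shuffled[n_test:]
    best_k = max(candidates, key=lambda k: cv_accuracy(dev, k))
    # The test set is touched only here, after k has been fixed.
    return best_k, accuracy(dev, test, best_k)

# Synthetic two-class data (hypothetical, for illustration only).
rng = random.Random(1)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(40)]
data += [(rng.gauss(3.0, 1.0), 1) for _ in range(40)]
best_k, test_acc = select_and_test(data)
print("selected k:", best_k, "held-out accuracy:", test_acc)
```

The trade-off the abstract describes is visible in `test_fraction`: a larger test set gives a more precise accuracy estimate but leaves fewer points for choosing k and fitting the model.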
34 CFR 462.11 - What must an application contain?

Code of Federal Regulations, 2010 CFR

2010-07-01

... the methodology and procedures used to measure the reliability of the test. (h) Construct validity... previous test, and results from validity, reliability, and equating or standard-setting studies undertaken... NRS educational functioning levels (content validity). Documentation of the extent to which the items...
Validity of Scientific Based Chemistry Android Module to Empower Science Process Skills (SPS) in Solubility Equilibrium

NASA Astrophysics Data System (ADS)

Antrakusuma, B.; Masykuri, M.; Ulfa, M.

2018-04-01

The evolution of Android technology can be applied to chemistry learning. One complex chemistry concept is solubility equilibrium, which requires science process skills (SPS). This study aims to determine: 1) the characteristics of a scientific-based chemistry Android module to empower SPS, and 2) the validity of the module based on content validity and a feasibility test. This research uses a Research and Development (R&D) approach. Research subjects were 135 students and three teachers at three high schools in Boyolali, Central Java. Content validity of the module was tested by seven experts using Aiken’s V technique, and the module feasibility was tested with students and teachers in each school. A characteristic of the chemistry module is that it can be accessed using an Android device. Validation of the module contents yielded V = 0.89 (valid), and the feasibility test yielded 81.63% (students) and 73.98% (teachers), indicating that the module met the "good" criterion.
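For readers unfamiliar with Aiken's V, the coefficient for one item is V = S / (n(c - 1)), where S is the sum over raters of (rating minus the lowest possible rating), n is the number of raters, and c is the number of rating categories. The sketch below is a minimal illustration of that formula. The 1-5 rating scale and the seven ratings are hypothetical and are not the study's data.

```python
def aiken_v(ratings, lowest, highest):
    """Aiken's V for one item: sum(r - lowest) / (n * (c - 1))."""
    n = len(ratings)
    if n == 0:
        raise ValueError("need at least one rating")
    if any(r < lowest or r > highest for r in ratings):
        raise ValueError("rating outside the scale")
    c = highest - lowest + 1                 # number of rating categories
    s = sum(r - lowest for r in ratings)     # total distance above the floor
    return s / (n * (c - 1))

# Hypothetical ratings from seven experts on an assumed 1-5 scale.
expert_ratings = [5, 5, 4, 4, 5, 3, 5]
print(round(aiken_v(expert_ratings, lowest=1, highest=5), 2))  # 24/28 -> 0.86
```

V ranges from 0 (every rater chose the lowest category) to 1 (every rater chose the highest), so values near 1 indicate strong expert agreement that the item is relevant.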

The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda

PubMed Central

Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert

2008-01-01

Background: The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. Methods: A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. Results: The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. Conclusion: This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda. PMID:19055716
The Geant4 physics validation repository

NASA Astrophysics Data System (ADS)

Wenzel, H.; Yarba, J.; Dotti, A.

2015-12-01

The Geant4 collaboration regularly performs validation and regression tests. The results are stored in a central repository and can be easily accessed via a web application. In this article we describe the Geant4 physics validation repository, which consists of a relational database storing experimental data and Geant4 test results, a Java API and a web application. The functionality of these components and the technology choices we made are also described.
The validity of three tests of temperament in guppies (Poecilia reticulata).

PubMed

Burns, James G

2008-11-01

Differences in temperament (consistent differences among individuals in behavior) can have important effects on fitness-related activities such as dispersal and competition. However, evolutionary ecologists have put limited effort into validating their tests of temperament. This article attempts to validate three standard tests of temperament in guppies: the open-field test, emergence test, and novel-object test. Through multiple reliability trials, and comparison of results between different types of test, this study establishes the confidence that can be placed in these temperament tests. The open-field test is shown to be a good test of boldness and exploratory behavior; the open-field test was reliable when tested in multiple ways. There were problems with the emergence test and novel-object test, which leads one to conclude that the protocols used in this study should not be considered valid tests for this species. (PsycINFO Database Record (c) 2008 APA, all rights reserved).
Reliability and validity of the revised Gibson Test of Cognitive Skills, a computer-based test battery for assessing cognition across the lifespan.

PubMed

Moore, Amy Lawson; Miller, Terissa M

2018-01-01

The purpose of the current study is to evaluate the validity and reliability of the revised Gibson Test of Cognitive Skills, a computer-based battery of tests measuring short-term memory, long-term memory, processing speed, logic and reasoning, visual processing, as well as auditory processing and word attack skills. This study included 2,737 participants aged 5-85 years. A series of studies was conducted to examine the validity and reliability using the test performance of the entire norming group and several subgroups. The evaluation of the technical properties of the test battery included content validation by subject matter experts, item analysis and coefficient alpha, test-retest reliability, split-half reliability, and analysis of concurrent validity with the Woodcock Johnson III Tests of Cognitive Abilities and Tests of Achievement. Results indicated strong sources of evidence of validity and reliability for the test, including internal consistency reliability coefficients ranging from 0.87 to 0.98, test-retest reliability coefficients ranging from 0.69 to 0.91, split-half reliability coefficients ranging from 0.87 to 0.91, and concurrent validity coefficients ranging from 0.53 to 0.93. The Gibson Test of Cognitive Skills-2 is a reliable and valid tool for assessing cognition in the general population across the lifespan.
How to test validity in orthodontic research: a mixed dentition analysis example.

PubMed

Donatelli, Richard E; Lee, Shin-Jae

2015-02-01

The data used to test the validity of a prediction method should be different from the data used to generate the prediction model. In this study, we explored whether an independent data set is mandatory for testing the validity of a new prediction method and how validity can be tested without independent new data. Several validation methods were compared in an example using the data from a mixed dentition analysis with a regression model. The validation errors of real mixed dentition analysis data and simulation data were analyzed for increasingly large data sets. The validation results of both the real and the simulation studies demonstrated that the leave-1-out cross-validation method had the smallest errors. The largest errors occurred in the traditional simple validation method. The differences between the validation methods diminished as the sample size increased. The leave-1-out cross-validation method seems to be an optimal validation method for improving the prediction accuracy in a data set with limited sample sizes. Copyright © 2015 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
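The leave-one-out procedure discussed in this abstract can be sketched for a one-predictor regression as follows, alongside a simple single-split validation for comparison. This is a minimal illustration, not the authors' analysis: the least-squares line fit, the use of mean absolute error as the validation error, and the data values are all assumptions made here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def loocv_mae(xs, ys):
    """Leave-one-out: refit without each point in turn, then predict that point."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(ys[i] - (slope * xs[i] + intercept)))
    return sum(errors) / len(errors)

def holdout_mae(xs, ys, n_train):
    """Simple validation: fit on the first n_train points, test on the rest."""
    slope, intercept = fit_line(xs[:n_train], ys[:n_train])
    errors = [abs(y - (slope * x + intercept))
              for x, y in zip(xs[n_train:], ys[n_train:])]
    return sum(errors) / len(errors)

# Hypothetical predictor/outcome pairs (illustrative numbers only).
xs = [20.1, 21.4, 22.0, 22.8, 23.5, 24.1, 24.9, 25.6]
ys = [19.8, 20.9, 21.1, 22.3, 22.4, 23.2, 23.9, 24.1]
print("leave-one-out MAE:", loocv_mae(xs, ys))
print("single-split MAE: ", holdout_mae(xs, ys, n_train=5))
```

Every observation serves as test data exactly once in the leave-one-out version, while each fit still uses all but one point. That is why the method suits small samples where setting aside an independent data set is costly.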
The Predictive Validity of the Metropolitan Readiness Tests, 1976 Edition.

ERIC Educational Resources Information Center

Nagle, Richard J.

1979-01-01

A sample of 176 first-grade children was tested on the Metropolitan Readiness Tests, 1976 Edition (MRT), during the initial month of school and was retested eight months later on the Stanford Achievement Test. Results demonstrated substantial validity of the MRT for predicting first-grade achievement. (Author/CTM)
Testing for purchasing power parity in 21 African countries using several unit root tests

NASA Astrophysics Data System (ADS)

Choji, Niri Martha; Sek, Siok Kun

2017-04-01

Purchasing power parity (PPP) is used as a basis for international income and expenditure comparison through the exchange rate theory. However, empirical studies disagree on the validity of PPP. In this paper, we test the validity of PPP using a panel data approach. We apply seven different panel unit root tests to test the validity of the PPP hypothesis based on quarterly data on the real effective exchange rate for 21 African countries for the period 1971:Q1 to 2012:Q4. All seven tests rejected the hypothesis of stationarity, meaning that absolute PPP does not hold in those African countries. This result confirms the claim from previous studies that standard panel unit root tests fail to support the PPP hypothesis.
Results from an Independent View on The Validation of Safety-Critical Space Systems

NASA Astrophysics Data System (ADS)

Silva, N.; Lopes, R.; Esper, A.; Barbosa, R.

2013-08-01

Independent verification and validation (IV&V) has been a key process for decades and is considered in several international standards. One of the activities described in the “ESA ISVV Guide” is independent test verification (stated as Integration/Unit Test Procedures and Test Data Verification). This activity is commonly overlooked, since customers do not readily see the added value of thoroughly checking the validation team's work (it could be seen as testing the tester's work). This article presents the consolidated results of a large set of independent test verification activities, including the main difficulties, the results obtained, and the advantages and disadvantages of these activities for industry. This study will support customers in opting in or out of this task in future IV&V contracts, since we provide concrete results from real case studies in the space embedded systems domain.
Validity and Reliability of the 8-Item Work Limitations Questionnaire.

PubMed

Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C

2017-12-01

Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
The Geant4 physics validation repository

DOE PAGES

Wenzel, H.; Yarba, J.; Dotti, A.

2015-12-23

The Geant4 collaboration regularly performs validation and regression tests. The results are stored in a central repository and can be easily accessed via a web application. In this article we describe the Geant4 physics validation repository, which consists of a relational database storing experimental data and Geant4 test results, a Java API and a web application. Lastly, the functionality of these components and the technology choices we made are also described.
An Efficient Data Partitioning to Improve Classification Performance While Keeping Parameters Interpretable

PubMed Central

Korjus, Kristjan; Hebart, Martin N.; Vicente, Raul

2016-01-01

Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier’s generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term “Cross-validation and cross-testing” improving this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do. PMID:27564393
Validation of laboratory-scale recycling test method of paper PSA label products

Treesearch

Carl Houtman; Karen Scallon; Richard Oldack

2008-01-01

Starting with test methods and a specification developed by the U.S. Postal Service (USPS) Environmentally Benign Pressure Sensitive Adhesive Postage Stamp Program, a laboratory-scale test method and a specification were developed and validated for pressure-sensitive adhesive labels. By comparing results from this new test method and pilot-scale tests, which have been...
Experimental investigation of an RNA sequence space

NASA Technical Reports Server (NTRS)

Lee, Youn-Hyung; Dsouza, Lisa; Fox, George E.

1993-01-01

Modern rRNAs are the historic consequence of an ongoing evolutionary exploration of a sequence space. These extant sequences belong to a special subset of the sequence space that is comprised only of those primary sequences that can validly perform the biological function(s) required of the particular RNA. If it were possible to readily identify all such valid sequences, stochastic predictions could be made about the relative likelihood of various evolutionary pathways available to an RNA. Herein an experimental system which can assess whether a particular sequence is likely to have validity as a eubacterial 5S rRNA is described. A total of ten naturally occurring, and hence known to be valid, sequences and two point mutants of unknown validity were used to test the usefulness of the approach. Nine of the ten valid sequences tested positive whereas both mutants tested as clearly defective. The tenth valid sequence gave results that would be interpreted as reflecting a borderline status were the answer not known. These results demonstrate that it is possible to experimentally determine which sequences in local regions of the sequence space are potentially valid 5S rRNAs.
Applying modern psychometric techniques to melodic discrimination testing: Item response theory, computerised adaptive testing, and automatic item generation.

PubMed

Harrison, Peter M C; Collins, Tom; Müllensiefen, Daniel

2017-06-15

Modern psychometric theory provides many useful tools for ability testing, such as item response theory, computerised adaptive testing, and automatic item generation. However, these techniques have yet to be integrated into mainstream psychological practice. This is unfortunate, because modern psychometric techniques can bring many benefits, including sophisticated reliability measures, improved construct validity, avoidance of exposure effects, and improved efficiency. In the present research we therefore use these techniques to develop a new test of a well-studied psychological capacity: melodic discrimination, the ability to detect differences between melodies. We calibrate and validate this test in a series of studies. Studies 1 and 2 respectively calibrate and validate an initial test version, while Studies 3 and 4 calibrate and validate an updated test version incorporating additional easy items. The results support the new test's viability, with evidence for strong reliability and construct validity. We discuss how these modern psychometric techniques may also be profitably applied to other areas of music psychology and psychological science in general.
Improving Flight Software Module Validation Efforts: A Modular, Extendable Testbed Software Framework

NASA Technical Reports Server (NTRS)

Lange, R. Connor

2012-01-01

Ever since Explorer-1, the United States' first Earth satellite, was developed and launched in 1958, JPL has developed many more spacecraft, including landers and orbiters. While these spacecraft vary greatly in their missions, capabilities, and destinations, they all have something in common: all of their components had to be comprehensively tested. While thorough testing is important to mitigate risk, it is also a very expensive and time-consuming process. Thankfully, since virtually all of the software testing procedures for SMAP are computer controlled, these procedures can be automated. Most people testing SMAP flight software (FSW) would only need to write tests that exercise specific requirements and then check the filtered results to verify everything occurred as planned. This gives developers the ability to automatically launch tests on the testbed, distill the resulting logs into only the important information, generate validation documentation, and then deliver the documentation to management. With many of the steps in FSW testing automated, developers can use their limited time more effectively and can validate SMAP FSW modules more quickly and test them more rigorously. As a result of the various benefits of automating much of the testing process, management is considering this automated tool's use in future FSW validation efforts.
Development and Validation of the Musical Ear Training Assessment (META)

ERIC Educational Resources Information Center

Wolf, Anna; Kopiez, Reinhard

2018-01-01

In the following study, we have developed an assessment instrument for the practice-dependent skill of analytical hearing following a strict test theoretical validation, resulting in the Musical Ear Training Assessment (META). By means of three pilot studies, a developmental study, and a validation study, we verified a one-dimensional test model…
Design and validation of a comprehensive fecal incontinence questionnaire.

PubMed

Macmillan, Alexandra K; Merrie, Arend E H; Marshall, Roger J; Parry, Bryan R

2008-10-01

Fecal incontinence can have a profound effect on quality of life. Its prevalence remains uncertain because of stigma, lack of consistent definition, and dearth of validated measures. This study was designed to develop a valid clinical and epidemiologic questionnaire, building on current literature and expertise. Patients and experts undertook face validity testing. Construct validity, criterion validity, and test-retest reliability were assessed. Construct validity comprised factor analysis and internal consistency of the quality of life scale. The validity of known groups was tested against 77 control subjects by using regression models. Questionnaire results were compared with a stool diary for criterion validity. Test-retest reliability was calculated from repeated questionnaire completion. The questionnaire achieved good face validity. It was completed by 104 patients. The quality of life scale had four underlying traits (factor analysis) and high internal consistency (overall Cronbach alpha = 0.97). Patients and control subjects answered the questionnaire significantly differently (P < 0.01) in known-groups validity testing. Criterion validity assessment found mean differences close to zero. Median reliability for the whole questionnaire was 0.79 (range, 0.35-1). This questionnaire compares favorably with other available instruments, although the interpretation of stool consistency requires further research. Its sensitivity to treatment still needs to be investigated.
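Internal consistency figures such as the Cronbach alpha quoted above are computed from item-level scores as alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is a minimal illustration of that formula. The response matrix is hypothetical and the choice of sample variance is an assumption; none of it comes from the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: five respondents, four items on a 1-5 scale.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
print("alpha:", cronbach_alpha(responses))
```

Alpha approaches 1 when items rise and fall together across respondents. The function raises a `ZeroDivisionError` if every respondent has the same total score, because the total variance is then zero.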
Performance Validity Testing in Neuropsychology: Scientific Basis and Clinical Application-A Brief Review.

PubMed

Greher, Michael R; Wodushek, Thomas R

2017-03-01

Performance validity testing refers to neuropsychologists' methodology for determining whether neuropsychological test performances completed in the course of an evaluation are valid (ie, the results of true neurocognitive function) or invalid (ie, overly impacted by the patient's effort/engagement in testing). This determination relies upon the use of either standalone tests designed for this sole purpose, or specific scores/indicators embedded within traditional neuropsychological measures that have demonstrated this utility. In response to a greater appreciation for the critical role that performance validity issues play in neuropsychological testing and the need to measure this variable to the best of our ability, the scientific base for performance validity testing has expanded greatly over the last 20 to 30 years. As such, the majority of current day neuropsychologists in the United States use a variety of measures for the purpose of performance validity testing as part of everyday forensic and clinical practice and address this issue directly in their evaluations. The following is the first article of a 2-part series that will address the evolution of performance validity testing in the field of neuropsychology, both in terms of the science as well as the clinical application of this measurement technique. The second article of this series will review performance validity tests in terms of methods for development of these measures, and maximizing of diagnostic accuracy.
Evaluating the dynamic response of in-flight thrust calculation techniques during throttle transients

NASA Technical Reports Server (NTRS)

Ray, Ronald J.

1994-01-01

New flight test maneuvers and analysis techniques for evaluating the dynamic response of in-flight thrust models during throttle transients have been developed and validated. The approach is based on the aircraft and engine performance relationship between thrust and drag. Two flight test maneuvers, a throttle step and a throttle frequency sweep, were developed and used in the study. Graphical analysis techniques, including a frequency domain analysis method, were also developed and evaluated. They provide quantitative and qualitative results. Four thrust calculation methods were used to demonstrate and validate the test technique. Flight test applications on two high-performance aircraft confirmed the test methods as valid and accurate. These maneuvers and analysis techniques were easy to implement and use. Flight test results indicate the analysis techniques can identify the combined effects of model error and instrumentation response limitations on the calculated thrust value. The methods developed in this report provide an accurate approach for evaluating, validating, or comparing thrust calculation methods for dynamic flight applications.
Test-Retest Reliability and Predictive Validity of the Implicit Association Test in Children

ERIC Educational Resources Information Center

Rae, James R.; Olson, Kristina R.

2018-01-01

The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many…

Readability Level of Standardized Test Items and Student Performance: The Forgotten Validity Variable

ERIC Educational Resources Information Center

Hewitt, Margaret A.; Homan, Susan P.

2004-01-01

Test validity issues considered by test developers and school districts rarely include individual item readability levels. In this study, items from a major standardized test were examined for individual item readability level and item difficulty. The Homan-Hewitt Readability Formula was applied to items across three grade levels. Results of…
Performance and Symptom Validity Testing as a Function of Medical Board Evaluation in U.S. Military Service Members with a History of Mild Traumatic Brain Injury.

PubMed

Armistead-Jehle, Patrick; Cole, Wesley R; Stegman, Robert L

2018-02-01

The study was designed to replicate and extend previous findings demonstrating the high rates of invalid neuropsychological testing in military service members (SMs) with a history of mild traumatic brain injury (mTBI) assessed in the context of a medical evaluation board (MEB). Two hundred thirty-one active duty SMs (61 of whom were undergoing an MEB) underwent neuropsychological assessment. Performance validity (Word Memory Test) and symptom validity (MMPI-2-RF) test data were compared across those evaluated within disability (MEB) and clinical contexts. As with previous studies, there were significantly more individuals in an MEB context who failed performance (MEB = 57%, non-MEB = 31%) and symptom validity testing (MEB = 57%, non-MEB = 22%), and performance validity testing had a notable effect on cognitive test scores. Performance and symptom validity test failure rates did not vary as a function of the reason for disability evaluation when divided into behavioral versus physical health conditions. These data are consistent with past studies and extend those studies by including symptom validity testing and investigating the effect of reason for MEB. This and previous studies demonstrate that more than 50% of SMs seen in the context of an MEB will fail performance validity tests and over-report on symptom validity measures. These results emphasize the importance of using both performance and symptom validity testing when evaluating SMs with a history of mTBI, especially if they are being seen for disability evaluations, in order to ensure the accuracy of cognitive and psychological test data. Published by Oxford University Press 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Optimal test selection for prediction uncertainty reduction

DOE PAGES

Mullins, Joshua; Mahadevan, Sankaran; Urbina, Angel

2016-12-02

Economic factors and experimental limitations often lead to sparse and/or imprecise data used for the calibration and validation of computational models. This paper addresses resource allocation for calibration and validation experiments, in order to maximize their effectiveness within given resource constraints. When observation data are used for model calibration, the quality of the inferred parameter descriptions is directly affected by the quality and quantity of the data. This paper characterizes parameter uncertainty within a probabilistic framework, which enables the uncertainty to be systematically reduced with additional data. The validation assessment is also uncertain in the presence of sparse and imprecise data; therefore, this paper proposes an approach for quantifying the resulting validation uncertainty. Since calibration and validation uncertainty affect the prediction of interest, the proposed framework explores the decision of cost versus importance of data in terms of the impact on the prediction uncertainty. Often, calibration and validation tests may be performed for different input scenarios, and this paper shows how the calibration and validation results from different conditions may be integrated into the prediction. Then, a constrained discrete optimization formulation that selects the number of tests of each type (calibration or validation at given input conditions) is proposed. Furthermore, the proposed test selection methodology is demonstrated on a microelectromechanical system (MEMS) example.
Comprehensive validation scheme for in situ fiber optics dissolution method for pharmaceutical drug product testing.

PubMed

Mirza, Tahseen; Liu, Qian Julie; Vivilecchia, Richard; Joshi, Yatindra

2009-03-01

There has been a growing interest during the past decade in the use of fiber optics dissolution testing. Use of this novel technology is mainly confined to research and development laboratories. It has not yet emerged as a tool for end product release testing despite its ability to generate in situ results and efficiency improvement. One potential reason may be the lack of clear validation guidelines that can be applied for the assessment of suitability of fiber optics. This article describes a comprehensive validation scheme and development of a reliable, robust, reproducible and cost-effective dissolution test using fiber optics technology. The test was successfully applied for characterizing the dissolution behavior of a 40-mg immediate-release tablet dosage form that is under development at Novartis Pharmaceuticals, East Hanover, New Jersey. The method was validated for the following parameters: linearity, precision, accuracy, specificity, and robustness. In particular, robustness was evaluated in terms of probe sampling depth and probe orientation. The in situ fiber optic method was found to be comparable to the existing manual sampling dissolution method. Finally, the fiber optic dissolution test was successfully performed by different operators on different days, to further enhance the validity of the method. The results demonstrate that the fiber optics technology can be successfully validated for end product dissolution/release testing. (c) 2008 Wiley-Liss, Inc. and the American Pharmacists Association
Validation and cross-cultural pilot testing of compliance with standard precautions scale: self-administered instrument for clinical nurses.

PubMed

Lam, Simon C

2014-05-01

To perform detailed psychometric testing of the compliance with standard precautions scale (CSPS) in measuring compliance with standard precautions of clinical nurses and to conduct cross-cultural pilot testing and assess the relevance of the CSPS on an international platform. A cross-sectional and correlational design with repeated measures. Nursing students from a local registered nurse training university, nurses from different hospitals in Hong Kong, and experts in an international conference. The psychometric properties of the CSPS were evaluated via internal consistency, 2-week and 3-month test-retest reliability, concurrent validation, and construct validation. The cross-cultural pilot testing and relevance check was examined by experts on infection control from various developed and developing regions. Among 453 participants, 193 were nursing students, 165 were enrolled nurses, and 95 were registered nurses. The results showed that the CSPS had satisfactory reliability (Cronbach α = 0.73; intraclass correlation coefficient, 0.79 for 2-week test-retest and 0.74 for 3-month test-retest) and validity (optimum correlation with criterion measure; r = 0.76, P < .001; satisfactory results on known-group method and hypothesis testing). A total of 19 experts from 16 countries assured that most of the CSPS findings were relevant and globally applicable. The CSPS demonstrated satisfactory results on the basis of the standard international criteria on psychometric testing, which ascertained the reliability and validity of this instrument in measuring the compliance of clinical nurses with standard precautions. The cross-cultural pilot testing further reinforced the instrument's relevance and applicability in most developed and developing regions.
Chandra X-ray Center Science Data Systems Regression Testing of CIAO

NASA Astrophysics Data System (ADS)

Lee, N. P.; Karovska, M.; Galle, E. C.; Bonaventura, N. R.

2011-07-01

The Chandra Interactive Analysis of Observations (CIAO) is a software system developed for the analysis of Chandra X-ray Observatory observations. An important component of a successful CIAO release is the repeated testing of the tools across various platforms to ensure consistent and scientifically valid results. We describe the procedures of the scientific regression testing of CIAO and the enhancements made to the testing system to increase the efficiency of run time and result validation.
Testing Reading Comprehension of Theoretical Discourse with Cloze.

ERIC Educational Resources Information Center

Greene, Benjamin B., Jr.

2001-01-01

Presents evidence from a large sample of reading test scores for the validity of cloze-based assessments of reading comprehension for the discourse typically encountered in introductory college economics textbooks. Notes that results provide strong evidence that appropriately designed cloze tests permit valid assessments of reading comprehension…
Inventory of Motive of Preference for Conventional Paper-and-Pencil Tests: A Study of Validity and Reliability

ERIC Educational Resources Information Center

Eser, Mehmet Taha; Dogan, Nuri

2017-01-01

Purpose: The objective of this study is to develop the Inventory of Motive of Preference for Conventional Paper-And-Pencil Tests and to evaluate students' motives for preferring written tests, short-answer tests, true/false tests or multiple-choice tests. This will add a measurement tool to the literature with valid and reliable results to help…
Clinical Functional Capacity Testing in Patients With Facioscapulohumeral Muscular Dystrophy: Construct Validity and Interrater Reliability of Antigravity Tests.

PubMed

Rijken, Noortje H; van Engelen, Baziel G; Weerdesteyn, Vivian; Geurts, Alexander C

2015-12-01

To evaluate the construct validity and interrater reliability of 4 simple antigravity tests in a small group of patients with facioscapulohumeral muscular dystrophy (FSHD). Case-control study. University medical center. Patients with various severity levels of FSHD (n=9) and healthy control subjects (n=10) were included (N=19). Not applicable. A 4-point ordinal scale was designed to grade performance on the following 4 antigravity tests: sit to stance, stance to sit, step up, and step down. In addition, the 6-minute walk test, 10-m walking test, Berg Balance Scale, and timed Up and Go test were administered as conventional tests. Construct validity was determined by linear regression analysis using the Clinical Severity Score (CSS) as the dependent variable. Interrater agreement was tested using a κ analysis. Patients with FSHD performed worse on all 4 antigravity tests compared with the controls. Stronger correlations were found within than between test categories (antigravity vs conventional). The antigravity tests revealed the highest explained variance with regard to the CSS (R(2)=.86, P=.014). Interrater agreement was generally good. The results of this exploratory study support the construct validity and interrater reliability of the proposed antigravity tests for the assessment of functional capacity in patients with FSHD taking into account the use of compensatory strategies. Future research should further validate these results in a larger sample of patients with FSHD. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Statistical methodology: II. Reliability and validity assessment in study design, Part B.

PubMed

Karras, D J

1997-02-01

Validity measures the correspondence between a test and other purported measures of the same or similar qualities. When a reference standard exists, a criterion-based validity coefficient can be calculated. If no such standard is available, the concepts of content and construct validity may be used, but quantitative analysis may not be possible. The Pearson and Spearman tests of correlation are often used to assess the correspondence between tests, but do not account for measurement biases and may yield misleading results. Techniques that measure intertest differences may be more meaningful in validity assessment, and the kappa statistic is useful for analyzing categorical variables. Questionnaires often can be designed to allow quantitative assessment of reliability and validity, although this may be difficult. Inclusion of homogeneous questions is necessary to assess reliability. Analysis is enhanced by using Likert scales or similar techniques that yield ordinal data. Validity assessment of questionnaires requires careful definition of the scope of the test and comparison with previously validated tools.
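Two points in the abstract above lend themselves to a short sketch: a correlation coefficient can be perfect even when one test is systematically biased relative to the other, and the kappa statistic corrects categorical agreement for chance. The code below is an illustration only; the function names and all numbers are hypothetical, and Cohen's two-rater kappa is assumed as the kappa variant.

```python
from collections import Counter
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_difference(x, y):
    """Average of (y - x): a simple estimate of systematic bias."""
    return sum(b - a for a, b in zip(x, y)) / len(x)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical scores: the new test reads 10 units high on every subject.
reference = [50, 60, 70, 80, 90]
new_test = [value + 10 for value in reference]
print(pearson_r(reference, new_test))        # correlation is (numerically) 1
print(mean_difference(reference, new_test))  # yet the bias is 10 units

# Hypothetical categorical ratings from two raters.
ratings_a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg"]
ratings_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
print(round(cohen_kappa(ratings_a, ratings_b), 3))
```

In the rating example the raters agree on 8 of 10 cases, but because about half of that agreement is expected by chance, kappa is noticeably lower than the raw 80% figure.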
Evaluating a technical university's placement test using the Rasch measurement model

NASA Astrophysics Data System (ADS)

Salleh, Tuan Salwani; Bakri, Norhayati; Zin, Zalhan Mohd

2016-10-01

This study discusses the process of validating a mathematics placement test at a technical university. The main objective is to produce a valid and reliable test to measure students' prerequisite knowledge to learn engineering technology mathematics. It is crucial to have a valid and reliable test, as the results will be used in critical decision making to assign students to different groups of Technical Mathematics 1. The placement test, which consists of 50 mathematics questions, was administered to 82 new diploma students in engineering technology at a technical university. This study employed the Rasch measurement model to analyze the data using the Winsteps software. The results revealed that ten test questions had difficulty levels below the ability of the less able students. Nevertheless, all ten questions satisfied the infit and outfit standard values. Thus, all the questions can be reused in future placement tests at the technical university.
Examinee Noneffort and the Validity of Program Assessment Results

ERIC Educational Resources Information Center

Wise, Steven L.; DeMars, Christine E.

2010-01-01

Educational program assessment studies often use data from low-stakes tests to provide evidence of program quality. The validity of scores from such tests, however, is potentially threatened by examinee noneffort. This study investigated the extent to which one type of noneffort--rapid-guessing behavior--distorted the results from three types of…
Validity of the Mayer-Salovey-Caruso Emotional Intelligence Test: Youth Version-Research Edition

ERIC Educational Resources Information Center

Peters, Christine; Kranzler, John H.; Rossen, Eric

2009-01-01

This study examines the criterion-related validity evidence of scores on the Mayer-Salovey-Caruso Emotional Intelligence Test: Youth Version-Research Version. The authors also investigate the relationship between scores on the MSCEIT-YV and chronological age. Results provide initial support for the construct validity of the MSCEIT-YV but also…
Testing for the validity of purchasing power parity theory both in the long-run and the short-run for ASEAN-5

NASA Astrophysics Data System (ADS)

Choji, Niri Martha; Sek, Siok Kun

2017-11-01

The purchasing power parity theory says that the exchange rate between two nations ought to equal the ratio of the aggregate price levels of the two nations. For more than a decade, there has been substantial interest in testing the validity of Purchasing Power Parity (PPP) empirically. This paper performs a series of tests to see if PPP is valid for the ASEAN-5 nations for the period 2000-2016 using monthly data. For this purpose, we conducted four different tests of stationarity, two cointegration tests (Pedroni and Westerlund), and also a VAR model. The stationarity (unit root) tests reveal that the variables are not stationary at levels but are stationary at first difference. The cointegration test results did not reject the H0 of no cointegration, implying the absence of a long-run association among the variables, and the results of the VAR model did not reveal a strong short-run relationship. Based on the data, we therefore conclude that PPP is not valid in the long run or the short run for ASEAN-5 during 2000-2016.
Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project

PubMed Central

2011-01-01

Background Insight into children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. Objective To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. Methods We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Results Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Conclusions Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items. PMID:22152048
Agility performance in high-level junior basketball players: the predictive value of anthropometrics and power qualities.

PubMed

Sisic, Nedim; Jelicic, Mario; Pehar, Miran; Spasic, Miodrag; Sekulic, Damir

2016-01-01

In basketball, anthropometric status is an important factor when identifying and selecting talents, while agility is one of the most vital motor performances. The aim of this investigation was to evaluate the influence of anthropometric variables and power capacities on different preplanned agility performances. The participants were 92 high-level, junior-age basketball players (16-17 years of age; 187.6±8.72 cm in body height, 78.40±12.26 kg in body mass), randomly divided into a validation and a cross-validation subsample. The predictor set consisted of 16 anthropometric variables and three tests of power capacities (Sargent-jump, broad-jump and medicine-ball-throw). The criteria were three tests of agility: a T-Shape-Test, a Zig-Zag-Test, and a test of running with a 180-degree turn (T180). Forward stepwise multiple regressions were calculated for the validation subsample and then cross-validated. Cross-validation included correlations between observed and predicted scores, a dependent-samples t-test between predicted and observed scores, and Bland-Altman graphics. Analysis of variance identified centres as being advanced in most of the anthropometric indices and in medicine-ball-throw (all at P<0.05), with no significant between-position differences for the other studied motor performances. Multiple regression models originally calculated for the validation subsample were then cross-validated and confirmed for the Zig-Zag-Test (R of 0.71 and 0.72 for the validation and cross-validation subsample, respectively). Anthropometrics were not strongly related to agility performance, but leg length was found to be negatively associated with performance in basketball-specific agility. Power capacities were confirmed to be an important factor in agility. The results highlighted the importance of sport-specific tests when studying pre-planned agility performance in basketball. The improvement in power capacities will probably result in an improvement in agility in basketball athletes, while anthropometric indices should be used in order to identify those athletes who can achieve superior agility performance.
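The validation and cross-validation procedure described in this abstract can be sketched in a few lines. The sketch below is a simplification under stated assumptions: it swaps the study's forward stepwise multiple regression for a single-predictor least-squares line, splits the sample by alternating index rather than randomly (so the result is reproducible), and uses entirely synthetic numbers. All names are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Synthetic data (not from the study): one predictor, one criterion.
predictor = list(range(20))
noise = [0.2, -0.1, -0.1]
criterion = [5 - 0.1 * x + noise[i % 3] for i, x in enumerate(predictor)]

# Split into a validation subsample (fit) and a cross-validation subsample (check).
fit_idx = range(0, 20, 2)
check_idx = range(1, 20, 2)

slope, intercept = fit_line(
    [predictor[i] for i in fit_idx], [criterion[i] for i in fit_idx]
)

# Apply the fitted model, unchanged, to the held-out subsample.
observed = [criterion[i] for i in check_idx]
predicted = [intercept + slope * predictor[i] for i in check_idx]

r_cross = pearson_r(observed, predicted)  # observed vs predicted correlation
bias = sum(p - o for p, o in zip(predicted, observed)) / len(observed)
print(round(r_cross, 2), round(bias, 2))
```

A model that holds up shows a correlation in the held-out subsample close to the one in the fitting subsample, together with a mean predicted-minus-observed difference near zero; the latter is the quantity that a dependent-samples t-test or a Bland-Altman plot examines.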
Validation of a Video-based Game-Understanding Test Procedure in Badminton.

ERIC Educational Resources Information Center

Blomqvist, Minna T.; Luhtanen, Pekka; Laakso, Lauri; Keskinen, Esko

2000-01-01

Reports the development and validation of video-based game-understanding tests in badminton for elementary and secondary students. The tests included different sequences that simulated actual game situations. Players had to solve tactical problems by selecting appropriate solutions and arguments for their decisions. Results suggest that the test…
Development and Validation of a Test for Bulimia.

ERIC Educational Resources Information Center

Smith, Marcia C.; Thelen, Mark H.

1984-01-01

Developed the Bulimia Test (BULIT) based on responses of clinically identified females (N=18) and normal female college students (N=119) to preliminary test items. Results showed that the BULIT provided an objective, reliable, and valid measure by which to identify individuals with symptoms of bulimia. (Instrument is appended.) (LLL)
Validity and reliability of the NAB Naming Test.

PubMed

Sachs, Bonnie C; Rush, Beth K; Pedraza, Otto

2016-05-01

Confrontation naming is commonly assessed in neuropsychological practice, but few standardized measures of naming exist and those that do are susceptible to the effects of education and culture. The Neuropsychological Assessment Battery (NAB) Naming Test is a 31-item measure used to assess confrontation naming. Despite adequate psychometric information provided by the test publisher, there has been limited independent validation of the test. In this study, we investigated the convergent and discriminant validity, internal consistency, and alternate forms reliability of the NAB Naming Test in a sample of adults (Form 1: n = 247, Form 2: n = 151) clinically referred for neuropsychological evaluation. Results indicate adequate-to-good internal consistency and alternate forms reliability. We also found strong convergent validity as demonstrated by relationships with other neurocognitive measures. We found preliminary evidence that the NAB Naming Test demonstrates a more pronounced ceiling effect than other commonly used measures of naming. To our knowledge, this represents the largest published independent validation study of the NAB Naming Test in a clinical sample. Our findings suggest that the NAB Naming Test demonstrates adequate validity and reliability and merits consideration in the test arsenal of clinical neuropsychologists.
Developing self-concept instrument for pre-service mathematics teachers

NASA Astrophysics Data System (ADS)

Afgani, M. W.; Suryadi, D.; Dahlan, J. A.

2018-01-01

This study aimed to develop a self-concept instrument for undergraduate students of mathematics education in Palembang, Indonesia. The study was development research on a non-test instrument in questionnaire form. The validity of the instrument was assessed with a construct validity test using the Pearson product-moment correlation and factor analysis, while the reliability test used Cronbach’s alpha. The instrument was administered to 65 undergraduate students of mathematics education at one of the universities in Palembang, Indonesia. The instrument consisted of 43 items covering 7 aspects of self-concept: individual concern, social identity, individual personality, view of the future, the influence of others who become role models, the influence of the environment inside or outside the classroom, and view of mathematics. The validity test showed one invalid item, because its Pearson’s r of 0.107 was less than the critical value (0.244; α = 0.05). The item belonged to the social identity aspect. After the invalid item was removed, the construct validity test with factor analysis generated only one factor. The Kaiser-Meyer-Olkin (KMO) coefficient was 0.846 and the reliability coefficient was 0.91. From that result, we concluded that the self-concept instrument for undergraduate students of mathematics education in Palembang, Indonesia was valid and reliable with 42 items.
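The critical value of 0.244 quoted above is consistent with the usual conversion from a t critical value to a critical correlation, r = t / sqrt(t² + df) with df = n − 2. The abstract does not say how the value was obtained, so the sketch below is an assumption: it takes a two-tailed test at α = 0.05 with n = 65, and hard-codes t ≈ 1.998 for 63 degrees of freedom as read from a standard t table. The function name is hypothetical.

```python
from math import sqrt


def critical_r(t_critical, n):
    """Smallest |r| that is significant, given the t critical value and sample size."""
    df = n - 2
    return t_critical / sqrt(t_critical ** 2 + df)


# Assumed table value: two-tailed t critical value, alpha = 0.05, df = 63.
T_CRIT_DF63 = 1.998

r_crit = critical_r(T_CRIT_DF63, 65)
item_r = 0.107  # Pearson's r of the rejected item, as reported in the abstract
item_is_valid = item_r >= r_crit
print(round(r_crit, 3), item_is_valid)
```

Under these assumptions the threshold rounds to 0.244, and an item with r = 0.107 falls below it, which matches the decision reported in the abstract.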

Validation of an Active Gear, Flexible Aircraft Take-off and Landing analysis (AGFATL)

NASA Technical Reports Server (NTRS)

Mcgehee, J. R.

1984-01-01

The results of an analytical investigation using a computer program for active gear, flexible aircraft take off and landing analysis (AGFATL) are compared with experimental data from shaker tests, drop tests, and simulated landing tests to validate the AGFATL computer program. Comparison of experimental and analytical responses for both passive and active gears indicates good agreement for shaker tests and drop tests. For the simulated landing tests, the passive and active gears were influenced by large strut binding friction forces. The inclusion of these undefined forces in the analytical simulations was difficult, and consequently only fair to good agreement was obtained. An assessment of the results from the investigation indicates that the AGFATL computer program is a valid tool for the study and initial design of series hydraulic active control landing gear systems.
Contemporary Test Validity in Theory and Practice: A Primer for Discipline-Based Education Researchers

PubMed Central

Reeves, Todd D.; Marbach-Ad, Gili

2016-01-01

Most discipline-based education researchers (DBERs) were formally trained in the methods of scientific disciplines such as biology, chemistry, and physics, rather than social science disciplines such as psychology and education. As a result, DBERs may have never taken specific courses in the social science research methodology—either quantitative or qualitative—on which their scholarship often relies so heavily. One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process. The instructional piece explores the concepts of test validity, the validation process, validity evidence, and key threats to validity. The essay also includes an in-depth example of a validity argument and validation approach for a test of student argument analysis. In addition to DBERs, this essay should benefit practitioners (e.g., lab directors, faculty members) in the development, evaluation, and/or selection of instruments for their work assessing students or evaluating pedagogical innovations. PMID:26903498
Development of Two-Tier Diagnostic Test Pictorial-Based for Identifying High School Students Misconceptions on the Mole Concept

NASA Astrophysics Data System (ADS)

Siswaningsih, W.; Firman, H.; Zackiyah; Khoirunnisa, A.

2017-02-01

The aim of this study was to develop a two-tier pictorial-based diagnostic test for identifying student misconceptions on the mole concept. The study used a development and validation method. The test was developed through four phases: item development, validation, key determination, and test application. The test was developed in pictorial form with two tiers: the first tier consists of four possible answers and the second tier consists of four possible reasons. Based on the content validity results for 20 items using the CVR (Content Validity Ratio), 18 items were declared valid. Based on the results of the reliability test using SPSS, 17 items were obtained with a Cronbach’s alpha value of 0.703, which means that the items were accepted. A total of 10 items were administered to 35 senior high school students who had studied the mole concept at one of the high schools in Cimahi. Based on the results of the test application, student misconceptions were identified for each concept label in the mole concept, with the following percentages of misconceptions: mole (60.15%), Avogadro’s number (34.28%), relative atomic mass (62.84%), relative molecular mass (77.08%), molar mass (68.53%), molar volume of gas (57.11%), molarity (71.32%), chemical equation (82.77%), limiting reactants (91.40%), and molecular formula (77.13%).
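The Content Validity Ratio mentioned above is commonly computed with Lawshe's formula, CVR = (n_e − N/2) / (N/2), where N is the number of expert panelists and n_e the number who rate an item as essential. The abstract gives neither the formula nor the panel size, so the sketch below assumes Lawshe's definition and uses a hypothetical panel of 10 experts.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: ranges from -1 (no one says essential) to +1 (everyone does)."""
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 experts rating three items.
for n_essential in (10, 8, 5):
    print(n_essential, content_validity_ratio(n_essential, 10))
```

An item is usually retained when its CVR exceeds a tabulated minimum that depends on the panel size; that cutoff is not reproduced here because the study's panel size is not stated.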
Characterizing the GOES-R (GOES-16) Geostationary Lightning Mapper (GLM) On-Orbit Performance

NASA Technical Reports Server (NTRS)

Rudlosky, Scott D.; Goodman, Steven J.; Koshak, William J.; Blakeslee, Richard J.; Buechler, Dennis E.; Mach, Douglas M.; Bateman, Monte

2017-01-01

Two overlapping efforts help to characterize the GLM performance, the Post Launch Test (PLT) phase to validate the predicted pre-launch instrument performance and the Post Launch Product Test (PLPT) phase to validate the lightning detection product used in forecast and warning decision-making. This paper documents the calibration and validation plans and activities for the first 6 months of GLM on-orbit testing and validation commencing with first light on 4 January 2017. The PLT phase addresses image quality, on-orbit calibration, RTEP threshold tuning, image navigation, noise filtering, and solar intrusion assessment, resulting in a GLM calibration parameter file. The PLPT includes four main activities, the Reference Data Comparisons (RDC), Algorithm Testing (AT), Instrument Navigation and Registration Testing (INRT), and Long Term Baseline Testing (LTBT). Field campaigns are also designed to contribute valuable insights into the GLM performance capabilities. The PLPT tests each contribute to the beta, provisional, and fully validated GLM data.
Validation of the Simple Shoulder Test in a Portuguese-Brazilian Population. Is the Latent Variable Structure and Validation of the Simple Shoulder Test Stable across Cultures?

PubMed Central

Neto, Jose Osni Bruggemann; Gesser, Rafael Lehmkuhl; Steglich, Valdir; Bonilauri Ferreira, Ana Paula; Gandhi, Mihir; Vissoci, João Ricardo Nickenig; Pietrobon, Ricardo

2013-01-01

Background The validation of widely used scales facilitates the comparison across international patient samples. Objective The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. We also tested the stability of factor analysis across different cultures. Methods The Simple Shoulder Test was translated from English into Brazilian Portuguese, translated back into English, and evaluated for accuracy by an expert committee. It was then administered to 100 patients with shoulder conditions. Psychometric properties were analyzed including factor analysis, internal reliability, test-retest reliability at seven days, and construct validity in relation to the Short Form 36 health survey (SF-36). Results Factor analysis demonstrated a three factor solution. Cronbach’s alpha was 0.82. Test-retest reliability index as measured by intra-class correlation coefficient (ICC) was 0.84. Associations were observed in the hypothesized direction with all subscales of SF-36 questionnaire. Conclusion The Simple Shoulder Test translation and cultural adaptation to Brazilian-Portuguese demonstrated adequate factor structure, internal reliability, and validity, ultimately allowing for its use in the comparison with international patient samples. PMID:23675436
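Internal reliability is reported here as Cronbach's alpha. As a rough illustration of how that statistic is computed, the sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is made up for illustration and is not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every respondent needs the same k >= 2 item scores")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical yes/no (1/0) answers from five respondents to three items.
responses = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

When all items rank respondents identically, the formula returns 1; weakly related items push it toward 0.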
Item validity vs. item discrimination index: a redundancy?

NASA Astrophysics Data System (ADS)

Panjaitan, R. L.; Irawati, R.; Sujana, A.; Hanifah, N.; Djuanda, D.

2018-03-01

In the literature on evaluation and test analysis, it is common to find calculations of both item validity and the item discrimination index (D), each with a different formula. Meanwhile, other sources state that the item discrimination index can be obtained by calculating the correlation between the testee’s score on a particular item and the testee’s score on the overall test, which is actually the same concept as item validity. Some research reports, especially undergraduate theses, tend to include both item validity and the item discrimination index in the instrument analysis. These concepts might overlap, since both reflect the quality of the test in measuring the examinees’ ability. In this paper, examples of data processing results for item validity and the item discrimination index are compared. The paper discusses whether one of the two can represent both, or whether it is better to present both calculations in simple test analyses, especially in undergraduate theses where test analyses are included.
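The two quantities compared in this abstract can be placed side by side in a short sketch. It assumes the upper-lower group form of the discrimination index (D = proportion correct in the top group minus proportion correct in the bottom group, using 27% groups) and the Pearson item-total correlation as item validity. The abstract does not give the paper's exact formulas, and the scores below are invented.

```python
from statistics import mean, pstdev


def discrimination_index(item, totals, fraction=0.27):
    """D = p_upper - p_lower for the top and bottom `fraction` of examinees."""
    n = len(totals)
    group = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: totals[i])  # ascending by total score
    lower, upper = order[:group], order[-group:]
    return mean(item[i] for i in upper) - mean(item[i] for i in lower)


def item_total_correlation(item, totals):
    """Pearson correlation between a 0/1 item score and the total test score."""
    m_item, m_total = mean(item), mean(totals)
    cov = mean((a - m_item) * (b - m_total) for a, b in zip(item, totals))
    return cov / (pstdev(item) * pstdev(totals))


# Hypothetical data: ten examinees, one dichotomous item, total test scores.
item_scores = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
total_scores = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]

d_value = discrimination_index(item_scores, total_scores)
r_value = item_total_correlation(item_scores, total_scores)
print(f"D = {d_value:.2f}, item-total r = {r_value:.2f}")
```

Both statistics are positive when stronger examinees answer the item correctly more often, which is the overlap the abstract points to. Some analyses exclude the item itself from the total score before correlating; that correction is left out here.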
Pretest information for a test to validate plume simulation procedures (FA-17)

NASA Technical Reports Server (NTRS)

Hair, L. M.

1978-01-01

The results of an effort to plan a final verification wind tunnel test to validate the recommended correlation parameters and application techniques were presented. The test planning effort was complete except for test site finalization and the associated coordination. Two suitable test sites were identified. Desired test conditions were shown. Subsequent sections of this report present the selected model and test site, instrumentation of this model, planned test operations, and some concluding remarks.
Validation of a clinical critical thinking skills test in nursing

PubMed Central

2015-01-01

Purpose: The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. Methods: This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Results: Out of the initial 30 items, 11 items were excluded after the analysis of difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. Conclusion: From the above results, evidence of the response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability. PMID:25622716
Ada (Tradename) Compiler Validation Summary Report. Harris Corporation. HARRIS Ada Compiler, Version 1.0. Harris H1200 and H800.

DTIC Science & Technology

This Validations Summary Report (VSR) summarizes the results and conclusions of validation testing performed on the HARRIS Ada Compiler, Version 1.0...at compile time, at link time, or during execution. On-site testing was performed 28 APR 1986 through 30 APR 1986 at Harris Corporation, Ft. Lauderdale
Using the Rasch analysis for the psychometric validation of the Irregular Word Reading Test (TeLPI): A Portuguese test for the assessment of premorbid intelligence.

PubMed

Freitas, Sandra; Prieto, Gerardo; Simões, Mário R; Nogueira, Joana; Santana, Isabel; Martins, Cristina; Alves, Lara

2018-05-03

The present study aims to analyze the psychometric characteristics of the TeLPI (Irregular Words Reading Test), a Portuguese premorbid intelligence test, using the Rasch model for dichotomous items. The results reveal an overall adequacy and a good fit of values regarding both items and persons. A high variability of cognitive performance level and a good quality of the measurements were also found. The TeLPI has proved to be a unidimensional measure with reduced DIF effects. The present findings contribute to overcome an important gap in the psychometric validity of this instrument and provide good evidence of the overall psychometric validity of TeLPI results.
The Validity and Reliability of the Back Saver Sit-and-Reach Test in Middle School Girls and Boys.

ERIC Educational Resources Information Center

Patterson, Patricia; And Others

1996-01-01

This study examined the validity and reliability of the Back Saver Sit-and-Reach test for middle school students. Students completed the test during physical education class. Results indicated that the test was moderately related to hamstring flexibility, but its relationship to lower back flexibility was quite low for both sexes. (SM)
Finding Kids with Special Needs: the Background, Development, Field Test and Validation.

ERIC Educational Resources Information Center

Resource Management Systems, Inc., Carmel, CA.

Described are the development of "Finding Kids with Special Needs" (FKSN), an instrument to identify children's learning problems and gifted students; results of field testing with 24,825 children, kindergarten through grade 8, in 110 schools; and validation procedures. Discussed is test construction, including incorporation of 12…
Content Validity Index and Intra- and Inter-Rater Reliability of a New Muscle Strength/Endurance Test Battery for Swedish Soldiers

PubMed Central

Larsson, Helena; Tegern, Matthias; Monnier, Andreas; Skoglund, Jörgen; Helander, Charlotte; Persson, Emelie; Malm, Christer; Broman, Lisbet; Aasa, Ulrika

2015-01-01

The objective of this study was to examine the content validity of commonly used muscle performance tests in military personnel and to investigate the reliability of a proposed test battery. For the content validity investigation, thirty selected tests were those described in the literature and/or commonly used in the Nordic and North Atlantic Treaty Organization (NATO) countries. Nine selected experts rated, on a four-point Likert scale, the relevance of these tests in relation to five different work tasks: lifting, carrying equipment on the body or in the hands, climbing, and digging. Thereafter, a content validity index (CVI) was calculated for each work task. The result showed excellent CVI (≥0.78) for sixteen tests, which comprised of one or more of the military work tasks. Three of the tests; the functional lower-limb loading test (the Ranger test), dead-lift with kettlebells, and back extension, showed excellent content validity for four of the work tasks. For the development of a new muscle strength/endurance test battery, these three tests were further supplemented with two other tests, namely, the chins and side-bridge test. The inter-rater reliability was high (intraclass correlation coefficient, ICC2,1 0.99) for all five tests. The intra-rater reliability was good to high (ICC3,1 0.82–0.96) with an acceptable standard error of mean (SEM), except for the side-bridge test (SEM%>15). Thus, the final suggested test battery for a valid and reliable evaluation of soldiers’ muscle performance comprised the following four tests; the Ranger test, dead-lift with kettlebells, chins, and back extension test. The criterion-related validity of the test battery should be further evaluated for soldiers exposed to varying physical workload. PMID:26177030
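The content validity index described above can be illustrated with a small sketch. It assumes the common item-level definition, in which the CVI is the share of experts who rate a test 3 or 4 on the four-point relevance scale, compared against the 0.78 cutoff quoted in the abstract. The ratings are hypothetical and the test names are used only as labels.

```python
EXCELLENT_CVI = 0.78  # cutoff quoted in the abstract


def content_validity_index(ratings, relevant_min=3):
    """Share of expert ratings at or above `relevant_min` on a 1-4 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(r >= relevant_min for r in ratings) / len(ratings)


# Hypothetical relevance ratings from nine experts for one work task.
ratings_by_test = {
    "dead-lift with kettlebells": [4, 4, 3, 4, 3, 4, 4, 2, 3],
    "side-bridge": [4, 3, 2, 2, 3, 4, 1, 2, 3],
}

cvi_by_test = {
    name: content_validity_index(r) for name, r in ratings_by_test.items()
}
for name, cvi in cvi_by_test.items():
    verdict = "excellent" if cvi >= EXCELLENT_CVI else "below cutoff"
    print(f"{name}: CVI = {cvi:.2f} ({verdict})")
```

With nine raters, 7 of 9 gives 0.778, which only reaches 0.78 after rounding; the abstract does not say which convention the authors applied, so the sketch avoids that edge case.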
Development of diagnostic test instruments to reveal level student conception in kinematic and dynamics

NASA Astrophysics Data System (ADS)

Handhika, J.; Cari, C.; Suparmi, A.; Sunarno, W.; Purwandari, P.

2018-03-01

The purpose of this research was to develop a diagnostic test instrument to reveal students' conceptions in kinematics and dynamics. The diagnostic test was developed based on content indicators for the concepts of (1) displacement and distance, (2) instantaneous and average velocity, (3) zero and constant acceleration, (4) gravitational acceleration, (5) Newton's first law, and (6) Newton's third law. The diagnostic test development model includes: diagnostic test requirement analysis, formulating test-making objectives, developing tests, checking content validity and reliability, and application of the tests. The Content Validation Index (CVI) was 0.85, in the highly relevant category. Three questions received a negative Content Validation Ratio (CVR) of -0.6; after the distractors were revised and the visual presentation was clarified, the CVR became 1 (highly relevant). When the test was applied, 16 valid test items were obtained, with a Cronbach Alpha value of 0.80. It can be concluded that the diagnostic test can be used to reveal the level of students' conceptions in kinematics and dynamics.
Assessing Discriminative Performance at External Validation of Clinical Prediction Models

PubMed Central

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.

2016-01-01

Introduction External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
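Discrimination in this abstract is quantified by the c-statistic. The sketch below shows only that statistic for a binary outcome: the fraction of (event, non-event) pairs in which the event received the higher predicted risk, counting ties as one half. It does not reproduce the permutation test or the benchmark framework, whose details are not given in the abstract, and the risks and outcomes are invented.

```python
def c_statistic(risks, outcomes):
    """Concordance (c) statistic for predicted risks against 0/1 outcomes."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(nonevents))


# Hypothetical predicted risks and observed outcomes in a validation set.
predicted_risk = [0.10, 0.40, 0.35, 0.80]
observed = [0, 0, 1, 1]

c_value = c_statistic(predicted_risk, observed)
print(f"c-statistic = {c_value:.2f}")
```

Because the statistic depends on how spread out the predicted risks are, the same model can show a different c-statistic in a population with a narrower or wider case-mix, which is the interpretive issue the abstract raises.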
The Reliability, Validity, and Normative Data of Interpupillary Distance and Pupil Diameter Using Eye-Tracking Technology

PubMed Central

Murray, Nicholas P.; Hunfalvay, Melissa; Bolte, Takumi

2017-01-01

Purpose The purpose of this study was to determine the reliability of interpupillary distance (IPD) and pupil diameter (PD) measures using an infrared eye tracker and central point stimuli. Validity of the test compared to known clinical tools was determined, and normative data was established against which individuals can measure themselves. Methods Participants (416) across various demographics were examined for normative data. Of these, 50 were examined for reliability and validity. Validity for IPD measured the test (RightEye IPD/PD) against the PL850 Pupilometer and the Essilor Digital CRP. For PD, the test was measured against the Rosenbaum Pocket Vision Screener (RPVS). Reliability was analyzed with intraclass correlation coefficients (ICC) between trials with Cronbach's alpha (CA) and the standard error of measurement for each ICC. Convergent validity was investigated by calculating the bivariate correlation coefficient. Results Reliability results were strong (CA > 0.7) for all measures. High positive significant correlations were found between the RightEye IPD test and the PL850 Pupilometer (P < 0.001) and Essilor Digital CRP (P < 0.001) and for the RightEye PD test and the RPVS (P < 0.001). Conclusions Using infrared eye tracking and the RightEye IPD/PD test stimuli, reliable and accurate measures of IPD and PD were found. Results from normative data showed an adequate comparison for people with normal vision development. Translational Relevance Results revealed a central point of fixation may remove variability in examining PD reliably using infrared eye tracking when consistent environmental and experimental procedures are conducted. PMID:28685104
Psychometric Evaluation of the Revised Michigan Diabetes Knowledge Test (V.2016) in Arabic: Translation and Validation

PubMed Central

Alhaiti, Ali Hassan; Alotaibi, Alanod Raffa; Jones, Linda Katherine; DaCosta, Cliff

2016-01-01

Objective. To translate the revised Michigan Diabetes Knowledge Test into the Arabic language and examine its psychometric properties. Setting. Of the 139 participants recruited through King Fahad Medical City in Riyadh, Saudi Arabia, 34 agreed to the second-round sample for retesting purposes. Methods. The translation process followed the World Health Organization's guidelines for the translation and adaptation of instruments. All translations were examined for their validity and reliability. Results. The translation process revealed excellent results throughout all stages. The Arabic version received 0.75 for internal consistency via Cronbach's alpha test and excellent outcomes in terms of the test-retest reliability of the instrument, with a mean intraclass correlation coefficient of 0.90. It also received positive content validity index scores. The item-level content validity index for all instrument scales fell between 0.83 and 1 with a mean scale-level index of 0.96. Conclusion. The Arabic version is proven to be a reliable and valid measure of patients' knowledge that is ready to be used in clinical practice. PMID:27995149
Validation of 2 commercial Neospora caninum antibody enzyme linked immunosorbent assays

PubMed Central

Wu, John T.Y.; Dreger, Sally; Chow, Eva Y.W.; Bowlby, Evelyn E.

2002-01-01

Abstract This is a validation study of 2 commercially available enzyme linked immunosorbent assays (ELISA) for the detection of antibodies against Neospora caninum in bovine serum. The results of the reference sera (n = 30) and field sera from an infected beef herd (n = 150) were tested by both ELISAs and the results were compared statistically. When the immunoblotting results of the reference bovine sera were compared to the ELISA results, the same identity score (96.67%) and kappa values (K) (0.93) were obtained for both ELISAs. The sensitivity and specificity values for the IDEXX test were 100% and 93.33% respectively. For the Biovet test 93.33% and 100% were obtained. The corresponding positive (PV+) and negative predictive (PV−) values for the 2 assays were 93.75% and 100% (IDEXX), and 100% and 93.75% (Biovet). In the 2nd study, competitive inhibition ELISA (c-ELISA) results on bovine sera from an infected herd were compared to the 2 sets of ELISA results. The identity scores of the 2 ELISAs were 98% (IDEXX) and 97.33% (Biovet). The K values calculated were 0.96 (IDEXX) and 0.95 (Biovet). For the IDEXX test the sensitivity and specificity were 97.56% and 98.53%, whereas for the Biovet assay 95.12% and 100% were recorded, respectively. The corresponding PV+ and PV− values were 98.77% and 97.1% (IDEXX), and 100% and 94.44% (Biovet). Our validation results showed that the 2 ELISAs worked equally well and there was no statistically significant difference between the performance of the 2 tests. Both tests showed high reproducibility, repeatability and substantial agreement with results from 2 other laboratories. A quality assurance based on the requirement of the ISO/IEC 17025 standards has been adopted throughout this project for test validation procedures. PMID:12418782
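The reference-sera figures above can be reproduced from a 2x2 table. The abstract does not report the underlying counts, so the sketch assumes 15 positive and 15 negative reference sera (n = 30), with one false positive for IDEXX and one false negative for Biovet; these reconstructed counts are an assumption chosen because they match the reported percentages.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of test result vs reference result."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals (Cohen's kappa).
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "pv_pos": tp / (tp + fp),
        "pv_neg": tn / (tn + fn),
        "identity": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Assumed counts (not stated in the abstract), immunoblot as the reference.
idexx = diagnostic_summary(tp=15, fp=1, fn=0, tn=14)
biovet = diagnostic_summary(tp=14, fp=0, fn=1, tn=15)

for name, result in (("IDEXX", idexx), ("Biovet", biovet)):
    shown = ", ".join(f"{key}={value:.4f}" for key, value in result.items())
    print(f"{name}: {shown}")
```

Under these assumed counts both assays give an identity score of 29/30 and the same kappa, which is consistent with the abstract reporting identical values for the two ELISAs.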
Economic analysis of model validation for a challenge problem

DOE PAGES

Paez, Paul J.; Paez, Thomas L.; Hasselman, Timothy K.

2016-02-19

It is now commonplace for engineers to build mathematical models of the systems they are designing, building, or testing. And, it is nearly universally accepted that phenomenological models of physical systems must be validated prior to use for prediction in consequential scenarios. Yet, there are certain situations in which testing only or no testing and no modeling may be economically viable alternatives to modeling and its associated testing. This paper develops an economic framework within which benefit–cost can be evaluated for modeling and model validation relative to other options. The development is presented in terms of a challenge problem. As a result, we provide a numerical example that quantifies when modeling, calibration, and validation yield higher benefit–cost than a testing only or no modeling and no testing option.
Evaluation of the methodological quality of studies of the performance of diagnostic tests for bovine tuberculosis using QUADAS.

PubMed

Downs, Sara H; More, Simon J; Goodchild, Anthony V; Whelan, Adam O; Abernethy, Darrell A; Broughan, Jennifer M; Cameron, Angus; Cook, Alasdair J; Ricardo de la Rua-Domenech, R; Greiner, Matthias; Gunn, Jane; Nuñez-Garcia, Javier; Rhodes, Shelley; Rolfe, Simon; Sharp, Michael; Upton, Paul; Watson, Eamon; Welsh, Michael; Woolliams, John A; Clifton-Hadley, Richard S; Parry, Jessica E

2018-05-01

There has been little assessment of the methodological quality of studies measuring the performance (sensitivity and/or specificity) of diagnostic tests for animal diseases. In a systematic review, 190 studies of tests for bovine tuberculosis (bTB) in cattle (published 1934-2009) were assessed by at least one of 18 reviewers using the QUADAS (Quality Assessment of Diagnostic Accuracy Studies) checklist adapted for animal disease tests. VETQUADAS (VQ) included items measuring clarity in reporting (n = 3), internal validity (n = 9) and external validity (n = 2). A similar pattern for compliance was observed in studies of different diagnostic test types. Compliance significantly improved with year of publication for all items measuring clarity in reporting and external validity but only improved in four of the nine items measuring internal validity (p < 0.05). 107 references, of which 83 had performance data eligible for inclusion in a meta-analysis were reviewed by two reviewers. In these references, agreement between reviewers' responses was 71% for compliance, 32% for unsure and 29% for non-compliance. Mean compliance with reporting items was 2, 5.2 for internal validity and 1.5 for external validity. The index test result was described in sufficient detail in 80.1% of studies and was interpreted without knowledge of the reference standard test result in only 33.1%. Loss to follow-up was adequately explained in only 31.1% of studies. The prevalence of deficiencies observed may be due to inadequate reporting but may also reflect lack of attention to methodological issues that could bias the results of diagnostic test performance estimates. QUADAS was a useful tool for assessing and comparing the quality of studies measuring the performance of diagnostic tests but might be improved further by including explicit assessment of population sampling strategy. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.

Calibration and validation of a spar-type floating offshore wind turbine model using the FAST dynamic simulation tool

DOE PAGES

Browning, J. R.; Jonkman, J.; Robertson, A.; ...

2014-12-16

In this study, high-quality computer simulations are required when designing floating wind turbines because of the complex dynamic responses that are inherent with a high number of degrees of freedom and variable metocean conditions. In 2007, the FAST wind turbine simulation tool, developed and maintained by the U.S. Department of Energy's (DOE's) National Renewable Energy Laboratory (NREL), was expanded to include capabilities that are suitable for modeling floating offshore wind turbines. In an effort to validate FAST and other offshore wind energy modeling tools, DOE funded the DeepCwind project that tested three prototype floating wind turbines at 1/50th scale in a wave basin, including a semisubmersible, a tension-leg platform, and a spar buoy. This paper describes the use of the results of the spar wave basin tests to calibrate and validate the FAST offshore floating simulation tool, and presents some initial results of simulated dynamic responses of the spar to several combinations of wind and sea states. Wave basin tests with the spar attached to a scale model of the NREL 5-megawatt reference wind turbine were performed at the Maritime Research Institute Netherlands under the DeepCwind project. This project included free-decay tests, tests with steady or turbulent wind and still water (both periodic and irregular waves with no wind), and combined wind/wave tests. The resulting data from the 1/50th model was scaled using Froude scaling to full size and used to calibrate and validate a full-size simulated model in FAST. Results of the model calibration and validation include successes, subtleties, and limitations of both wave basin testing and FAST modeling capabilities.
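The Froude scaling step mentioned in the abstract can be sketched as follows. The exponents are the standard Froude similitude factors for a geometric scale ratio of 50, assuming the same fluid density at model and full scale. The quantity names and sample measurements are illustrative and are not values from the DeepCwind tests.

```python
import math

SCALE_RATIO = 50.0  # full-scale length / model length, from the 1/50th model

# Exponent applied to the scale ratio for each quantity under Froude scaling,
# assuming equal fluid density at both scales (an assumption of this sketch).
FROUDE_EXPONENTS = {
    "length": 1.0,
    "time": 0.5,
    "velocity": 0.5,
    "acceleration": 0.0,
    "mass": 3.0,
    "force": 3.0,
    "power": 3.5,
}


def to_full_scale(value, quantity, scale=SCALE_RATIO):
    """Convert a model-scale measurement to full scale with Froude scaling."""
    return value * scale ** FROUDE_EXPONENTS[quantity]


# Illustrative model-scale measurements (not data from the study).
wave_height_full = to_full_scale(0.04, "length")   # 0.04 m wave in the basin
wave_period_full = to_full_scale(1.4, "time")      # 1.4 s period in the basin
mooring_force_full = to_full_scale(2.0, "force")   # 2 N measured on the model

print(f"wave height: {wave_height_full:.1f} m")
print(f"wave period: {wave_period_full:.1f} s")
print(f"mooring force: {mooring_force_full / 1e3:.0f} kN")
```

Froude scaling preserves the ratio of inertial to gravitational forces, so Reynolds-dependent effects such as viscous drag and rotor aerodynamics are not scaled correctly, which is one source of the limitations the abstract mentions.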
Translation, Cultural Adaptation and Validation of the Simple Shoulder Test to Spanish

PubMed Central

Arcuri, Francisco; Barclay, Fernando; Nacul, Ivan

2015-01-01

Background: The validation of widely used scales facilitates the comparison across international patient samples. Objective: The objective was to translate, culturally adapt and validate the Simple Shoulder Test into Argentinian Spanish. Methods: The Simple Shoulder Test was translated from English into Argentinian Spanish by two independent translators, translated back into English and evaluated for accuracy by an expert committee to correct possible discrepancies. It was then administered to 50 patients with different shoulder conditions. Psychometric properties were analyzed, including internal consistency, measured with Cronbach's alpha, and test-retest reliability at 15 days, measured with the intraclass correlation coefficient. Results: The internal consistency was an alpha of 0.808, evaluated as good. The test-retest reliability index as measured by intra-class correlation coefficient (ICC) was 0.835, evaluated as excellent. Conclusion: The Simple Shoulder Test translation and its cultural adaptation to Argentinian-Spanish demonstrated adequate internal reliability and validity, ultimately allowing for its use in the comparison with international patient samples.
Development and validation of the Child Oral Health Impact Profile - Preschool version.

PubMed

Ruff, R R; Sischo, L; Chinn, C H; Broder, H L

2017-09-01

The Child Oral Health Impact Profile (COHIP) is a validated instrument created to measure the oral health-related quality of life of school-aged children. The purpose of this study was to develop and validate a preschool version of the COHIP (COHIP-PS) for children aged 2-5. The COHIP-PS was developed and validated using a multi-stage process consisting of item selection, face validity testing, item impact testing, reliability and validity testing, and factor analysis. A cross-sectional convenience sample of caregivers having children 2-5 years old from four groups completed item clarity and impact forms. Groups were recruited from pediatric health clinics or preschools/daycare centers, speech clinics, dental clinics, or cleft/craniofacial centers. Participants had a variety of oral health-related conditions, including caries, congenital orofacial anomalies, and speech/language deficiencies such as articulation and language disorders. The COHIP-PS was found to have acceptable internal validity (α = 0.71) and high test-retest reliability (0.87), though internal validity was below the accepted threshold for the community sample. While discriminant validity results indicated significant differences across study groups, the overall magnitude of differences was modest. Results from confirmatory factor analyses support the use of a four-factor model consisting of 11 items across oral health, functional well-being, social-emotional well-being, and self-image domains. Quality of life is an integral factor in understanding and assessing children's well-being. The COHIP-PS is a validated oral health-related quality of life measure for preschool children with cleft or other oral conditions. Copyright© 2017 Dennis Barber Ltd.
Screening for colon cancer: A test for occult blood.

PubMed

Khakimov, N; Khasanova, G; Ershova, K; Gibadullina, L; Vetkina, T; Lobisheva, G; Chumakova, A

2015-01-01

The relevance of the problem of colorectal cancer (CRC) is evident because of extremely high morbidity and mortality rates, associated with this disease. CRC is mostly diagnosed only at very advanced stages. The reduction of mortality can be achieved by the popularization of screening-methods for early identification of CRC and adenomatous polyps of the colon, which are proved to be precancerous condition. Fecal occult blood test is a well-known method of screening for CRC. The advantages of this method when compared, for example, with colonoscopy are its simplicity and cost-effectiveness.Two techniques are usually used for detection of occult blood in the stool: Hemoccult (Guaiac) test and immunochemical test for hemoglobin. There is no consensus among researchers regarding the validity of these tests for the diagnosis of colorectal cancer. For example, J.S. Mandel (1996) notes 60% sensitivity of Guaiac-test for the detection of the early forms of colorectal cancer, while O.I. Kit (2014) suggets that it is not higher than 30%. There are also various opinions about specificity of these two tests. To review the literature on the validity of the fecal occult blood tests for the diagnosis of CRC. We looked for articles (electronic versions) available for free in the full-text versions, published from June 1, 1990 to December 31, 2014 in Russian or English. The following databases were used for search: E-LIBRARY; Cochrane; MEDLINE; EMBASE; Google search. Only original research papers were analyzed. Literature reviews or systematic reviews were not taken for analyses. 1) use of Guaiac and/or immunochemical fecal occult blood test as screening-tests for the detection of colorectal cancer and/or colon polyps (1 cm or more in diameter) in people older than 45 years; 2) comparing of results with the results of colonoscopy (colonoscopy is counted by majority of the authors as a "gold standard" for the diagnosis of CRC and adenomatous polyps). 
The initial keyword search returned 803 000 results, of which 449 sources were selected. After reading the abstracts, 29 articles that met the inclusion criteria were kept. Ten of these articles were then excluded because they did not contain enough data for extraction or did not include a control group. In the final step, 19 articles were used for meta-analysis. The forest plot and ROC curve, which were constructed from the data of all studies, showed heterogeneity of the data. Additional analyses were performed in subgroups with different diagnoses and various tests. The sensitivity of the Guaiac test for the diagnosis of colorectal cancer varied from 0.13 to 1.00, and its specificity from 0.69 to 0.99. The sensitivity of the immunochemical test for the diagnosis of CRC ranged from 0.42 to 0.94, with specificity ranging from 0.40 to 1.00. The sensitivity of the Guaiac test for the diagnosis of colon polyps was between 0.05 and 0.69, and its specificity from 0.67 to 0.98. The sensitivity of the immunochemical test for the diagnosis of polyps was from 0.24 to 0.75, and its specificity from 0.40 to 0.97. Bivariate analysis of the validity of the Guaiac test and the immunochemical method for the diagnosis of colorectal cancer showed better results for the immunochemical test than for the Guaiac test. The tests showed very similar results when used for the diagnosis of polyposis. Bivariate analysis comparing the validity of the tests for the diagnosis of colorectal cancer versus polyposis demonstrated better results for CRC. Multivariate analysis of the validity of the Guaiac and immunochemical tests for the diagnosis of colorectal cancer and polyps also showed better results for detection of colorectal cancer than of polyps for both tests. At the same time, the highest validity for the diagnosis of CRC was demonstrated for the immunochemical test. 1. The sensitivity of the Guaiac test for occult blood in stool is lower than its specificity. 2. Broad dispersion of the validity characteristics of the fecal occult blood tests was observed. 3. The validity of tests for occult blood was higher when they were used for detection of colorectal cancer than of colon polyposis. 4. The highest validity was demonstrated for the immunochemical test when it was used for colon cancer screening.
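The sensitivity and specificity values in this abstract come from comparing each fecal occult blood test with colonoscopy as the reference standard. As a minimal sketch of that arithmetic, the Python below computes both measures from a 2x2 table. The class name TwoByTwo, the function names, and the counts are illustrative assumptions; none of them is taken from the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a screening test with a reference standard."""
    tp: int  # test positive, disease confirmed by the reference standard
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease confirmed
    tn: int  # test negative, disease absent


def sensitivity(table: TwoByTwo) -> float:
    """Proportion of diseased subjects that the test flags as positive."""
    diseased = table.tp + table.fn
    if diseased == 0:
        raise ValueError("sensitivity is undefined without diseased subjects")
    return table.tp / diseased


def specificity(table: TwoByTwo) -> float:
    """Proportion of disease-free subjects that the test reports as negative."""
    healthy = table.tn + table.fp
    if healthy == 0:
        raise ValueError("specificity is undefined without disease-free subjects")
    return table.tn / healthy


# Hypothetical counts, chosen only to illustrate the calculation.
example = TwoByTwo(tp=30, fp=50, fn=20, tn=900)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
```

With counts like these, a test can miss many cancers (low sensitivity) while still producing few false alarms (high specificity), which is the pattern the first conclusion describes for the Guaiac test.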
A structured interview for the DSM-III personality disorders. A preliminary report.

PubMed

Stangl, D; Pfohl, B; Zimmerman, M; Bowers, W; Corenthal, C

1985-06-01

With few exceptions, published studies fail to indicate that the DSM-III personality disorders can be distinguished from each other with respect to etiology, prognosis, treatment response, or family history. The Structured Interview for the DSM-III Personality Disorders (SIDP) was developed to improve axis II diagnostic reliability, and hence allow validity testing of axis II. Sixty-three subjects were independently rated by two interviewers using the SIDP. The kappa coefficients for interrater agreement reached .70 or higher for histrionic, borderline, and dependent personalities. While it is impossible to separate the validity testing of the SIDP from validity testing of the DSM-III personality criteria themselves, preliminary results from 102 inpatient SIDP interviews suggest some criterion-based validity with respect to standard personality rating scales and some construct validity with respect to the dexamethasone suppression test.
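The abstract reports kappa coefficients for agreement between two interviewers but does not say which variant was used. The sketch below assumes the common Cohen's kappa for two raters and one diagnosis rated present (1) or absent (0). The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same subjects."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of subjects")
    n = len(rater_a)
    # Observed agreement: share of subjects given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters labelled independently at their own base rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for ten subjects (1 = diagnosis present, 0 = absent).
interviewer_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
interviewer_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(interviewer_1, interviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 8 of 10 subjects, yet kappa is well below 0.80 because part of that agreement would be expected by chance.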
The adolescent child health and illness profile. A population-based measure of health.

PubMed

Starfield, B; Riley, A W; Green, B F; Ensminger, M E; Ryan, S A; Kelleher, K; Kim-Harris, S; Johnston, D; Vogel, K

1995-05-01

This study was designed to test the reliability and validity of an instrument to assess adolescent health status. Reliability and validity were examined by administration to adolescents (ages 11-17 years) in eight schools in two urban areas, one area in Appalachia, and one area in the rural South. Integrity of the domains and subdomains and construct validity were tested in all areas. Test/retest stability, criterion validity, and convergent and discriminant validity were tested in the two urban areas. Iterative testing has resulted in the final form of the CHIP-AE (Child Health and Illness Profile-Adolescent Edition) having 6 domains with 20 subdomains. The domains are Discomfort, Disorders, Satisfaction with Health, Achievement (of age-appropriate social roles), Risks, and Resilience. Tested aspects of reliability and validity have achieved acceptable levels for all retained subdomains. The CHIP-AE in its current form is suitable for assessing the health status of populations and subpopulations of adolescents. Evidence from test-retest stability analyses suggests that the CHIP-AE also can be used to assess changes occurring over time or in response to health services interventions targeted at groups of adolescents.
Evaluation of tools used to measure calcium and/or dairy consumption in children and adolescents.

PubMed

Magarey, Anthea; Yaxley, Alison; Markow, Kylie; Baulderstone, Lauren; Miller, Michelle

2014-08-01

To identify and critique tools that assess Ca and/or dairy intake in children to ascertain the most accurate and reliable tools available. A systematic review of the literature was conducted using defined inclusion and exclusion criteria. Articles were included on the basis that they reported on a tool measuring Ca and/or dairy intake in children in Western countries and reported on originally developed tools or tested the validity or reliability of existing tools. Defined criteria for reporting reliability and validity properties were applied. Studies in Western countries. Children. Eighteen papers reporting on two tools that assessed dairy intake, ten that assessed Ca intake and five that assessed both dairy and Ca were identified. An examination of tool testing revealed high reliance on lower-order tests such as correlation and failure to differentiate between statistical and clinically meaningful significance. Only half of the tools were tested for reliability and results indicated that only one Ca tool and one dairy tool were reliable. Validation studies showed acceptable levels of agreement (<100 mg difference) and/or sensitivity (62-83 %) and specificity (55-77 %) in three Ca tools. With reference to the testing methodology and results, no tools were considered both valid and reliable for the assessment of dairy intake and only one tool proved valid and reliable for the assessment of Ca intake. These results clearly indicate the need for development and rigorous testing of tools to assess Ca and/or dairy intake in children and adolescents.
Validity and Reliability of Field-Based Measures for Assessing Movement Skill Competency in Lifelong Physical Activities: A Systematic Review.

PubMed

Hulteen, Ryan M; Lander, Natalie J; Morgan, Philip J; Barnett, Lisa M; Robertson, Samuel J; Lubans, David R

2015-10-01

It has been suggested that young people should develop competence in a variety of 'lifelong physical activities' to ensure that they can be active across the lifespan. The primary aim of this systematic review is to report the methodological properties, validity, reliability, and test duration of field-based measures that assess movement skill competency in lifelong physical activities. A secondary aim was to clearly define those characteristics unique to lifelong physical activities. A search of four electronic databases (Scopus, SPORTDiscus, ProQuest, and PubMed) was conducted between June 2014 and April 2015 with no date restrictions. Studies addressing the validity and/or reliability of lifelong physical activity tests were reviewed. Included articles were required to assess lifelong physical activities using process-oriented measures, as well as report either one type of validity or reliability. Assessment criteria for methodological quality were adapted from a checklist used in a previous review of sport skill outcome assessments. Movement skill assessments for eight different lifelong physical activities (badminton, cycling, dance, golf, racquetball, resistance training, swimming, and tennis) in 17 studies were identified for inclusion. Methodological quality, validity, reliability, and test duration (time to assess a single participant), for each article were assessed. Moderate to excellent reliability results were found in 16 of 17 studies, with 71% reporting inter-rater reliability and 41% reporting intra-rater reliability. Only four studies in this review reported test-retest reliability. Ten studies reported validity results; content validity was cited in 41% of these studies. Construct validity was reported in 24% of studies, while criterion validity was only reported in 12% of studies. Numerous assessments for lifelong physical activities may exist, yet only assessments for eight lifelong physical activities were included in this review. 
Generalizability of results may be more applicable if more heterogeneous samples are used in future research. Moderate to excellent levels of inter- and intra-rater reliability were reported in the majority of studies. However, future work should look to establish test-retest reliability. Validity was less commonly reported than reliability, and further types of validity other than content validity need to be established in future research. Specifically, predictive validity of 'lifelong physical activity' movement skill competency is needed to support the assertion that such activities provide the foundation for a lifetime of activity.
[Comparison of the Wechsler Memory Scale-III and the Spain-Complutense Verbal Learning Test in acquired brain injury: construct validity and ecological validity].

PubMed

Luna-Lario, P; Pena, J; Ojeda, N

2017-04-16

To perform an in-depth examination of the construct validity and the ecological validity of the Wechsler Memory Scale-III (WMS-III) and the Spain-Complutense Verbal Learning Test (TAVEC). The sample consists of 106 adults with acquired brain injury who were treated in the Area of Neuropsychology and Neuropsychiatry of the Complejo Hospitalario de Navarra and displayed memory deficit as the main sequela, measured by means of specific memory tests. The construct validity is determined by examining the tasks required in each test over the basic theoretical models, comparing the performance according to the parameters offered by the tests, contrasting the severity indices of each test and analysing their convergence. The external validity is explored through the correlation between the tests and by using regression models. According to the results obtained, both the WMS-III and the TAVEC have construct validity. The TAVEC is more sensitive and captures not only the deficits in mnemonic consolidation, but also in the executive functions involved in memory. The working memory index of the WMS-III is useful for predicting the return to work at two years after the acquired brain injury, but none of the instruments anticipates the disability and dependence at least six months after the injury. We reflect upon the construct validity of the tests and their insufficient capacity to predict functionality when the sequelae become chronic.
Validating a UAV artificial intelligence control system using an autonomous test case generator

NASA Astrophysics Data System (ADS)

Straub, Jeremy; Huber, Justin

2013-05-01

The validation of safety-critical applications, such as autonomous UAV operations in an environment which may include human actors, is an ill-posed problem. To establish confidence in the autonomous control technology, numerous scenarios must be considered. This paper expands upon previous work, related to autonomous testing of robotic control algorithms in a two-dimensional plane, to evaluate the suitability of similar techniques for validating artificial intelligence control in three dimensions, where a minimum level of airspeed must be maintained. The results of human-conducted testing are compared to this automated testing in terms of error detection, speed and testing cost.
Reliability and validity of the closed kinetic chain upper extremity stability test.

PubMed

Lee, Dong-Rour; Kim, Laurentius Jongsoon

2015-04-01

[Purpose] The purpose of this study was to examine the reliability and validity of the Closed Kinetic Chain Upper Extremity Stability (CKCUES) test. [Subjects and Methods] A sample of 40 subjects (20 males, 20 females) with and without pain in the upper limbs was recruited. The subjects were tested twice, three days apart, to assess the reliability of the CKCUES test. The CKCUES test was performed four times, and the average was calculated using the data of the last 3 tests. To test the validity of the CKCUES test, peak torque of internal/external shoulder rotation was measured using an isokinetic dynamometer and maximum grip strength was measured using a hand dynamometer; Pearson correlation coefficients between these measures and the average values of the CKCUES test were then calculated. [Results] The reliability of the CKCUES test was very high (ICC=0.97). The correlations between the CKCUES test and maximum grip strength (r=0.78-0.79) and the peak torque of internal/external shoulder rotation (r=0.87-0.94) were high, indicating its validity. [Conclusion] The reliability and validity of the CKCUES test were high. The CKCUES test is expected to be used for clinical tests on upper limb stability at low cost.
Coverage of the Test of Memory Malingering, Victoria Symptom Validity Test, and Word Memory Test on the Internet: is test security threatened?

PubMed

Bauer, Lyndsey; McCaffrey, Robert J

2006-01-01

In forensic neuropsychological settings, maintaining test security has become critically important, especially in regard to symptom validity tests (SVTs). Coaching, which can entail providing patients or litigants with information about the cognitive sequelae of head injury, or teaching them test-taking strategies to avoid detection of symptom dissimulation has been examined experimentally in many research studies. Emerging evidence supports that coaching strategies affect psychological and neuropsychological test performance to differing degrees depending on the coaching paradigm and the tests administered. The present study sought to examine Internet coverage of SVTs because it is potentially another source of coaching, or information that is readily available. Google searches were performed on the Test of Memory Malingering, the Victoria Symptom Validity Test, and the Word Memory Test. Results indicated that there is a variable amount of information available about each test that could threaten test security and validity should inappropriately interested parties find it. Steps that could be taken to improve this situation and limitations to this exploration are discussed.
Field assessment of balance in 10 to 14 year old children, reproducibility and validity of the Nintendo Wii board

PubMed Central

2014-01-01

Background Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children’s movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Methods Fifty-four 10–14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Results Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Conclusion Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. 
Concurrent validity of NWB compared with AMTI was satisfactory. Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB. PMID:24913461
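The agreement statistics named in this abstract can be illustrated with a short sketch. It assumes the usual definitions: Bland-Altman limits of agreement as the mean difference plus or minus 1.96 standard deviations of the differences, and Lin's concordance correlation coefficient computed with population (1/n) moments. The function names and path-length values are hypothetical and are not data from the study.

```python
import statistics


def bland_altman_limits(device_a, device_b):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = statistics.mean(diffs)
    spread = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias - 1.96 * spread, bias, bias + 1.96 * spread


def concordance_ccc(device_a, device_b):
    """Lin's concordance correlation coefficient using 1/n moments."""
    n = len(device_a)
    mean_a = sum(device_a) / n
    mean_b = sum(device_b) / n
    var_a = sum((a - mean_a) ** 2 for a in device_a) / n
    var_b = sum((b - mean_b) ** 2 for b in device_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(device_a, device_b)) / n
    # The squared mean difference penalises systematic offset between devices.
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)


# Hypothetical centre-of-pressure path lengths (cm) from two devices.
balance_board = [41.0, 55.5, 38.2, 62.1, 47.3]
force_platform = [43.1, 54.0, 40.0, 60.5, 49.9]
low, bias, high = bland_altman_limits(balance_board, force_platform)
print(f"bias = {bias:.2f} cm, limits of agreement = [{low:.2f}, {high:.2f}] cm")
print(f"CCC = {concordance_ccc(balance_board, force_platform):.3f}")
```

Unlike a plain correlation coefficient, the concordance coefficient drops when one device reads systematically higher than the other, which is why it suits a device-replacement question like this one.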
The Validity of the Modified Sit-and-Reach Test in College-Age Students.

ERIC Educational Resources Information Center

Minkler, Sharin; Patterson, Patricia

1994-01-01

Reports a study that examined the criterion-related validity of the modified sit-and-reach test against criterion measures of hamstring and low back flexibility in college students. Results indicated the modified sit-and-reach test moderately related to hamstring flexibility, but its relation to low back flexibility was low. (SM)
A New Method for Analyzing Content Validity Data Using Multidimensional Scaling

ERIC Educational Resources Information Center

Li, Xueming; Sireci, Stephen G.

2013-01-01

Validity evidence based on test content is of essential importance in educational testing. One source for such evidence is an alignment study, which helps evaluate the congruence between tested objectives and those specified in the curriculum. However, the results of an alignment study do not always sufficiently capture the degree to which a test…
Validation of antibiotic residue tests for dairy goats.

PubMed

Zeng, S S; Hart, S; Escobar, E N; Tesfai, K

1998-03-01

The SNAP test, LacTek test (B-L and CEF), Charm Bacillus stearothermophilus var. calidolactis disk assay (BsDA), and Charm II Tablet Beta-lactam sequential test were validated using antibiotic-fortified and -incurred goat milk following the protocol for test kit validations of the U.S. Food and Drug Administration Center for Veterinary Medicine. SNAP, Charm BsDA, and Charm II Tablet Sequential tests were sensitive and reliable in detecting antibiotic residues in goat milk. All three assays showed greater than 90% sensitivity and specificity at tolerance and detection levels. However, caution should be taken in interpreting test results at detection levels. Because of the high sensitivity of these three tests, false-violative results could be obtained in goat milk containing antibiotic residues below the tolerance level. Goat milk testing positive by these tests must be confirmed using a more sophisticated methodology, such as high-performance liquid chromatography, before the milk is condemned. LacTek B-L test did not detect several antibiotics, including penicillin G, in goat milk at tolerance levels. However, LacTek CEF was excellent in detecting ceftiofur residue in goat milk.
Reliability and validity of the Spanish Language Wechsler Adult Intelligence Scale (3rd Edition) in a sample of American, urban, Spanish-speaking Hispanics.

PubMed

Renteria, Laura; Li, Susan Tinsley; Pliskin, Neil H

2008-05-01

The utility of the Spanish WAIS-III was investigated by examining its reliability and validity among 100 Spanish-speaking participants. Results indicated that the internal consistency of the subtests was satisfactory, but inadequate for Letter Number Sequencing. Criterion validity was adequate. Convergent and discriminant validity results were generally similar to the North American normative sample. Paired sample t-tests suggested that the WAIS-III may underestimate ability when compared to the criterion measures that were utilized to assess validity. This study provides support for the use of the Spanish WAIS-III in urban Hispanic populations, but also suggests that caution be used when administering specific subtests, due to the nature of the Latin America alphabet and potential test bias.
Validation of Metagenomic Next-Generation Sequencing Tests for Universal Pathogen Detection.

PubMed

Schlaberg, Robert; Chiu, Charles Y; Miller, Steve; Procop, Gary W; Weinstock, George

2017-06-01

Metagenomic sequencing can be used for detection of any pathogens using unbiased, shotgun next-generation sequencing (NGS), without the need for sequence-specific amplification. Proof-of-concept has been demonstrated in infectious disease outbreaks of unknown causes and in patients with suspected infections but negative results for conventional tests. Metagenomic NGS tests hold great promise to improve infectious disease diagnostics, especially in immunocompromised and critically ill patients. To discuss challenges and provide example solutions for validating metagenomic pathogen detection tests in clinical laboratories. A summary of current regulatory requirements, largely based on prior guidance for NGS testing in constitutional genetics and oncology, is provided. Examples from 2 separate validation studies are provided for steps from assay design, and validation of wet bench and bioinformatics protocols, to quality control and assurance. Although laboratory and data analysis workflows are still complex, metagenomic NGS tests for infectious diseases are increasingly being validated in clinical laboratories. Many parallels exist to NGS tests in other fields. Nevertheless, specimen preparation, rapidly evolving data analysis algorithms, and incomplete reference sequence databases are idiosyncratic to the field of microbiology and often overlooked.
Validation of EncephalApp, Smartphone-Based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy.

PubMed

Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

2015-10-01

Detection of covert hepatic encephalopathy (CHE) is difficult, but point-of-care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test-retest reliability, and external validity. Patients with cirrhosis (n = 167; 38% with overt HE [OHE]; mean age, 55 years; mean Model for End-Stage Liver Disease score, 12) and controls (n = 114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test-retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intrahepatic portosystemic shunt placement, and before and after correction for hyponatremia, to determine external validity. All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cutoffs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic value of 0.91; the area under the receiver operator characteristic value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test-retest reliability was high (intraclass coefficient, 0.83) among 30 patients retested 1-3 months apart. OffTime+OnTime increased significantly (206 vs 255 seconds, P = .007) among 10 patients retested 33 ± 7 days after transjugular intrahepatic portosystemic shunt placement. 
OffTime+OnTime decreased significantly (242 vs 225 seconds, P = .03) in 7 patients tested before and after correction for hyponatremia (126 ± 3 to 132 ± 4 meq/L, P = .01) 10 ± 5 days apart. A smartphone app called EncephalApp has good face validity, test-retest reliability, and external validity for the diagnosis of CHE. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.
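The area under the receiver operating characteristic curve reported in this abstract can be read as the probability that a randomly chosen patient with the condition scores higher than a randomly chosen patient without it. The sketch below uses that pairwise interpretation; it is not the authors' analysis, and the function name and completion times are hypothetical.

```python
def roc_auc(scores_with_condition, scores_without_condition):
    """Area under the ROC curve via pairwise comparison; ties count one half."""
    if not scores_with_condition or not scores_without_condition:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case_score in scores_with_condition:
        for control_score in scores_without_condition:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(scores_with_condition) * len(scores_without_condition))


# Hypothetical total completion times in seconds (longer suggests impairment).
times_with_condition = [210, 250, 190, 230]
times_without_condition = [150, 180, 200, 170]
auc = roc_auc(times_with_condition, times_without_condition)
print(f"AUC = {auc:.4f}")
```

The double loop is quadratic in the group sizes, which is acceptable for samples of a few hundred patients; a rank-based formula would scale better for larger data sets.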
Initial Development and Validation of the BullyHARM: The Bullying, Harassment, and Aggression Receipt Measure.

PubMed

Hall, William J

2016-11-01

This article describes the development and preliminary validation of the Bullying, Harassment, and Aggression Receipt Measure (BullyHARM). The development of the BullyHARM involved a number of steps and methods, including a literature review, expert review, cognitive testing, readability testing, data collection from a large sample, reliability testing, and confirmatory factor analysis. A sample of 275 middle school students was used to examine the psychometric properties and factor structure of the BullyHARM, which consists of 22 items and 6 subscales: physical bullying, verbal bullying, social/relational bullying, cyber-bullying, property bullying, and sexual bullying. First-order and second-order factor models were evaluated. Results demonstrate that the first-order factor model had superior fit. Results of reliability testing indicate that the BullyHARM scale and subscales have very good internal consistency reliability. Findings indicate that the BullyHARM has good properties regarding content validation and respondent-related validation and is a promising instrument for measuring bullying victimization in school.

Initial Development and Validation of the BullyHARM: The Bullying, Harassment, and Aggression Receipt Measure

PubMed Central

Hall, William J.

2017-01-01

This article describes the development and preliminary validation of the Bullying, Harassment, and Aggression Receipt Measure (BullyHARM). The development of the BullyHARM involved a number of steps and methods, including a literature review, expert review, cognitive testing, readability testing, data collection from a large sample, reliability testing, and confirmatory factor analysis. A sample of 275 middle school students was used to examine the psychometric properties and factor structure of the BullyHARM, which consists of 22 items and 6 subscales: physical bullying, verbal bullying, social/relational bullying, cyber-bullying, property bullying, and sexual bullying. First-order and second-order factor models were evaluated. Results demonstrate that the first-order factor model had superior fit. Results of reliability testing indicate that the BullyHARM scale and subscales have very good internal consistency reliability. Findings indicate that the BullyHARM has good properties regarding content validation and respondent-related validation and is a promising instrument for measuring bullying victimization in school. PMID:28194041
The Hyper-X Flight Systems Validation Program

NASA Technical Reports Server (NTRS)

Redifer, Matthew; Lin, Yohan; Bessent, Courtney Amos; Barklow, Carole

2007-01-01

For the Hyper-X/X-43A program, the development of a comprehensive validation test plan played an integral part in the success of the mission. The goal was to demonstrate hypersonic propulsion technologies by flight testing an airframe-integrated scramjet engine. Preparation for flight involved both verification and validation testing. By definition, verification is the process of assuring that the product meets design requirements; whereas validation is the process of assuring that the design meets mission requirements for the intended environment. This report presents an overview of the program with emphasis on the validation efforts. It includes topics such as hardware-in-the-loop, failure modes and effects, aircraft-in-the-loop, plugs-out, power characterization, antenna pattern, integration, combined systems, captive carry, and flight testing. Where applicable, test results are also discussed. The report provides a brief description of the flight systems onboard the X-43A research vehicle and an introduction to the ground support equipment required to execute the validation plan. The intent is to provide validation concepts that are applicable to current, follow-on, and next generation vehicles that share the hybrid spacecraft and aircraft characteristics of the Hyper-X vehicle.
Fatigue Failure of Space Shuttle Main Engine Turbine Blades

NASA Technical Reports Server (NTRS)

Swanson, Gregrory R.; Arakere, Nagaraj K.

2000-01-01

Experimental validation of finite element modeling of single crystal turbine blades is presented. Experimental results from uniaxial high cycle fatigue (HCF) test specimens and full scale Space Shuttle Main Engine test firings with the High Pressure Fuel Turbopump Alternate Turbopump (HPFTP/AT) provide the data used for the validation. The conclusions show the significant contribution of the crystal orientation within the blade on the resulting life of the component, that the analysis can predict this variation, and that experimental testing demonstrates it.
Vacuum decay container closure integrity leak test method development and validation for a lyophilized product-package system.

PubMed

Patel, Jayshree; Mulhall, Brian; Wolf, Heinz; Klohr, Steven; Guazzo, Dana Morton

2011-01-01

A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated for container-closure integrity verification of a lyophilized product in a parenteral vial package system. This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity check, and for testing product stored on stability in lieu of sterility tests. Method development and optimization challenge studies incorporated artificially defective packages representing a range of glass vial wall and sealing surface defects, as well as various elastomeric stopper defects. Method validation required 3 days of random-order replicate testing of a test sample population of negative-control, no-defect packages and positive-control, with-defect packages. Positive-control packages were prepared using vials each with a single hole laser-drilled through the glass vial wall. Hole creation and hole size certification was performed by Lenox Laser. Validation study results successfully demonstrated the vacuum decay leak test method's ability to accurately and reliably detect those packages with laser-drilled holes greater than or equal to approximately 5 μm in nominal diameter. All development and validation studies were performed at Whitehouse Analytical Laboratories in Whitehouse, NJ, under the direction of consultant Dana Guazzo of RxPax, LLC, using a VeriPac 455 Micro Leak Test System by Packaging Technologies & Inspection (Tuckahoe, NY). Bristol Myers Squibb (New Brunswick, NJ) fully subsidized all work. A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated to detect defects in stoppered vial packages containing lyophilized product for injection. 
This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity check, and for testing product stored on stability in lieu of sterility tests. Test method validation study results proved the method capable of detecting holes laser-drilled through the glass vial wall greater than or equal to 5 μm in nominal diameter. Total test time is less than 1 min per package. All method development and validation studies were performed at Whitehouse Analytical Laboratories in Whitehouse, NJ, under the direction of consultant Dana Guazzo of RxPax, LLC, using a VeriPac 455 Micro Leak Test System by Packaging Technologies & Inspection (Tuckahoe, NY). Bristol Myers Squibb (New Brunswick, NJ) fully subsidized all work.
Development of “OQALE” Based Reference Module for School Geometry Subject and Analysis of Mathematical Creative Thinking Skills

NASA Astrophysics Data System (ADS)

Wulandari, N. A. D.; Sukestiyarno, Y. L.

2017-04-01

This research aims to develop an OQALE-based reference module for the school geometry subject that meets the criteria of being valid and practical. The OQALE approach is learning by O = Observation, Q = Question, A = Analyze, L = Logic, E = Express. The geometry topics presented in the module are the triangle, the Pythagorean theorem, and the rectangle. Mathematical creative thinking skills are shown in four aspects: fluency, flexibility, originality and elaboration. The research procedure for developing the reference module uses the investigation and development strategy described by [2], limited to the sixth stage, leading field testing. The focus of this research is to develop a reference module that is valid, practical and able to increase the mathematical creative thinking skills of students. The testing is limited to three teachers, nine students and two mathematics readers, selected by purposive sampling. Data on validity, practicality, and improvement in creative thinking skills were collected through questionnaires, observations, and interviews and analysed with a validity test, a practicality test, a gain test and qualitative description. The results were: (1) the validity of the module = 4.52, which falls in 4.20 ≤ Vm < 5.00, the very valid category; (2) the teacher questionnaire responses = 4.53, which falls in 4.20 ≤ Rg < 5.00, the very good category; (3) the student survey responses = 3.13, which falls in 2.80 ≤ Rpd < 3.40, the good category, with an average percentage of 78%; and (4) the improvement in the mathematical creative thinking skills of the nine students, measured by the gain test, fell in the high and medium categories. The conclusion of this research is that the resulting OQALE-based reference module for the school geometry subject is valid and practical.
A systematic review of the reliability and validity of discrete choice experiments in valuing non-market environmental goods.

PubMed

Rakotonarivo, O Sarobidy; Schaafsma, Marije; Hockley, Neal

2016-12-01

While discrete choice experiments (DCEs) are increasingly used in the field of environmental valuation, they remain controversial because of their hypothetical nature and the contested reliability and validity of their results. We systematically reviewed evidence on the validity and reliability of environmental DCEs from the past thirteen years (Jan 2003-February 2016). 107 articles met our inclusion criteria. These studies provide limited and mixed evidence of the reliability and validity of DCE. Valuation results were susceptible to small changes in survey design in 45% of outcomes reporting reliability measures. DCE results were generally consistent with those of other stated preference techniques (convergent validity), but hypothetical bias was common. Evidence supporting theoretical validity (consistency with assumptions of rational choice theory) was limited. In content validity tests, 2-90% of respondents protested against a feature of the survey, and a considerable proportion found DCEs to be incomprehensible or inconsequential (17-40% and 10-62% respectively). DCE remains useful for non-market valuation, but its results should be used with caution. Given the sparse and inconclusive evidence base, we recommend that tests of reliability and validity are more routinely integrated into DCE studies and suggest how this might be achieved. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
A Framework for Testing Scientific Software: A Case Study of Testing Amsterdam Discrete Dipole Approximation Software

NASA Astrophysics Data System (ADS)

Shao, Hongbing

Software testing with scientific software systems often suffers from the test oracle problem, i.e., a lack of test oracles. Amsterdam discrete dipole approximation code (ADDA) is a scientific software system that can be used to simulate light scattering of scatterers of various types. Testing of ADDA suffers from the "test oracle problem". In this thesis work, I established a testing framework to test scientific software systems and evaluated this framework using ADDA as a case study. To test ADDA, I first used CMMIE code as the pseudo oracle to test ADDA in simulating light scattering of a homogeneous sphere scatterer. Comparable results were obtained between ADDA and CMMIE code. This validated ADDA for use with homogeneous sphere scatterers. Then I used an experimental result obtained for light scattering of a homogeneous sphere to validate use of ADDA with sphere scatterers. ADDA produced a light scattering simulation comparable to the experimentally measured result. This further validated the use of ADDA for simulating light scattering of sphere scatterers. Then I used metamorphic testing to generate test cases covering scatterers of various geometries, orientations, homogeneity or non-homogeneity. ADDA was tested under each of these test cases and all tests passed. The use of statistical analysis together with metamorphic testing is discussed as a future direction. In short, using ADDA as a case study, I established a testing framework, including use of pseudo oracles, experimental results and the metamorphic testing techniques to test scientific software systems that suffer from test oracle problems. Each of these techniques is necessary and contributes to the testing of the software under test.
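Metamorphic testing, as named in the abstract above, checks relations between outputs of related inputs instead of comparing against a known correct answer. The following is a minimal sketch of the idea only: `toy_response` is a hypothetical stand-in for a simulation without an oracle and has nothing to do with ADDA's actual interface, and the two relations (invariance under rotation and under reordering of the input points) are chosen for illustration.

```python
import math
import random


def toy_response(points):
    """Hypothetical stand-in for a simulation: sum of pairwise inverse distances."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += 1.0 / math.dist(points[i], points[j])
    return total


def rotate_z(points, angle):
    """Rotate every point about the z axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def make_points(n, seed=42):
    """Generate n pseudo-random 3D points in the unit cube (reproducible)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random(), rng.random()) for _ in range(n)]


def check_metamorphic_relations(points, rel_tol=1e-9):
    """Return True if the output is unchanged by rotation and by reordering."""
    base = toy_response(points)
    rotated = toy_response(rotate_z(points, 0.7))
    reordered = points[:]
    random.Random(0).shuffle(reordered)
    shuffled = toy_response(reordered)
    return (math.isclose(base, rotated, rel_tol=rel_tol)
            and math.isclose(base, shuffled, rel_tol=rel_tol))


points = make_points(20)
print("metamorphic relations hold:", check_metamorphic_relations(points))
```

Neither check needs the true value of `toy_response`; a violation of either relation would still reveal a defect.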
Design and Development Computer-Based E-Learning Teaching Material for Improving Mathematical Understanding Ability and Spatial Sense of Junior High School Students

NASA Astrophysics Data System (ADS)

Nurjanah; Dahlan, J. A.; Wibisono, Y.

2017-02-01

This paper aims to design and develop computer-based e-learning teaching material for improving the mathematical understanding ability and spatial sense of junior high school students. The particular aims are (1) to obtain a teaching material design, an evaluation model, and an instrument to measure the mathematical understanding ability and spatial sense of junior high school students; (2) to trial the computer-based e-learning teaching material model, assessment, and instrument for developing the mathematical understanding ability and spatial sense of junior high school students; (3) to complete the computer-based e-learning teaching material models and assessment for developing the mathematical understanding ability and spatial sense of junior high school students; (4) to produce computer-based e-learning teaching materials as the research product. The product is an interactive learning disc. The research method used in this study is developmental research, conducted through a thought experiment and an instruction experiment. The results showed that the teaching materials could be used very well. This is based on the validation of the computer-based e-learning teaching materials by 5 multimedia experts. The 5 validators gave the same judgement on the face and content validity of each test item for mathematical understanding ability and spatial sense. The reliability coefficients of the mathematical understanding ability and spatial sense tests are 0.929 and 0.939, which is very high, while the validity of both tests meets the high and very high criteria.
FUNCTIONAL PERFORMANCE TESTING OF THE HIP IN ATHLETES: A SYSTEMATIC REVIEW FOR RELIABILITY AND VALIDITY

PubMed Central

Martin, RobRoy L.

2012-01-01

Purpose/Background: The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. Methods: A search of the PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. Results: The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/Jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Conclusions: Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established in patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. Level of Evidence: 2b (Systematic Review of Literature) PMID:22893860
Test Score Stability and Construct Validity of the Adult Manifest Anxiety Scale-College Version Scores among College Students: A Brief Report

ERIC Educational Resources Information Center

Lowe, Patricia A.; Papanastasiou, Elena C.; DeRuyck, Kimberly A.; Reynolds, Cecil R.

2005-01-01

In this study, the authors investigated the temporal stability and construct validity of the Adult Manifest Anxiety Scale-College Version (AMAS-C; C. R. Reynolds, B. O. Richmond, & P. A. Lowe, 2003b) scores. Results indicated that the AMAS-C scores had adequate to excellent test score stability, and evidence supported the construct validity of the…
Temperature and heat flux datasets of a complex object in a fire plume for the validation of fire and thermal response codes.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jernigan, Dann A.; Blanchat, Thomas K.

It is necessary to improve understanding and develop temporally- and spatially-resolved integral scale validation data of the heat flux incident to a complex object in addition to measuring the thermal response of said object located within the fire plume for the validation of the SIERRA/FUEGO/SYRINX fire and SIERRA/CALORE codes. To meet this objective, a complex calorimeter with sufficient instrumentation to allow validation of the coupling between FUEGO/SYRINX/CALORE has been designed, fabricated, and tested in the Fire Laboratory for Accreditation of Models and Experiments (FLAME) facility. Validation experiments are specifically designed for direct comparison with the computational predictions. Making meaningful comparison between the computational and experimental results requires careful characterization and control of the experimental features or parameters used as inputs into the computational model. Validation experiments must be designed to capture the essential physical phenomena, including all relevant initial and boundary conditions. This report presents the data validation steps and processes, the results of the penlight radiant heat experiments (for the purpose of validating the CALORE heat transfer modeling of the complex calorimeter), and the results of the fire tests in FLAME.
Constructing Aligned Assessments Using Automated Test Construction

ERIC Educational Resources Information Center

Porter, Andrew; Polikoff, Morgan S.; Barghaus, Katherine M.; Yang, Rui

2013-01-01

We describe an innovative automated test construction algorithm for building aligned achievement tests. By incorporating the algorithm into the test construction process, along with other test construction procedures for building reliable and unbiased assessments, the result is much more valid tests than result from current test construction…
Determination of the criterion-related validity of hip joint angle test for estimating hamstring flexibility using a contemporary statistical approach.

PubMed

Sainz de Baranda, Pilar; Rodríguez-Iniesta, María; Ayala, Francisco; Santonja, Fernando; Cejudo, Antonio

2014-07-01

To examine the criterion-related validity of the horizontal hip joint angle (H-HJA) test and vertical hip joint angle (V-HJA) test for estimating hamstring flexibility measured through the passive straight-leg raise (PSLR) test using contemporary statistical measures. Validity study. Controlled laboratory environment. One hundred thirty-eight professional trampoline gymnasts (61 women and 77 men). Hamstring flexibility. Each participant performed 2 trials of H-HJA, V-HJA, and PSLR tests in a randomized order. The criterion-related validity of H-HJA and V-HJA tests was measured through the estimation equation, typical error of the estimate (TEEST), validity correlation (β), and their respective confidence limits. The findings from this study suggest that although H-HJA and V-HJA tests showed moderate to high validity scores for estimating hamstring flexibility (standardized TEEST = 0.63; β = 0.80), the TEEST statistic reported for both tests was not narrow enough for clinical purposes (H-HJA = 10.3 degrees; V-HJA = 9.5 degrees). Subsequently, the predicted likely thresholds for the true values that were generated were too wide (H-HJA = predicted value ± 13.2 degrees; V-HJA = predicted value ± 12.2 degrees). The results suggest that although the HJA test showed moderate to high validity scores for estimating hamstring flexibility, the prediction intervals between the HJA and PSLR tests are not strong enough to suggest that clinicians and sport medicine practitioners should use the HJA and PSLR tests interchangeably as gold standard measurement tools to evaluate and detect short hamstring muscle flexibility.
Improving the quality of discrete-choice experiments in health: how can we assess validity and reliability?

PubMed

Janssen, Ellen M; Marshall, Deborah A; Hauber, A Brett; Bridges, John F P

2017-12-01

The recent endorsement of discrete-choice experiments (DCEs) and other stated-preference methods by regulatory and health technology assessment (HTA) agencies has placed a greater focus on demonstrating the validity and reliability of preference results. Areas covered: We present a practical overview of tests of validity and reliability that have been applied in the health DCE literature and explore other study qualities of DCEs. From the published literature, we identify a variety of methods to assess the validity and reliability of DCEs. We conceptualize these methods to create a conceptual model with four domains: measurement validity, measurement reliability, choice validity, and choice reliability. Each domain consists of three categories that can be assessed using one to four procedures (for a total of 24 tests). We present how these tests have been applied in the literature and direct readers to applications of these tests in the health DCE literature. Based on a stakeholder engagement exercise, we consider the importance of study characteristics beyond traditional concepts of validity and reliability. Expert commentary: We discuss study design considerations to assess the validity and reliability of a DCE, consider limitations to the current application of tests, and discuss future work to consider the quality of DCEs in healthcare.
Comment on Hall et al. (2017), "How to Choose Between Measures of Tinnitus Loudness for Clinical Research? A Report on the Reliability and Validity of an Investigator-Administered Test and a Patient-Reported Measure Using Baseline Data Collected in a Phase IIa Drug Trial".

PubMed

Sabour, Siamak

2018-03-08

The purpose of this letter, in response to Hall, Mehta, and Fackrell (2017), is to provide important knowledge about methodology and statistical issues in assessing the reliability and validity of an audiologist-administered tinnitus loudness matching test and a patient-reported tinnitus loudness rating. The author uses reference textbooks and published articles regarding scientific assessment of the validity and reliability of a clinical test to discuss the statistical test and the methodological approach in assessing validity and reliability in clinical research. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess reliability and validity. The qualitative variables of sensitivity, specificity, positive predictive value, negative predictive value, false positive and false negative rates, likelihood ratio positive and likelihood ratio negative, as well as odds ratio (i.e., ratio of true to false results), are the most appropriate estimates to evaluate validity of a test compared to a gold standard. In the case of quantitative variables, depending on distribution of the variable, Pearson r or Spearman rho can be applied. Diagnostic accuracy (validity) and diagnostic precision (reliability or agreement) are two completely different methodological issues. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess validity.
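The validity estimates listed in the abstract above can all be derived from a 2x2 table of test results against a gold standard. The following is a minimal sketch using the standard textbook definitions; the function name and the example counts are hypothetical, and the odds ratio is computed here as the diagnostic odds ratio (TP x TN) / (FP x FN), which is one common reading of the abstract's wording.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute validity estimates from a 2x2 table (test vs. gold standard).

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives. Counts of zero in a
    denominator raise ZeroDivisionError; no correction is applied.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                  # positive predictive value
        "npv": tn / (tn + fn),                  # negative predictive value
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "lr_positive": sensitivity / (fp / (fp + tn)),
        "lr_negative": (fn / (fn + tp)) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


# Hypothetical counts for illustration only.
for name, value in diagnostic_metrics(tp=90, fp=20, fn=10, tn=80).items():
    print(f"{name}: {value:.3f}")
```

These estimates describe validity against a gold standard; agreement or reliability statistics are a separate question, as the letter stresses.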
Impact on Participation and Autonomy: Test of Validity and Reliability for Older Persons.

PubMed

Hammar, Isabelle Ottenvall; Ekelund, Christina; Wilhelmson, Katarina; Eklund, Kajsa

2014-11-06

In research and healthcare it is important to measure older persons' self-determination in order to improve their possibilities to decide for themselves in daily life. The questionnaire Impact on Participation and Autonomy (IPA) assesses self-determination, but is not constructed for older persons. The aim of this study was to examine the validity and reliability of the IPA-S questionnaire for persons aged 70 years and older. The study was performed in two steps: first a validity test of the Swedish version of the questionnaire, IPA-S, followed by a test-retest reliability study of an adjusted version. The validity was tested with focus groups and individual interviews on persons aged 77-88 years, and the reliability on persons aged 70-99 years. The validity test result showed that IPA-S is valid for older persons, but it was too extensive and the phrasing of the items needed adjustments. The test-retest reliability study on the adjusted questionnaire, IPA-Older persons (IPA-O), showed that 15 of 22 items had high agreement. IPA-O can be used to measure older persons' self-determination in their care and rehabilitation.
Initial validation of a web-based self-administered neuropsychological test battery for older adults and seniors

PubMed Central

Hansen, Tor Ivar; Haferstrom, Elise Christina D.; Brunner, Jan F.; Lehn, Hanne; Håberg, Asta Kristine

2015-01-01

Introduction: Computerized neuropsychological tests are effective in assessing different cognitive domains, but are often limited by the need of proprietary hardware and technical staff. Web-based tests can be more accessible and flexible. We aimed to investigate validity, effects of computer familiarity, education, and age, and the feasibility of a new web-based self-administered neuropsychological test battery (Memoro) in older adults and seniors. Method: A total of 62 (37 female) participants (mean age 60.7 years) completed the Memoro web-based neuropsychological test battery and a traditional battery composed of similar tests intended to measure the same cognitive constructs. Participants were assessed on computer familiarity and how they experienced the two batteries. To properly test the factor structure of Memoro, an additional factor analysis in 218 individuals from the HUNT population was performed. Results: Comparing Memoro to traditional tests, we observed good concurrent validity (r = .49–.63). The performance on the traditional and Memoro test battery was consistent, but differences in raw scores were observed with higher scores on verbal memory and lower in spatial memory in Memoro. Factor analysis indicated two factors: verbal and spatial memory. There were no correlations between test performance and computer familiarity after adjustment for age or age and education. Subjects reported that they preferred web-based testing as it allowed them to set their own pace, and they did not feel scrutinized by an administrator. Conclusions: Memoro showed good concurrent validity compared to neuropsychological tests measuring similar cognitive constructs. Based on the current results, Memoro appears to be a tool that can be used to assess cognitive function in older and senior adults. Further work is necessary to ascertain its validity and reliability. PMID:26009791
10 CFR 26.75 - Sanctions.

Code of Federal Regulations, 2012 CFR

2012-01-01

... initial test results for marijuana or cocaine metabolites from a specimen that is reported to be valid on... respect to positive initial drug test results from a licensee testing facility for marijuana and cocaine...
10 CFR 26.75 - Sanctions.

Code of Federal Regulations, 2014 CFR

2014-01-01

... initial test results for marijuana or cocaine metabolites from a specimen that is reported to be valid on... respect to positive initial drug test results from a licensee testing facility for marijuana and cocaine...
10 CFR 26.75 - Sanctions.

Code of Federal Regulations, 2013 CFR

2013-01-01

... initial test results for marijuana or cocaine metabolites from a specimen that is reported to be valid on... respect to positive initial drug test results from a licensee testing facility for marijuana and cocaine...

The Chimera of Validity

ERIC Educational Resources Information Center

Baker, Eva L.

2013-01-01

Background/Context: Education policy over the past 40 years has focused on the importance of accountability in school improvement. Although much of the scholarly discourse around testing and assessment is technical and statistical, understanding of validity by a non-specialist audience is essential as long as test results drive our educational…
A method for validation of finite element forming simulation on basis of a pointwise comparison of distance and curvature

NASA Astrophysics Data System (ADS)

Dörr, Dominik; Joppich, Tobias; Schirmaier, Fabian; Mosthaf, Tobias; Kärger, Luise; Henning, Frank

2016-10-01

Thermoforming of continuously fiber reinforced thermoplastics (CFRTP) is ideally suited to thin walled and complex shaped products. By means of forming simulation, an initial validation of the producibility of a specific geometry, an optimization of the forming process and the prediction of fiber-reorientation due to forming is possible. Nevertheless, applied methods need to be validated. Therefore, a method is presented that enables the calculation of error measures for the mismatch between simulation results and experimental tests, based on measurements with a conventional coordinate measuring device. As a quantitative measure describing the curvature is provided, the presented method is also suitable for numerical or experimental sensitivity studies on wrinkling behavior. The applied methods for forming simulation, implemented in Abaqus explicit, are presented and applied to a generic geometry. The same geometry is tested experimentally and simulation and test results are compared by the proposed validation method.
Does the Defining Issues Test measure ethical judgment ability or political position?

PubMed

Bailey, Charles D

2011-01-01

This article addresses the construct validity of the Defining Issues Test of ethical judgment (DIT/DIT-2). Alleging a political bias in the test, Emler and colleagues (1983, 1998, 1999, 2007) show that conservatives score higher when asked to fake as liberals, implying that they understand the reasoning associated with "higher" moral development but avoid items they see as liberally biased. DIT proponents challenge the internal validity of faking studies, advocating an explained-variance validation. This study takes a new approach: Adult participants complete the DIT-2, then evaluate the raw responses of others to discern political orientation and ethical development. Results show that individuals scoring higher on the DIT-2 rank others' ethical judgment in a way consistent with DIT-2-based rankings. Accuracy at assessing political orientation, however, is low. Results support the DIT-2's validity as a measure of ethical development, not an expression of political position.
Consequential Validity and the Transformation of Tests from Measurement Tools to Policy Tools

ERIC Educational Resources Information Center

Welner, Kevin G.

2013-01-01

Background/Context: Recent U.S. policy has brought a shift in assessment use, from measurement tools to policy levers. In particular, testing has become a core part of teacher evaluation policies in many states, with test results becoming akin to a job evaluation. Purpose: To explore the notion of consequential validity in assessment use and…
Vision Test Validation Study for the Health Examination Survey Among Youths 12-17 years.

ERIC Educational Resources Information Center

Roberts, Jean

A validation study of the vision test battery used in the Health Examination Survey of 1966-1970 was conducted among 210 youths 12-17 years old who had been part of the larger survey. The study was designed to discover the degree of correspondence between survey test results and clinical examination by an ophthalmologist in determining the…
Development of a Three-Tier Test as a Valid Diagnostic Tool for Identification of Misconceptions Related to Carbohydrates

ERIC Educational Resources Information Center

Milenkovic, Dusica D.; Hrin, Tamara N.; Segedinac, Mirjana D.; Horvat, Sasa

2016-01-01

This study describes the development and application of a three-tier test as a valid and reliable tool in diagnosing students' misconceptions regarding some basic concepts about carbohydrates. The test was administered to students of the Pharmacy Department at the University of Bijeljina (Serb Republic). The results denoted construct and content…
Impact of syncope on quality of life: validation of a measure in patients undergoing tilt testing.

PubMed

Nave-Leal, Elisabete; Oliveira, Mário; Pais-Ribeiro, José; Santos, Sofia; Oliveira, Eunice; Alves, Teresa; Cruz Ferreira, Rui

2015-03-01

Recurrent syncope has a significant impact on quality of life. The development of measurement scales to assess this impact that are easy to use in clinical settings is crucial. The objective of the present study is a preliminary validation of the Impact of Syncope on Quality of Life questionnaire for the Portuguese population. The instrument underwent a process of translation, validation, analysis of cultural appropriateness and cognitive debriefing. A population of 39 patients with a history of recurrent syncope (>1 year) who underwent tilt testing, aged 52.1 ± 16.4 years (21-83), 43.5% male, most in active employment (n=18) or retired (n=13), constituted a convenience sample. The resulting Portuguese version is similar to the original, with 12 items in a single aggregate score, and underwent statistical validation, with assessment of reliability, validity and stability over time. With regard to reliability, the internal consistency of the scale is 0.9. Assessment of convergent and discriminant validity showed statistically significant results (p<0.01). Regarding stability over time, a test-retest of this instrument at six months after tilt testing with 22 patients of the sample who had not undergone any clinical intervention found no statistically significant changes in quality of life. The results indicate that this instrument is of value for assessing quality of life in patients with recurrent syncope in Portugal. Copyright © 2014 Sociedade Portuguesa de Cardiologia. Published by Elsevier España. All rights reserved.
The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda.

PubMed

Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert

2008-12-02

The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two-sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda.
Comparison of the Incremental Validity of the Old and New MCAT.

ERIC Educational Resources Information Center

Wolf, Fredric M.; And Others

The predictive and incremental validity of both the Old and New Medical College Admission Test (MCAT) was examined and compared with a sample of over 300 medical students. Results of zero order and incremental validity coefficients, as well as prediction models resulting from all possible subsets regression analyses using Mallow's Cp criterion,…
Translation, cultural adaptation and validation of the Diabetes Attitudes Scale - third version into Brazilian Portuguese

PubMed Central

Vieira, Gisele de Lacerda Chaves; Pagano, Adriana Silvino; Reis, Ilka Afonso; Rodrigues, Júlia Santos Nunes; Torres, Heloísa de Carvalho

2018-01-01

ABSTRACT Objective: to perform the translation, adaptation and validation of the Diabetes Attitudes Scale - third version instrument into Brazilian Portuguese. Methods: methodological study carried out in six stages: initial translation, synthesis of the initial translation, back-translation, evaluation of the translated version by the Committee of Judges (27 Linguists and 29 health professionals), pre-test and validation. The pre-test and validation (test-retest) steps included 22 and 120 health professionals, respectively. The Content Validity Index, the analyses of internal consistency and reproducibility were performed using the R statistical program. Results: in the content validation, the instrument presented good acceptance among the Judges with a mean Content Validity Index of 0.94. The scale presented acceptable internal consistency (Cronbach’s alpha = 0.60), while the correlation of the total score at the test and retest moments was considered high (Polychoric Correlation Coefficient = 0.86). The Intra-class Correlation Coefficient, for the total score, presented a value of 0.65. Conclusion: the Brazilian version of the instrument (Escala de Atitudes dos Profissionais em relação ao Diabetes Mellitus) was considered valid and reliable for application by health professionals in Brazil. PMID:29319739
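The internal consistency figure reported above is Cronbach's alpha, which for k items is k / (k - 1) x (1 - sum of item variances / variance of the total score). The study ran its analyses in R; the following is only an illustrative Python sketch of the standard formula, and the function name and the response matrix are hypothetical rather than data from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point responses from five respondents to four items.
responses = [
    [4, 5, 4, 3],
    [2, 3, 3, 2],
    [5, 5, 4, 4],
    [3, 3, 2, 3],
    [4, 4, 5, 4],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

When every item ranks respondents identically, the item variances account for exactly 1/k of the total variance and alpha reaches 1; uncorrelated items push it toward 0.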
Criterion-Related Validity of the Distance- and Time-Based Walk/Run Field Tests for Estimating Cardiorespiratory Fitness: A Systematic Review and Meta-Analysis

PubMed Central

Mayorga-Vega, Daniel; Bocanegra-Parrilla, Raúl; Ornelas, Martha; Viciana, Jesús

2016-01-01

Objectives: The main purpose of the present meta-analysis was to examine the criterion-related validity of the distance- and time-based walk/run tests for estimating cardiorespiratory fitness among apparently healthy children and adults. Materials and Methods: Relevant studies were searched from seven electronic bibliographic databases up to August 2015 and through other sources. The Hunter-Schmidt’s psychometric meta-analysis approach was conducted to estimate the population criterion-related validity of the following walk/run tests: 5,000 m, 3 miles, 2 miles, 3,000 m, 1.5 miles, 1 mile, 1,000 m, ½ mile, 600 m, 600 yd, ¼ mile, 15 min, 12 min, 9 min, and 6 min. Results: From the 123 included studies, a total of 200 correlation values were analyzed. The overall results showed that the criterion-related validity of the walk/run tests for estimating maximum oxygen uptake ranged from low to moderate (rp = 0.42–0.79), with the 1.5 mile (rp = 0.79, 0.73–0.85) and 12 min walk/run tests (rp = 0.78, 0.72–0.83) having the highest criterion-related validity for distance- and time-based field tests, respectively. The present meta-analysis also showed that sex, age and maximum oxygen uptake level do not seem to affect the criterion-related validity of the walk/run tests. Conclusions: When the evaluation of an individual’s maximum oxygen uptake attained during a laboratory test is not feasible, the 1.5 mile and 12 min walk/run tests represent useful alternatives for estimating cardiorespiratory fitness. As in the assessment with any physical fitness field test, evaluators must be aware that the performance score of the walk/run field tests is simply an estimation and not a direct measure of cardiorespiratory fitness. PMID:26987118
Evaluating instruments for quality: testing convergent validity of the consumer emergency care satisfaction scale.

PubMed

Davis, Barbara A; Kiesel, Cynthia K; McFarland, Julie; Collard, Adressa; Coston, Kyle; Keeton, Ada

2005-01-01

Having reliable and valid instruments is a necessity for nurses and others measuring concepts such as patient satisfaction. The purpose of this article is to describe the use of convergence to test the construct validity of the Davis Consumer Emergency Care Satisfaction Scale (CECSS). Results indicate convergence of the CECSS with the Risser Patient Satisfaction Scale and 2 single-item visual analogue scales, therefore supporting construct validity. Persons measuring patient satisfaction with nurse behaviors in the emergency department can confidently use the CECSS.
Personality traits in companion dogs-Results from the VIDOPET.

PubMed

Turcsán, Borbála; Wallis, Lisa; Virányi, Zsófia; Range, Friederike; Müller, Corsin A; Huber, Ludwig; Riemer, Stefanie

2018-01-01

Individual behavioural differences in pet dogs are of great interest from a basic and applied research perspective. Most existing dog personality tests have specific (practical) goals in mind and so focused only on a limited aspect of dogs' personality, such as identifying problematic (aggressive or fearful) behaviours, assessing suitability as working dogs, or improving the results of adoption. Here we aimed to create a comprehensive test of personality in pet dogs that goes beyond traditional practical evaluations by exposing pet dogs to a range of situations they might encounter in everyday life. The Vienna Dog Personality Test (VIDOPET) consists of 15 subtests and was performed on 217 pet dogs. A two-step data reduction procedure (principal component analysis on each subtest followed by an exploratory factor analysis on the subtest components) yielded five factors: Sociability-obedience, Activity-independence, Novelty seeking, Problem orientation, and Frustration tolerance. A comprehensive evaluation of reliability and validity measures demonstrated excellent inter- and intra-observer reliability and adequate internal consistency of all factors. Moreover the test showed good temporal consistency when re-testing a subsample of dogs after an average of 3.8 years-a considerably longer test-retest interval than assessed for any other dog personality test, to our knowledge. The construct validity of the test was investigated by analysing the correlations between the results of video coding and video rating methods and the owners' assessment via a dog personality questionnaire. The results demonstrated good convergent as well as discriminant validity. To conclude, the VIDOPET is not only a highly reliable and valid tool for measuring dog personality, but also the first test to show consistent behavioural traits related to problem solving ability and frustration tolerance in pet dogs.
Personality traits in companion dogs—Results from the VIDOPET

PubMed Central

Wallis, Lisa; Virányi, Zsófia; Range, Friederike; Müller, Corsin A.; Huber, Ludwig; Riemer, Stefanie

2018-01-01

Individual behavioural differences in pet dogs are of great interest from a basic and applied research perspective. Most existing dog personality tests have specific (practical) goals in mind and so focused only on a limited aspect of dogs’ personality, such as identifying problematic (aggressive or fearful) behaviours, assessing suitability as working dogs, or improving the results of adoption. Here we aimed to create a comprehensive test of personality in pet dogs that goes beyond traditional practical evaluations by exposing pet dogs to a range of situations they might encounter in everyday life. The Vienna Dog Personality Test (VIDOPET) consists of 15 subtests and was performed on 217 pet dogs. A two-step data reduction procedure (principal component analysis on each subtest followed by an exploratory factor analysis on the subtest components) yielded five factors: Sociability-obedience, Activity-independence, Novelty seeking, Problem orientation, and Frustration tolerance. A comprehensive evaluation of reliability and validity measures demonstrated excellent inter- and intra-observer reliability and adequate internal consistency of all factors. Moreover the test showed good temporal consistency when re-testing a subsample of dogs after an average of 3.8 years—a considerably longer test-retest interval than assessed for any other dog personality test, to our knowledge. The construct validity of the test was investigated by analysing the correlations between the results of video coding and video rating methods and the owners’ assessment via a dog personality questionnaire. The results demonstrated good convergent as well as discriminant validity. To conclude, the VIDOPET is not only a highly reliable and valid tool for measuring dog personality, but also the first test to show consistent behavioural traits related to problem solving ability and frustration tolerance in pet dogs. PMID:29634747
Validation of the German version of the Nurse-Work Instability Scale: baseline survey findings of a prospective study of a cohort of geriatric care workers

PubMed Central

2013-01-01

Background A prospective study of a cohort of nursing staff from nursing homes was undertaken to validate the Nurse-Work Instability Scale (Nurse-WIS). Baseline investigation data were used to test reliability, construct validity and criterion validity. Method A survey of nursing staff from nursing homes was conducted using a questionnaire containing the Nurse-WIS along with other survey instruments (including SF-12, WAI, SPE). The self-reported number of days’ sick leave taken and whether a pension for reduced work capacity was drawn were recorded. The reliability of the scale was checked by item difficulty (P), item discrimination (rjt) and by internal consistency according to Cronbach’s alpha coefficient. The hypotheses for checking construct validity were tested on the basis of correlations. Pearson’s chi-square was used to test concurrent criterion validity; discriminant validity was tested by means of binary logistic regression. Results 396 persons answered the questionnaire (21.3% response rate). More than 80% were female and mostly worked full-time in a rotating shift pattern. Following the test for item discrimination, two items were removed from the Nurse-WIS test. According to Cronbach’s alpha (0.927) the scale provides a high degree of measuring accuracy. All hypotheses and assumptions used to test validity were confirmed: As the Nurse-WIS risk increases, health-related quality of life, work ability and job satisfaction decline. Depressive symptoms and a poor subjective prognosis of earning capacity are also more frequent. Musculoskeletal disorders and impairments of psychological well-being are more frequent. Age also influences the Nurse-WIS result. While 12.0% of those below the age of 35 had an increased risk, the figure for those aged over 55 was 50%. Conclusion This study is the first validation study of the Nurse-WIS to date. The Nurse-WIS shows good reliability, good validity and a good level of measuring accuracy.
It appears to be suitable for recording prevention and rehabilitation needs among health care workers. If, in the follow-up, the Nurse-WIS likewise proves to be a reliable screening instrument with good predictive validity, it could ensure that suitable action is taken at an early stage, thereby helping to counteract early retirement and the anticipated shortage of health care workers. PMID:24330532
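The internal-consistency check mentioned in this abstract can be illustrated with a minimal sketch of Cronbach's coefficient, computed as k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the toy data are hypothetical; the study's actual item responses are not available here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' total scores
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 items
toy_scores = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy_scores)
```

Values close to 1 indicate that the items vary together, which is how a figure such as the 0.927 reported above would usually be read.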
Ride qualities criteria validation/pilot performance study: Flight test results

NASA Technical Reports Server (NTRS)

Nardi, L. U.; Kawana, H. Y.; Greek, D. C.

1979-01-01

Pilot performance during a terrain following flight was studied for ride quality criteria validation. Data from manual and automatic terrain following operations conducted during low level penetrations were analyzed to determine the effect of ride qualities on crew performance. The conditions analyzed included varying levels of turbulence, terrain roughness, and mission duration with a ride smoothing system on and off. Limited validation of the B-1 ride quality criteria and some of the first order interactions between ride qualities and pilot/vehicle performance are highlighted. An earlier B-1 flight simulation program correlated well with the flight test results.
Performance Validation of Version 152.0 ANSER Control Laws for the F-18 HARV

NASA Technical Reports Server (NTRS)

Messina, Michael D.

1996-01-01

The Actuated Nose Strakes for Enhanced Rolling (ANSER) Control Laws were modified as a result of Phase 3 F/A-18 High Alpha Research Vehicle (HARV) flight testing. The control law modifications for the next software release were designated version 152.0. The Ada implementation was tested in the Hardware-In-the-Loop (HIL) simulation and results were compared to those obtained with the NASA Langley batch Fortran implementation of the control laws which are considered the 'truth model.' This report documents the performance validation test results between these implementations for ANSER control law version 152.0.
TESTING BALANCE AND FALL RISK IN PERSONS WITH PARKINSON DISEASE, AN ARGUMENT FOR ECOLOGICALLY VALID TESTING

PubMed Central

Foreman, K. Bo; Addison, Odessa; Kim, Han S.; Dibble, Leland E.

2010-01-01

Introduction Despite clear deficits in postural control, most clinical examination tools lack accuracy in identifying persons with Parkinson disease (PD) who have fallen or are at risk for falls. We assert that this is in part due to the lack of ecological validity of the testing. Methods To test this assertion, we examined the responsiveness and predictive validity of the Functional Gait Assessment (FGA), the Pull test, and the Timed up and Go (TUG) during clinically defined ON and OFF medication states. To address responsiveness, ON/OFF medication performance was compared. To address predictive validity, areas under the curve (AUC) of receiver operating characteristic (ROC) curves were compared. Comparisons were made using separate non-parametric tests. Results Thirty-six persons (24 male, 12 female) with PD (22 fallers, 14 non-fallers) participated. Only the FGA was able to detect differences between fallers and non-fallers for both ON/OFF medication testing. The predictive validity of the FGA and the TUG for fall identification was higher during OFF medication compared to ON medication testing. The predictive validity of the FGA was higher than the TUG and the Pull test during ON and OFF medication testing. Discussion In order to most accurately identify fallers, clinicians should test persons with PD in ecologically relevant conditions and tasks. In this study, interpretation of the OFF medication performance and use of the FGA provided more accurate prediction of those who would fall. PMID:21215674
Development and validation of challenge materials for double-blind, placebo-controlled food challenges in children.

PubMed

Vlieg-Boerstra, Berber J; Bijleveld, Charles M A; van der Heide, Sicco; Beusekamp, Berta J; Wolt-Plompen, Saskia A A; Kukler, Jeanet; Brinkman, Joep; Duiverman, Eric J; Dubois, Anthony E J

2004-02-01

The use of double-blind, placebo-controlled food challenges (DBPCFCs) is considered the gold standard for the diagnosis of food allergy. Despite this, materials and methods used in DBPCFCs have not been standardized. The purpose of this study was to develop and validate recipes for use in DBPCFCs in children by using allergenic foods, preferably in their usual edible form. Recipes containing milk, soy, cooked egg, raw whole egg, peanut, hazelnut, and wheat were developed. For each food, placebo and active test food recipes were developed that met the requirements of acceptable taste, allowance of a challenge dose high enough to elicit reactions in an acceptable volume, optimal matrix ingredients, and good matching of sensory properties of placebo and active test food recipes. Validation was conducted on the basis of sensory tests for difference by using the triangle test and the paired comparison test. Recipes were first tested by volunteers from the hospital staff and subsequently by a professional panel of food tasters in a food laboratory designed for sensory testing. Recipes were considered to be validated if no statistically significant differences were found. Twenty-seven recipes were developed and found to be valid by the volunteer panel. Of these 27 recipes, 17 could be validated by the professional panel. Sensory testing with appropriate statistical analysis allows for objective validation of challenge materials. We recommend the use of professional tasters in the setting of a food laboratory for best results.
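The abstract does not say which statistic was applied to the triangle tests. A common choice, assumed here, is a one-sided exact binomial test against the chance rate of 1/3, since a panelist who cannot tell the samples apart picks the odd one out by guessing. The function name and the example counts below are hypothetical.

```python
from math import comb


def triangle_test_p_value(n_panelists, n_correct):
    """One-sided exact binomial p-value for a triangle test (chance rate 1/3)."""
    p_guess = 1 / 3
    # Probability of at least n_correct correct identifications under pure guessing
    return sum(
        comb(n_panelists, k) * p_guess ** k * (1 - p_guess) ** (n_panelists - k)
        for k in range(n_correct, n_panelists + 1)
    )


# Hypothetical panel: 30 tasters, 12 of whom identified the odd sample
p_value = triangle_test_p_value(30, 12)
```

Under this reading, a recipe pair would count as validated when the p-value stays above the chosen significance level, that is, when the panel does no better than guessing.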
A verification library for multibody simulation software

NASA Technical Reports Server (NTRS)

Kim, Sung-Soo; Haug, Edward J.; Frisch, Harold P.

1989-01-01

A multibody dynamics verification library that maintains and manages test and validation data is proposed, based on RRC Robot arm and CASE backhoe validation and a comparative study of DADS, DISCOS, and CONTOPS, which are existing public domain and commercial multibody dynamic simulation programs. Using simple representative problems, simulation results from each program are cross-checked, and the validation results are presented. Functionalities of the verification library are defined in order to automate the validation procedure.

The Predictive Validity of Projective Measures.

ERIC Educational Resources Information Center

Suinn, Richard M.; Oskamp, Stuart

Written for use by clinical practitioners as well as psychological researchers, this book surveys recent literature (1950-1965) on projective test validity by reviewing and critically evaluating studies which shed light on what may reliably be predicted from projective test results. Two major instruments are covered: the Rorschach and the Thematic…
ExEP yield modeling tool and validation test results

NASA Astrophysics Data System (ADS)

Morgan, Rhonda; Turmon, Michael; Delacroix, Christian; Savransky, Dmitry; Garrett, Daniel; Lowrance, Patrick; Liu, Xiang Cate; Nunez, Paul

2017-09-01

EXOSIMS is an open-source simulation tool for parametric modeling of the detection yield and characterization of exoplanets. EXOSIMS has been adopted by the Exoplanet Exploration Programs Standards Definition and Evaluation Team (ExSDET) as a common mechanism for comparison of exoplanet mission concept studies. To ensure trustworthiness of the tool, we developed a validation test plan that leverages the Python-language unit-test framework, utilizes integration tests for selected module interactions, and performs end-to-end crossvalidation with other yield tools. This paper presents the test methods and results, with the physics-based tests such as photometry and integration time calculation treated in detail and the functional tests treated summarily. The test case utilized a 4m unobscured telescope with an idealized coronagraph and an exoplanet population from the IPAC radial velocity (RV) exoplanet catalog. The known RV planets were set at quadrature to allow deterministic validation of the calculation of physical parameters, such as working angle, photon counts and integration time. The observing keepout region was tested by generating plots and movies of the targets and the keepout zone over a year. Although the keepout integration test required the interpretation of a user, the test revealed problems in the L2 halo orbit and the parameterization of keepout applied to some solar system bodies, which the development team was able to address. The validation testing of EXOSIMS was performed iteratively with the developers of EXOSIMS and resulted in a more robust, stable, and trustworthy tool that the exoplanet community can use to simulate exoplanet direct-detection missions from probe class, to WFIRST, up to large mission concepts such as HabEx and LUVOIR.
Federal COBOL Compiler Testing Service Compiler Validation Request Information.

DTIC Science & Technology

1977-05-09

background of the Federal COBOL Compiler Testing Service which was set up by a memorandum of agreement between the National Bureau of Standards and the...Federal Standard, and the requirement of COBOL compiler validation in the procurement process. It also contains a list of all software products...produced by the software Development Division in support of the FCCTS as well as the Validation Summary Reports produced as a result of discharging the
Prevalence of Invalid Performance on Baseline Testing for Sport-Related Concussion by Age and Validity Indicator.

PubMed

Abeare, Christopher A; Messa, Isabelle; Zuccato, Brandon G; Merker, Bradley; Erdodi, Laszlo

2018-03-12

Estimated base rates of invalid performance on baseline testing (base rates of failure) for the management of sport-related concussion range from 6.1% to 40.0%, depending on the validity indicator used. The instability of this key measure represents a challenge in the clinical interpretation of test results that could undermine the utility of baseline testing. To determine the prevalence of invalid performance on baseline testing and to assess whether the prevalence varies as a function of age and validity indicator. This retrospective, cross-sectional study included data collected between January 1, 2012, and December 31, 2016, from a clinical referral center in the Midwestern United States. Participants included 7897 consecutively tested, equivalently proportioned male and female athletes aged 10 to 21 years, who completed baseline neurocognitive testing for the purpose of concussion management. Baseline assessment was conducted with the Immediate Postconcussion Assessment and Cognitive Testing (ImPACT), a computerized neurocognitive test designed for assessment of concussion. Base rates of failure on published ImPACT validity indicators were compared within and across age groups. Hypotheses were developed after data collection but prior to analyses. Of the 7897 study participants, 4086 (51.7%) were male, mean (SD) age was 14.71 (1.78) years, 7820 (99.0%) were primarily English speaking, and the mean (SD) educational level was 8.79 (1.68) years. The base rate of failure ranged from 6.4% to 47.6% across individual indicators. Most of the sample (55.7%) failed at least 1 of 4 validity indicators. The base rate of failure varied considerably across age groups (117 of 140 [83.6%] for those aged 10 years to 14 of 48 [29.2%] for those aged 21 years), representing a risk ratio of 2.86 (95% CI, 2.60-3.16; P < .001). 
The results for base rate of failure were surprisingly high overall and varied widely depending on the specific validity indicator and the age of the examinee. The strong age association, with 3 of 4 participants aged 10 to 12 years failing validity indicators, suggests that the clinical interpretation and utility of baseline testing in this age group is questionable. These findings underscore the need for close scrutiny of performance validity indicators on baseline testing across age groups.
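The risk ratio quoted in this abstract can be checked from the two counts it reports (117 of 140 at age 10, 14 of 48 at age 21). The sketch below reproduces only the point estimate; the abstract does not describe how its confidence interval was obtained, so no interval is computed. The function name is hypothetical.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Ratio of the event proportion in group A to that in group B."""
    return (events_a / total_a) / (events_b / total_b)


# Counts as reported in the abstract: failures among 10-year-olds vs 21-year-olds
rr = risk_ratio(117, 140, 14, 48)
```

The raw counts give a ratio of roughly 2.87, while dividing the rounded percentages (83.6% by 29.2%) gives about 2.86, which matches the published figure.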
Verification and Validation of the General Mission Analysis Tool (GMAT)

NASA Technical Reports Server (NTRS)

Hughes, Steven P.; Qureshi, Rizwan H.; Cooley, D. Steven; Parker, Joel J. K.; Grubb, Thomas G.

2014-01-01

This paper describes the processes and results of Verification and Validation (V&V) efforts for the General Mission Analysis Tool (GMAT). We describe the test program and environments, the tools used for independent test data, and comparison results. The V&V effort produced approximately 13,000 test scripts that are run as part of the nightly build-test process. In addition, we created approximately 3000 automated GUI tests that are run every two weeks. Presenting all test results is beyond the scope of a single paper. Here we present high-level test results in most areas, and detailed test results for key areas. The final product of the V&V effort presented in this paper was GMAT version R2013a, the first Gold release of the software with completely updated documentation and greatly improved quality. Release R2013a was the staging release for flight qualification performed at Goddard Space Flight Center (GSFC), ultimately resulting in GMAT version R2013b.
Crack Growth Behavior in the Threshold Region for High Cycle Fatigue Loading

NASA Technical Reports Server (NTRS)

Forman, R. G.; Zanganeh, M.

2014-01-01

This paper describes the results of a research program conducted to improve the understanding of fatigue crack growth rate behavior in the threshold growth rate region and to answer a question on the validity of threshold region test data. The validity question relates to the view held by some experimentalists that using the ASTM load shedding test method does not produce valid threshold test results and material properties. The question involves the fanning behavior observed in the threshold region of da/dN plots for some materials in which the low R-ratio data fans out from the high R-ratio data. This fanning behavior or elevation of threshold values in the low R-ratio tests is generally assumed to be caused by an increase in crack closure in the low R-ratio tests. Also, the increase in crack closure is assumed by some experimentalists to result from using the ASTM load shedding test procedure. The belief is that this procedure induces load history effects which cause remote closure from plasticity and/or roughness changes in the surface morphology. However, experimental studies performed by the authors have shown that the increase in crack closure is a result of extensive crack tip bifurcations that can occur in some materials, particularly in aluminum alloys, when the crack tip cyclic yield zone size becomes less than the grain size of the alloy. This behavior is related to the high stacking fault energy (SFE) property of aluminum alloys which results in easier slip characteristics. Therefore, the fanning behavior which occurs in aluminum alloys is a function of an intrinsic dislocation property of the alloy, and the fanned data does represent the true threshold properties of the material.
However, for the corrosion sensitive steel alloys tested in laboratory air, the occurrence of fanning results from fretting corrosion at the crack tips, and these results should not be considered to be representative of valid threshold properties because the fanning is eliminated when testing is performed in dry air.
Testing Standard Reliability Criteria

ERIC Educational Resources Information Center

Sherry, David

2017-01-01

Maul's paper, "Rethinking Traditional Methods of Survey Validation" (Andrew Maul), contains two stages. First he presents empirical results that cast doubt on traditional methods for validating psychological measurement instruments. These results motivate the second stage, a critique of current conceptions of psychological measurement…
Validation of Blockage Interference Corrections in the National Transonic Facility

NASA Technical Reports Server (NTRS)

Walker, Eric L.

2007-01-01

A validation test has recently been constructed for wall interference methods as applied to the National Transonic Facility (NTF). The goal of this study was to begin to address the uncertainty of wall-induced-blockage interference corrections, which will make it possible to address the overall quality of data generated by the facility. The validation test itself is not specific to any particular modeling. For this present effort, the Transonic Wall Interference Correction System (TWICS) as implemented at the NTF is the mathematical model being tested. TWICS uses linear, potential boundary conditions that must first be calibrated. These boundary conditions include three different classical, linear, homogeneous forms that have been historically used to approximate the physical behavior of longitudinally slotted test section walls. Results of the application of the calibrated wall boundary conditions are discussed in the context of the validation test.
Validation of a clinical critical thinking skills test in nursing.

PubMed

Shin, Sujin; Jung, Dukyoo; Kim, Sungeun

2015-01-27

The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Out of the initial 30 items, 11 items were excluded after the analysis of difficulty and discrimination parameters. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. From the above results, evidence of the response process validity was demonstrated, indicating that subjects responded as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and therefore represents a more convenient measurement of critical thinking ability.
Psychometrics of the Home Safety Self-Assessment Tool (HSSAT) to prevent falls in community-dwelling older adults.

PubMed

Tomita, Machiko R; Saharan, Sumandeep; Rajendran, Sheela; Nochajski, Susan M; Schweitzer, Jo A

2014-01-01

OBJECTIVE. To identify psychometric properties of the Home Safety Self-Assessment Tool (HSSAT) to prevent falls in community-dwelling older adults. METHOD. We tested content validity, test-retest reliability, interrater reliability, construct validity, convergent and discriminant validity, and responsiveness to change. RESULTS. The content validity index was .98, the intraclass correlation coefficient for test-retest reliability was .97, and the interrater reliability was .89. The difference on identified risk factors between the use and nonuse of the HSSAT was significant (p = .005). Convergent validity with the Centers for Disease Control and Prevention Home Safety Checklist was high (r = .65), and discriminant validity with fear of falling was very low (r = .10). The responsiveness to change was moderate (standardized response mean = 0.57). CONCLUSION. The HSSAT is a reliable and valid instrument to identify fall risks in a home environment, and the HSSAT booklet is effective as educational material leading to improvement in home safety. Copyright © 2014 by the American Occupational Therapy Association, Inc.
The paced auditory serial addition test for working memory assessment: Psychometric properties

PubMed Central

Nikravesh, Maryam; Jafari, Zahra; Mehrpour, Masoud; Kazemi, Roozbeh; Amiri Shavaki, Younes; Hossienifar, Shamim; Azizi, Mohamad Parsa

2017-01-01

Background: The paced auditory serial addition test (PASAT) was primarily developed to assess the effects of traumatic brain injury on cognitive functioning. Working memory (WM) is one of the most important aspects of cognitive function, and WM impairment is one of the clinically remarkable signs of aphasia. To develop the Persian version of PASAT, an initial version was used in individuals with aphasia (IWA). Methods: In this study, 25 individuals with aphasia (29-60 years) and 85 controls (18-60 years) were included. PASAT was presented in the form of 61 recorded single-digit numbers (1 to 9). The participants repeatedly added the 2 most recent digits. The psychometric properties of PASAT including convergent validity (using the digit memory span tasks), divergent validity (using results in the control group and IWA group), and face validity were investigated. Test-retest reliability was considered as well. Results: The relationship between the PASAT and digit memory span tests was moderate to strong in the control group (forward digit memory span test: r = 0.52, p < 0.0001; backward digit memory span test: r = 0.48, p < 0.0001). A strong relationship was found in IWA (forward digit memory span test: r = 0.72, p < 0.0001; backward digit memory span test: r = 0.53, p = 0.006). Also, strong test-retest reliability (intraclass correlation = 0.95, p < 0.0001) was observed. Conclusion: According to our results, the PASAT is a valid and reliable test to assess working memory, particularly in IWA. It could be used as a feasible tool for clinical and research applications. PMID:29445690
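The task described in this abstract (adding each new digit to the one before it) can be sketched as follows. With 61 digits there are 60 target sums. Scoring by counting correct responses is an assumption made here for illustration; the abstract does not state the scoring rule, and the function names are hypothetical.

```python
import random


def expected_sums(digits):
    """Target answers: each digit added to the digit presented just before it."""
    return [a + b for a, b in zip(digits, digits[1:])]


def score_pasat(digits, responses):
    """Number of responses that match the target sums (assumed scoring rule)."""
    targets = expected_sums(digits)
    return sum(1 for target, response in zip(targets, responses) if response == target)


# Hypothetical stimulus list: 61 single-digit numbers from 1 to 9
stimuli = [random.randint(1, 9) for _ in range(61)]
targets = expected_sums(stimuli)  # 60 target sums
```

For a short sequence such as 3, 5, 2, 9 the targets are 8, 7 and 11, so a participant answering 8, 7, 10 would score 2 under this rule.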
Recent Developments in Language Assessment and the Case of Four Large-Scale Tests of ESOL Ability

ERIC Educational Resources Information Center

Stoynoff, Stephen

2009-01-01

This review article surveys recent developments and validation activities related to four large-scale tests of L2 English ability: the iBT TOEFL, the IELTS, the FCE, and the TOEIC. In addition to describing recent changes to these tests, the paper reports on validation activities that were conducted on the measures. The results of this research…
A Monte Carlo Simulation Investigating the Validity and Reliability of Ability Estimation in Item Response Theory with Speeded Computer Adaptive Tests

ERIC Educational Resources Information Center

Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M.

2010-01-01

Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Validation of EncephalApp, Smartphone-based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy

PubMed Central

Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

2014-01-01

Background & Aims Detection of covert hepatic encephalopathy (CHE) is difficult but point of care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test–retest reliability, and external validity. Methods Patients with cirrhosis (n=167; 38% with overt HE [OHE]; mean age, 55 years; mean model for end-stage liver disease score, 12) and controls (n=114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were: OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test–retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intra-hepatic portosystemic shunt placement, before and after correction for hyponatremia, to determine external validity. Results All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cut-offs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic (AUROC) value of 0.91; the AUROC value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test–retest reliability was high (intra-class coefficient, 0.83) among 30 patients retested 1–3 months apart. OffTime+OnTime increased significantly (206 vs 255, P=.007) among 10 patients retested 33±7 days after transjugular intra-hepatic portosystemic shunt placement. 
OffTime+OnTime decreased significantly (242 vs 225, P=.03) in 7 patients tested before and after correction for hyponatremia (126±3 to 132±4 meq/L, P=.01), 10±5 days apart. Conclusions A smartphone app called EncephalApp has good face validity, test–retest reliability, and external validity for the diagnosis of CHE. PMID:24846278
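The AUROC values reported above can be read as the probability that a randomly chosen patient with CHE has a longer OffTime+OnTime than a randomly chosen patient without CHE. A minimal sketch of that calculation follows; the function name `auroc` and the timing values are hypothetical illustrations, not data or code from the study.

```python
def auroc(positives, negatives):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical OffTime+OnTime values in seconds (assumed for illustration).
che = [210, 245, 198, 260, 195]
no_che = [150, 172, 190, 165, 201]

print(auroc(che, no_che))
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of a few hundred patients; a rank-based version would be preferable for larger data sets.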
Proposal and validation of a clinical trunk control test in individuals with spinal cord injury.

PubMed

Quinzaños, J; Villa, A R; Flores, A A; Pérez, R

2014-06-01

One of the problems that arise in spinal cord injury (SCI) is alteration in trunk control. Despite the need for standardized scales, these do not exist for evaluating trunk control in SCI. To propose and validate a trunk control test in individuals with SCI. National Institute of Rehabilitation, Mexico. The test was developed and later evaluated for reliability and criteria, content, and construct validity. We carried out 531 tests on 177 patients and found high inter- and intra-rater reliability. In terms of criterion validity, analysis of variance demonstrated a statistically significant difference in the test score of patients with adequate or inadequate trunk control according to the assessment of a group of experts. A receiver operating characteristic curve was plotted for optimizing the instrument's cutoff point, which was determined at 13 points, with a sensitivity of 98% and a specificity of 92.2%. With regard to construct validity, the correlation between the proposed test and the spinal cord independence measure (SCIM) was 0.873 (P=0.001) and that with the evolution time was 0.437 (P=0.001). For testing the hypothesis with qualitative variables, the Kruskal-Wallis test was performed, which resulted in a statistically significant difference between the scores in the proposed scale of each group defined by these variables. It was proven experimentally that the proposed trunk control test is valid and reliable. Furthermore, the test can be used for all patients with SCI despite the type and level of injury.
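The sensitivity and specificity quoted for the 13-point cutoff can be sketched as a simple count over a confusion table. The example below assumes that a score at or above the cutoff is classed as adequate trunk control; the function name, that direction of the cutoff, and all scores are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) for the rule 'score >= cutoff'.

    has_condition holds the reference classification (here: adequate
    trunk control according to experts). Both groups must be non-empty.
    """
    tp = fn = tn = fp = 0
    for score, condition in zip(scores, has_condition):
        predicted = score >= cutoff
        if condition and predicted:
            tp += 1
        elif condition and not predicted:
            fn += 1
        elif not condition and predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test scores and expert classifications (assumed values).
scores = [20, 15, 13, 12, 9, 14, 7, 18]
adequate = [True, True, True, True, False, False, False, True]

sens, spec = sensitivity_specificity(scores, adequate, cutoff=13)
print(sens, spec)
```

Repeating the calculation over every candidate cutoff and plotting sensitivity against 1 minus specificity gives the ROC curve from which a cutoff is chosen.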
The Drug Abuse Screening Test preserves its excellent psychometric properties in psychiatric patients evaluated in an emergency setting.

PubMed

Giguère, Charles-Édouard; Potvin, Stéphane

2017-01-01

Substance use disorders (SUDs) are significant risk factors for psychiatric relapses and hospitalizations in psychiatric populations. Unfortunately, no instrument has been validated for the screening of SUDs in psychiatric emergency settings. The Drug Abuse Screening Test (DAST) is widely used in the addiction field, but it has not been validated in that particular context. The objective of the current study is to examine the psychometric properties of the DAST administered to psychiatric populations evaluated in an emergency setting. The DAST was administered to 912 psychiatric patients in an emergency setting, of which 119 had a SUD (excluding those misusing alcohol only). The internal consistency, the construct validity, the test-retest reliability and the predictive validity (using SUD diagnoses) of the DAST were examined. The convergent validity was also examined, using a validated impulsivity scale. Regarding the internal consistency of the DAST, the Cronbach's alpha was 0.88. The confirmatory factor analysis showed that the DAST has one underlying factor. The test-retest reliability analysis produced a correlation coefficient of 0.86. ROC curve analyses produced an area under the curve of 0.799. Interestingly, a sex effect was observed. Finally, the convergent validity analysis showed that the DAST total score is specifically correlated with the sensation seeking dimension of impulsivity. The results of this validation study show that the DAST preserves its excellent psychometric properties in psychiatric populations evaluated in an emergency setting. These results should encourage the use of the DAST in this unstable clinical situation. Copyright © 2016 Elsevier Ltd. All rights reserved.
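Cronbach's alpha, the internal-consistency statistic reported above, compares the sum of the item variances with the variance of the total score. A minimal sketch follows, assuming each respondent is a row of item scores; the function name `cronbach_alpha` and the response data are hypothetical and do not come from the DAST study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances are used throughout.
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical yes/no (1/0) answers from five respondents to four items.
responses = [
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]

print(cronbach_alpha(responses))
```

The function fails with a division by zero when every respondent has the same total score, so it needs at least some spread in the data.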
Validity and Reliability of Published Comprehensive Theory of Mind Tests for Normal Preschool Children: A Systematic Review

PubMed Central

Ziatabar Ahmadi, Seyyede Zohreh; Jalaie, Shohreh; Ashayeri, Hassan

2015-01-01

Objective: Theory of mind (ToM) or mindreading is an aspect of social cognition that evaluates mental states and beliefs of oneself and others. Validity and reliability are very important criteria when evaluating standard tests; and without them, these tests are not usable. The aim of this study was to systematically review the validity and reliability of published English comprehensive ToM tests developed for normal preschool children. Method: We searched MEDLINE (PubMed interface), Web of Science, ScienceDirect, PsycINFO, and also evidence-based medicine (The Cochrane Library) databases from 1990 to June 2015. The search strategy was Latin transcription of ‘Theory of Mind’ AND test AND children. Also, we manually studied the reference lists of all final searched articles and carried out a search of their references. Inclusion criteria were as follows: Valid and reliable diagnostic ToM tests published from 1990 to June 2015 for normal preschool children; and exclusion criteria were as follows: the studies that only used ToM tests and single tasks (false belief tasks) for ToM assessment and/or had no description about structure, validity or reliability of their tests. Methodological quality of the selected articles was assessed using the Critical Appraisal Skills Programme (CASP). Result: In primary searching, we found 1237 articles across all databases. After removing duplicates and applying all inclusion and exclusion criteria, we selected 11 tests for this systematic review. Conclusion: There were a few valid, reliable and comprehensive ToM tests for normal preschool children. However, we had limitations concerning the included articles. The defined ToM tests were different in populations, tasks, mode of presentations, scoring, mode of responses, times and other variables. Also, they had various validities and reliabilities.
Therefore, it is recommended that the researchers and clinicians select the ToM tests according to their psychometric characteristics, validity and reliability. PMID:27006666
Benchmark tests for a Formula SAE Student car prototyping

NASA Astrophysics Data System (ADS)

Mariasiu, Florin

2011-12-01

Aerodynamic characteristics of a vehicle are important elements in its design and construction. A low drag coefficient brings significant fuel savings and increased engine power efficiency. In designing and developing vehicles through computer simulation, the vehicle's aerodynamic characteristics are determined using dedicated CFD (Computational Fluid Dynamics) software packages. However, the results obtained by this faster and cheaper method are validated by experiments in wind tunnel tests, which are expensive and use complex testing equipment at relatively high cost. Therefore, the emergence and development of new low-cost testing methods to validate CFD simulation results would bring great economic benefits to the vehicle prototyping process. This paper presents the initial development process of a Formula SAE Student race-car prototype using CFD simulation and also presents a measurement system based on low-cost sensors through which the CFD simulation results were experimentally validated. The CFD software package used for simulation was SolidWorks with the FloXpress add-on, and the experimental measurement system was built using four piezoresistive force sensors of the FlexiForce type.
Valid methods: the quality assurance of test method development, validation, approval, and transfer for veterinary testing laboratories.

PubMed

Wiegers, Ann L

2003-07-01

Third-party accreditation is a valuable tool to demonstrate a laboratory's competence to conduct testing. Accreditation, internationally and in the United States, has been discussed previously. However, accreditation is only one part of establishing data credibility. A validated test method is the first component of a valid measurement system. Validation is defined as confirmation by examination and the provision of objective evidence that the particular requirements for a specific intended use are fulfilled. The international and national standard ISO/IEC 17025 recognizes the importance of validated methods and requires that laboratory-developed methods or methods adopted by the laboratory be appropriate for the intended use. Validated methods are therefore required and their use agreed to by the client (i.e., end users of the test results such as veterinarians, animal health programs, and owners). ISO/IEC 17025 also requires that the introduction of methods developed by the laboratory for its own use be a planned activity conducted by qualified personnel with adequate resources. This article discusses considerations and recommendations for the conduct of veterinary diagnostic test method development, validation, evaluation, approval, and transfer to the user laboratory in the ISO/IEC 17025 environment. These recommendations are based on those of nationally and internationally accepted standards and guidelines, as well as those of reputable and experienced technical bodies. They are also based on the author's experience in the evaluation of method development and transfer projects, validation data, and the implementation of quality management systems in the area of method development.
Field assessment of balance in 10 to 14 year old children, reproducibility and validity of the Nintendo Wii board.

PubMed

Larsen, Lisbeth Runge; Jørgensen, Martin Grønbech; Junge, Tina; Juul-Kristensen, Birgit; Wedderkopp, Niels

2014-06-10

Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children's movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Fifty-four 10-14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. Concurrent validity of NWB compared with AMTI was satisfactory. 
Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB.
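The agreement statistics named above, Bland-Altman limits of agreement and the concordance correlation coefficient, can be sketched for paired measurements from two devices as follows. The function names and the path-length values are hypothetical, and the CCC shown is Lin's form with population (divide-by-n) moments, which is an assumption about the variant the authors used.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * SD of the paired differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def concordance_ccc(a, b):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(a)
    ma, mb = mean(a), mean(b)
    va = sum((x - ma) ** 2 for x in a) / n
    vb = sum((y - mb) ** 2 for y in b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    return 2 * cov / (va + vb + (ma - mb) ** 2)


# Hypothetical centre-of-pressure path lengths in cm (assumed values).
nwb = [52.1, 47.3, 60.8, 55.0, 49.6]
amti = [50.4, 48.0, 58.9, 56.2, 47.9]

print(bland_altman(nwb, amti))
print(concordance_ccc(nwb, amti))
```

Unlike Pearson's correlation, the CCC is penalised by a systematic offset between devices, which is why a constant shift between two otherwise identical series lowers it.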

Langley 16- Ft. Transonic Tunnel Pressure Sensitive Paint System

NASA Technical Reports Server (NTRS)

Sprinkle, Danny R.; Obara, Clifford J.; Amer, Tahani R.; Leighty, Bradley D.; Carmine, Michael T.; Sealey, Bradley S.; Burkett, Cecil G.

2001-01-01

This report describes the NASA Langley 16-Ft. Transonic Tunnel Pressure Sensitive Paint (PSP) System and presents results of a test conducted June 22-23, 2000 in the tunnel to validate the PSP system. The PSP system provides global surface pressure measurements on wind tunnel models. The system was developed and installed by PSP Team personnel of the Instrumentation Systems Development Branch and the Advanced Measurement and Diagnostics Branch. A discussion of the results of the validation test follows a description of the system and a description of the test.
Creating a flipbook as a medium of instruction based on the research on activity test of kencur extract

NASA Astrophysics Data System (ADS)

Monika, Icha; Yeni, Laili Fitri; Ariyati, Eka

2016-02-01

This research aimed to reveal the validity of the flipbook as a medium of learning for the sub-material of environmental pollution in the tenth grade based on the results of the activity test of kencur (Kaempferia galanga) extract to control the growth of the Fusarium oxysporum fungus. The research consisted of two stages. First, the validity of the flipbook medium was tested through validation by seven assessors and analyzed based on the total average score of all aspects. Second, the activity of the kencur extract against the growth of Fusarium oxysporum was tested using the experimental method with 10 treatments and 3 repetitions, which were analyzed using a one-way analysis of variance (ANOVA) test. The flipbook medium was made through the stages of analysis of the potential and problems, data collection, design, validation, and revision. The validation analysis of the flipbook yielded an average score of 3.7, and the flipbook was valid to a certain extent, so it could be used in the teaching and learning process, especially in the sub-material of environmental pollution in the tenth grade of senior high school.
MEASURING SPORT-SPECIFIC PHYSICAL ABILITIES IN MALE GYMNASTS: THE MEN'S GYMNASTICS FUNCTIONAL MEASUREMENT TOOL

PubMed Central

Kenyon, Lisa K.; Elliott, James M; Cheng, M. Samuel

2016-01-01

Purpose/Background Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. Methods A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts’ USA-Gymnastics competitive level to calculate the coefficient of determination (r2). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. Results The relationship between total MGFMT scores and subjects’ current USA-Gymnastics competitive level was found to be good (r2 = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). Conclusions The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of Evidence Level 3 PMID:27999723
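Test-retest reliability of the kind reported above is often summarised with a one-way random-effects intraclass correlation. The sketch below computes ICC(1,1) from the between-subject and within-subject mean squares; it is an assumption that this single-measure form matches the "Model 1" ICC the authors used, and the function name and scores are hypothetical.

```python
def icc_oneway(ratings):
    """ICC(1,1): one-way random effects, single measurement.

    ratings is a list of subjects, each a list of k repeated scores.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW)
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    subject_means = [sum(r) / k for r in ratings]
    # Between-subject mean square.
    msb = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    msw = sum(
        (x - m) ** 2 for r, m in zip(ratings, subject_means) for x in r
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical composite scores at test and at retest one week later.
scores = [[41, 43], [55, 54], [62, 60], [48, 50], [70, 69]]

print(icc_oneway(scores))
```

Because the one-way model treats any shift between sessions as error, a constant improvement at retest lowers this ICC even when the ranking of subjects is unchanged.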
Validation of biological activity testing procedure of recombinant human interleukin-7.

PubMed

Lutsenko, T N; Kovalenko, M V; Galkin, O Yu

2017-01-01

A validation procedure for a method of monitoring the biological activity of recombinant human interleukin-7 has been developed and conducted according to the requirements of national and international recommendations. This method is based on the ability of recombinant human interleukin-7 to induce proliferation of T lymphocytes. It has been shown that peripheral blood mononuclear cells (PBMCs) derived from blood, or cell lines, can be used to control the biological activity of recombinant human interleukin-7. The validation characteristics that should be determined depend on the method, the type of product or object of test/measurement, and the biological test systems used in research. The validation procedure for the method of control of biological activity of recombinant human interleukin-7 in peripheral blood mononuclear cells showed satisfactory results on all parameters tested, such as specificity, accuracy, precision and linearity.
Development and Psychometric Testing of a Sexual Concerns Questionnaire for Kidney Transplant Recipients.

PubMed

Muehrer, Rebecca J; Lanuza, Dorothy M; Brown, Roger L; Djamali, Arjang

2015-01-01

This study describes the development and psychometric testing of the Sexual Concerns Questionnaire (SCQ) in kidney transplant (KTx) recipients. Construct validity was assessed using the Kroonenberg and Lewis exploratory/confirmatory procedure and testing hypothesized relationships with established questionnaires. Configural and weak invariance were examined across gender, dialysis history, relationship status, and transplant type. Reliability was assessed with Cronbach's alpha, composite reliability, and test-retest reliability. Factor analysis resulted in a 7-factor solution and suggests good model fit. Construct validity was also supported by the tests of hypothesized relationships. Configural and weak invariance were supported for all subgroups. Reliability of the SCQ was also supported. Findings indicate the SCQ is a valid and reliable measure of KTx recipients' sexual concerns.
Non-Nuclear Validation Test Results of a Closed Brayton Cycle Test-Loop

NASA Astrophysics Data System (ADS)

Wright, Steven A.

2007-01-01

Both NASA and DOE have programs that are investigating advanced power conversion cycles for planetary surface power on the moon or Mars, or for next generation nuclear power plants on earth. Although open Brayton cycles are in use for many applications (combined cycle power plants, aircraft engines), only a few closed Brayton cycles have been tested. Experience with closed Brayton cycles coupled to nuclear reactors is even more limited and current projections of Brayton cycle performance are based on analytic models. This report describes and compares experimental results with model predictions from a series of non-nuclear tests using a small scale closed loop Brayton cycle available at Sandia National Laboratories. A substantial amount of testing has been performed, and the information is being used to help validate models. In this report we summarize the results from three kinds of tests. These tests include: 1) test results that are useful for validating the characteristic flow curves of the turbomachinery for various gases ranging from ideal gases (Ar or Ar/He) to non-ideal gases such as CO2, 2) test results that represent shut down transients and decay heat removal capability of Brayton loops after reactor shut down, and 3) tests that map a range of operating power versus shaft speed curve and turbine inlet temperature that are useful for predicting stable operating conditions during both normal and off-normal operating behavior. These tests reveal significant interactions between the reactor and balance of plant. Specifically these results predict limited speed up behavior of the turbomachinery caused by loss of load, the conditions for stable operation, and for direct cooled reactors, the tests reveal that the coast down behavior during loss of power events can extend for hours provided the ultimate heat sink remains available.
Reliability and validity of a talent identification test battery for seated and standing Paralympic throws.

PubMed

Spathis, Jemima Grace; Connick, Mark James; Beckman, Emma Maree; Newcombe, Peter Anthony; Tweedy, Sean Michael

2015-01-01

Paralympic throwing events for athletes with physical impairments comprise seated and standing javelin, shot put, discus and seated club throwing. Identification of talented throwers would enable prediction of future success and promote participation; however, a valid and reliable talent identification battery for Paralympic throwing has not been reported. This study evaluates the reliability and validity of a talent identification battery for Paralympic throws. Participants were non-disabled so that impairment would not confound analyses, and results would provide an indication of normative performance. Twenty-eight non-disabled participants (13 M; 15 F) aged 23.6 years (±5.44) performed five kinematically distinct criterion throws (three seated, two standing) and nine talent identification tests (three anthropometric, six motor); 23 were tested a second time to evaluate test-retest reliability. Talent identification test-retest reliability was evaluated using Intra-class Correlation Coefficient (ICC) and Bland-Altman plots (Limits of Agreement). Spearman's correlation assessed strength of association between criterion throws and talent identification tests. Reliability was generally acceptable (mean ICC = 0.89), but two seated talent identification tests require more extensive familiarisation. Correlation strength (mean rs = 0.76) indicated that the talent identification tests can be used to validly identify individuals with competitively advantageous attributes for each of the five kinematically distinct throwing activities. Results facilitate further research in this understudied area.
Validity and reliability of Patient-Reported Outcomes Measurement Information System (PROMIS) Instruments in Osteoarthritis

PubMed Central

Broderick, Joan E.; Schneider, Stefan; Junghaenel, Doerte U.; Schwartz, Joseph E.; Stone, Arthur A.

2013-01-01

Objective Evaluation of known group validity, ecological validity, and test-retest reliability of four domain instruments from the Patient Reported Outcomes Measurement System (PROMIS) in osteoarthritis (OA) patients. Methods Recruitment of an osteoarthritis sample and a comparison general population (GP) through an Internet survey panel. Pain intensity, pain interference, physical functioning, and fatigue were assessed for 4 consecutive weeks with PROMIS short forms on a daily basis and compared with same-domain Computer Adaptive Test (CAT) instruments that use a 7-day recall. Known group validity (comparison of OA and GP), ecological validity (comparison of aggregated daily measures with CATs), and test-retest reliability were evaluated. Results The recruited samples matched (age, sex, race, ethnicity) the demographic characteristics of the U.S. sample for arthritis and the 2009 Census for the GP. Compliance with repeated measurements was excellent: > 95%. Known group validity for CATs was demonstrated with large effect sizes (pain intensity: 1.42, pain interference: 1.25, and fatigue: .85). Ecological validity was also established through high correlations between aggregated daily measures and weekly CATs (≥ .86). Test-retest validity (7-day) was very good (≥ .80). Conclusion PROMIS CAT instruments demonstrated known group and ecological validity in a comparison of osteoarthritis patients with a general population sample. Adequate test-retest reliability was also observed. These data provide encouraging initial data on the utility of these PROMIS instruments for clinical and research outcomes in osteoarthritis patients. PMID:23592494
Validation of Self-Report on Smoking among University Students in Korea

ERIC Educational Resources Information Center

Lee, Chung Yul; Shin, Sunmi; Lee, Hyeon Kyeong; Hong, Yoon Mi

2009-01-01

Objective: To validate the self-reported smoking status of Korean university students. Methods: Subjects included 322 university students in Korea who participated in an annual health screening. Data on smoking were collected through a self-reported questionnaire and urine test. The data were analyzed by the McNemar test. Results: In the…
Background, College Experiences, and the ACT-COMP Exam: Using Construct Validity to Evaluate Assessment Instruments.

ERIC Educational Resources Information Center

Pike, Gary R.

1989-01-01

A study investigated the appropriateness of the American College Testing Program's College Outcome Measures Program, conducted at the University of Tennessee, Knoxville, by applying the criterion of construct validity. Results indicated that while the test primarily measures individual differences, it is also sensitive to the effects of higher…
The Reliability and Validity of a Performance Task for Evaluating Science Process Skills.

ERIC Educational Resources Information Center

Adams, Cheryll M.; Callahan, Carolyn M.

1995-01-01

The Diet Cola Test was designed as a process assessment of science aptitude in intermediate grade students. Investigations of the instrument's reliability and validity indicated that data did not support use of the instrument for identifying individual students' aptitude. However, results suggested the test's appropriateness for evaluating…
Preliminary Validation of Composite Material Constitutive Characterization

Treesearch

John G. Michopoulos; Athanasios Iliopoulos; John C. Hermanson; Adrian C. Orifici; Rodney S. Thomson

2012-01-01

This paper describes the preliminary results of an effort to validate a methodology developed for composite material constitutive characterization. This methodology involves using massive amounts of data produced from multiaxially tested coupons via a 6-DoF robotic system called NRL66.3 developed at the Naval Research Laboratory. The testing is followed by...
Domestic violence on children: development and validation of an instrument to evaluate knowledge of health professionals 1

PubMed Central

Oliveira, Lanuza Borges; Soares, Fernanda Amaral; Silveira, Marise Fagundes; de Pinho, Lucinéia; Caldeira, Antônio Prates; Leite, Maísa Tavares de Souza

2016-01-01

ABSTRACT Objective: to develop and validate an instrument to evaluate the knowledge of health professionals about domestic violence on children. Method: this was a study conducted with 194 physicians, nurses and dentists. A literature review was performed for preparation of the items and identification of the dimensions. Apparent and content validation was performed using analysis of three experts and 27 professors of the pediatric health discipline. For construct validation, Cronbach's alpha was used, and the Kappa test was applied to verify reproducibility. The criterion validation was conducted using the Student's t-test. Results: the final instrument included 56 items; the Cronbach alpha was 0.734, the Kappa test showed a correlation greater than 0.6 for most items, and the Student t-test showed statistical significance at the 5% level for the two selected variables: years of education and using the Family Health Strategy. Conclusion: the instrument is valid and can be used as a promising tool to develop or direct actions in public health and evaluate knowledge about domestic violence on children. PMID:27556878
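The Kappa statistic used above for reproducibility corrects raw agreement between two administrations of an item for the agreement expected by chance. A minimal sketch of Cohen's kappa follows; it is an assumption that this two-administration form is the variant applied in the study, and the function name and answers are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of categorical answers.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    expected = sum(
        counts_first[c] * counts_second[c] for c in categories
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical answers to one item at the first and second administration.
test_answers = ["y", "y", "n", "n"]
retest_answers = ["y", "n", "n", "n"]

print(cohen_kappa(test_answers, retest_answers))
```

In this toy case three of four answers match, but half of that agreement is expected by chance, so the chance-corrected value is well below the raw 75%.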
NIH Toolbox Cognition Battery (NIHTB-CB): list sorting test to measure working memory.

PubMed

Tulsky, David S; Carlozzi, Noelle; Chiaravalloti, Nancy D; Beaumont, Jennifer L; Kisala, Pamela A; Mungas, Dan; Conway, Kevin; Gershon, Richard

2014-07-01

The List Sorting Working Memory Test was designed to assess working memory (WM) as part of the NIH Toolbox Cognition Battery. List Sorting is a sequencing task requiring children and adults to sort and sequence stimuli that are presented visually and auditorily. Validation data are presented for 268 participants ages 20 to 85 years. A subset of participants (N=89) was retested 7 to 21 days later. As expected, the List Sorting Test had moderately high correlations with other measures of working memory and executive functioning (convergent validity) but a low correlation with a test of receptive vocabulary (discriminant validity). Furthermore, List Sorting demonstrates expected changes over the age span and has excellent test-retest reliability. Collectively, these results provide initial support for the construct validity of the List Sorting Working Memory Measure as a measure of working memory. However, the relationship between the List Sorting Test and general executive function has yet to be determined.
NIH Toolbox Cognition Battery (NIHTB-CB): The List Sorting Test to Measure Working Memory

PubMed Central

Tulsky, David S.; Carlozzi, Noelle; Chiaravalloti, Nancy D.; Beaumont, Jennifer L.; Kisala, Pamela A.; Mungas, Dan; Conway, Kevin; Gershon, Richard

2015-01-01

The List Sorting Working Memory Test was designed to assess working memory (WM) as part of the NIH Toolbox Cognition Battery. List Sorting is a sequencing task requiring children and adults to sort and sequence stimuli that are presented visually and auditorily. Validation data are presented for 268 participants ages 20 to 85 years. A subset of participants (N=89) was retested 7 to 21 days later. As expected, the List Sorting Test had moderately high correlations with other measures of working memory and executive functioning (convergent validity) but a low correlation with a test of receptive vocabulary (discriminant validity). Furthermore, List Sorting demonstrates expected changes over the age span and has excellent test-retest reliability. Collectively, these results provide initial support for the construct validity of the List Sorting Working Memory Measure as a measure of working memory. However, the relation between the List Sorting Test and general executive function has yet to be determined. PMID:24959983
Validation of the Narrowing Beam Walking Test in Lower Limb Prosthesis Users.

PubMed

Sawers, Andrew; Hafner, Brian

2018-04-11

To evaluate the content, construct, and discriminant validity of the Narrowing Beam Walking Test (NBWT), a performance-based balance test for lower limb prosthesis users. Cross-sectional study. Research laboratory and prosthetics clinic. Unilateral transtibial and transfemoral prosthesis users (N=40). Not applicable. Content validity was examined by quantifying the percentage of participants receiving maximum or minimum scores (ie, ceiling and floor effects). Convergent construct validity was examined using correlations between participants' NBWT scores and scores or times on existing clinical balance tests regularly administered to lower limb prosthesis users. Known-groups construct validity was examined by comparing NBWT scores between groups of participants with different fall histories, amputation levels, amputation etiologies, and functional levels. Discriminant validity was evaluated by analyzing the area under each test's receiver operating characteristic (ROC) curve. No minimum or maximum scores were recorded on the NBWT. NBWT scores demonstrated strong correlations (ρ=.70‒.85) with scores/times on performance-based balance tests (timed Up and Go test, Four Square Step Test, and Berg Balance Scale) and a moderate correlation (ρ=.49) with the self-report Activities-specific Balance Confidence scale. NBWT performance was significantly lower among participants with a history of falls (P=.003), transfemoral amputation (P=.011), and a lower mobility level (P<.001). The NBWT also had the largest area under the ROC curve (.81) and was the only test to exhibit an area that was statistically significantly >.50 (ie, chance). The results provide strong evidence of content, construct, and discriminant validity for the NBWT as a performance-based test of balance ability. The evidence supports its use to assess balance impairments and fall risk in unilateral transtibial and transfemoral prosthesis users. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
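
As an illustration of the ROC analysis described in this record, the area under the ROC curve can be read as the probability that a randomly chosen member of one group scores higher than a randomly chosen member of the other, with ties counted as one half. The following is a minimal sketch of that calculation; the function name and the example scores are hypothetical and are not data from the study.

```python
def roc_auc(scores_high_group, scores_low_group):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (high, low) pairs in which the member of the
    group expected to score higher actually does, counting ties as 0.5.
    A value of 0.5 corresponds to chance-level discrimination.
    """
    wins = 0.0
    for h in scores_high_group:
        for l in scores_low_group:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(scores_high_group) * len(scores_low_group))


# Hypothetical balance scores (higher = better balance); not study data.
non_fallers = [0.62, 0.55, 0.71, 0.48, 0.66]
fallers = [0.35, 0.50, 0.41, 0.58]
print(f"AUC = {roc_auc(non_fallers, fallers):.2f}")
```

The pairwise form is quadratic in the sample size, which is acceptable for a cohort of a few dozen participants; a rank-based formula would be the usual choice for larger samples.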
Development of the Patient Education Materials Assessment Tool (PEMAT): A new measure of understandability and actionability for print and audiovisual patient information

PubMed Central

Shoemaker, Sarah J.; Wolf, Michael S.; Brach, Cindy

2016-01-01

Objective To develop a reliable and valid instrument to assess the understandability and actionability of print and audiovisual materials. Methods We compiled items from existing instruments/guides that the expert panel assessed for face/content validity. We completed four rounds of reliability testing, and produced evidence of construct validity with consumers and readability assessments. Results The experts deemed the PEMAT items face/content valid. Four rounds of reliability testing and refinement were conducted using raters untrained on the PEMAT. Agreement improved across rounds. The final PEMAT showed moderate agreement per Kappa (Average K = 0.57) and strong agreement per Gwet’s AC1 (Average = 0.74). Internal consistency was strong (α = 0.71; Average Item-Total Correlation = 0.62). For construct validation with consumers (n = 47), we found significant differences between actionable and poorly-actionable materials in comprehension scores (76% vs. 63%, p < 0.05) and ratings (8.9 vs. 7.7, p < 0.05). For understandability, there was a significant difference for only one of two topics on consumer numeric scores. For actionability, there were significant positive correlations between PEMAT scores and consumer-testing results, but no relationship for understandability. There were, however, strong, negative correlations between grade-level and both consumer-testing results and PEMAT scores. Conclusions The PEMAT demonstrated strong internal consistency, reliability, and evidence of construct validity. Practice implications The PEMAT can help professionals judge the quality of materials (available at: http://www.ahrq.gov/pemat). PMID:24973195
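
The kappa statistic reported in this record corrects raw rater agreement for the agreement expected by chance. The sketch below computes Cohen's kappa for two raters; it is an illustration only, the function name and ratings are hypothetical, and the record does not say how its average kappa across raters and items was computed.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected from each rater's marginal
    frequencies. Undefined (division by zero) when p_e equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical agree (1) / disagree (0) ratings on eight items.
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0]
rater_2 = [1, 0, 0, 1, 0, 1, 1, 1]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")
```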
Evaluating construct validity of the second version of the Copenhagen Psychosocial Questionnaire through analysis of differential item functioning and differential item effect.

PubMed

Bjorner, Jakob Bue; Pejtersen, Jan Hyld

2010-02-01

To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE). We used a Danish general population postal survey (n = 4,732 with 3,517 wage earners) with a one-year register based follow up for long-term sickness absence. DIF was evaluated against age, gender, education, social class, public/private sector employment, and job type using ordinal logistic regression. DIE was evaluated against job satisfaction and self-rated health (using ordinal logistic regression), against depressive symptoms, burnout, and stress (using multiple linear regression), and against long-term sick leave (using a proportional hazards model). We used a cross-validation approach to counter the risk of significant results due to multiple testing. Out of 1,052 tests, we found 599 significant instances of DIF/DIE, 69 of which showed both practical and statistical significance across two independent samples. Most DIF occurred for job type (in 20 cases), while we found little DIF for age, gender, education, social class and sector. DIE seemed to pertain to particular items, which showed DIE in the same direction for several outcome variables. The results allowed a preliminary identification of items that have a positive impact on construct validity and items that have a negative impact on construct validity. These results can be used to develop better short-form measures and to improve the conceptual framework, items and scales of the COPSOQ II. We conclude that tests of DIF and DIE are useful for evaluating construct validity.
Validity of the MCAT in Predicting Performance in the First Two Years of Medical School.

ERIC Educational Resources Information Center

Jones, Robert F.; Thomae-Forgues, Maria

1984-01-01

The first systematic summary of predictive validity research on the new Medical College Admission Test (MCAT) is presented. The results show that MCAT scores have significant predictive validity with respect to first- and second-year medical school course grades. Further directions for MCAT validity research are described. (Author/MLW)
Implementation of the validation testing in MPPG 5.a "Commissioning and QA of treatment planning dose calculations-megavoltage photon and electron beams".

PubMed

Jacqmin, Dustin J; Bredfeldt, Jeremy S; Frigo, Sean P; Smilowitz, Jennifer B

2017-01-01

The AAPM Medical Physics Practice Guideline (MPPG) 5.a provides concise guidance on the commissioning and QA of beam modeling and dose calculation in radiotherapy treatment planning systems. This work discusses the implementation of the validation testing recommended in MPPG 5.a at two institutions. The two institutions worked collaboratively to create a common set of treatment fields and analysis tools to deliver and analyze the validation tests. This included the development of a novel, open-source software tool to compare scanning water tank measurements to 3D DICOM-RT Dose distributions. Dose calculation algorithms in both Pinnacle and Eclipse were tested with MPPG 5.a to validate the modeling of Varian TrueBeam linear accelerators. The validation process resulted in more than 200 water tank scans and more than 50 point measurements per institution, each of which was compared to a dose calculation from the institution's treatment planning system (TPS). Overall, the validation testing recommended in MPPG 5.a took approximately 79 person-hours for a machine with four photon and five electron energies for a single TPS. Of the 79 person-hours, 26 person-hours required time on the machine, and the remainder involved preparation and analysis. The basic photon, electron, and heterogeneity correction tests were evaluated with the tolerances in MPPG 5.a, and the tolerances were met for all tests. The MPPG 5.a evaluation criteria were used to assess the small field and IMRT/VMAT validation tests. Both institutions found the use of MPPG 5.a to be a valuable resource during the commissioning process. The validation testing in MPPG 5.a showed the strengths and limitations of the TPS models. In addition, the data collected during the validation testing is useful for routine QA of the TPS, validation of software upgrades, and commissioning of new algorithms. © 2016 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.

Improving machine learning reproducibility in genetic association studies with proportional instance cross validation (PICV).

PubMed

Piette, Elizabeth R; Moore, Jason H

2018-01-01

Machine learning methods and conventions are increasingly employed for the analysis of large, complex biomedical data sets, including genome-wide association studies (GWAS). Reproducibility of machine learning analyses of GWAS can be hampered by biological and statistical factors, particularly so for the investigation of non-additive genetic interactions. Application of traditional cross validation to a GWAS data set may result in poor consistency between the training and testing data set splits due to an imbalance of the interaction genotypes relative to the data as a whole. We propose a new cross validation method, proportional instance cross validation (PICV), that preserves the original distribution of an independent variable when splitting the data set into training and testing partitions. We apply PICV to simulated GWAS data with epistatic interactions of varying minor allele frequencies and prevalences and compare performance to that of a traditional cross validation procedure in which individuals are randomly allocated to training and testing partitions. Sensitivity and positive predictive value are significantly improved across all tested scenarios for PICV compared to traditional cross validation. We also apply PICV to GWAS data from a study of primary open-angle glaucoma to investigate a previously-reported interaction, which fails to significantly replicate; PICV however improves the consistency of testing and training results. Application of traditional machine learning procedures to biomedical data may require modifications to better suit intrinsic characteristics of the data, such as the potential for highly imbalanced genotype distributions in the case of epistasis detection. The reproducibility of genetic interaction findings can be improved by considering this variable imbalance in cross validation implementation, such as with PICV. This approach may be extended to problems in other domains in which imbalanced variable distributions are a concern.
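
The idea of preserving the distribution of one variable across data partitions can be sketched as a stratified fold assignment: group the instances by the value of the chosen variable, shuffle within each group, and deal the members out to the folds in turn. This is a generic illustration under that assumption and is not the authors' PICV implementation; the function name, the genotype coding, and the seed parameter are hypothetical.

```python
import random
from collections import defaultdict


def proportional_folds(values, k, seed=0):
    """Assign instance indices to k folds, balancing one variable.

    `values[i]` is the value of the variable to balance for instance i
    (for example, a genotype class). Instances sharing a value are
    shuffled and dealt round-robin, so each fold receives a near-equal
    share of every value.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for index, value in enumerate(values):
        groups[value].append(index)

    folds = [[] for _ in range(k)]
    offset = 0  # continue dealing where the previous group stopped
    for value in sorted(groups):
        members = groups[value]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[(offset + position) % k].append(index)
        offset += len(members)
    return folds


# Hypothetical data: 16 instances of a common class, 4 of a rare class.
genotype_class = [0] * 16 + [1] * 4
for fold in proportional_folds(genotype_class, k=4):
    rare = sum(genotype_class[i] == 1 for i in fold)
    print(f"fold size {len(fold)}, rare-class instances {rare}")
```

With purely random allocation, a rare class of four instances can easily land entirely in one or two folds; dealing within groups avoids that by construction.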
Real-Time Sensor Validation, Signal Reconstruction, and Feature Detection for an RLV Propulsion Testbed

NASA Technical Reports Server (NTRS)

Jankovsky, Amy L.; Fulton, Christopher E.; Binder, Michael P.; Maul, William A., III; Meyer, Claudia M.

1998-01-01

A real-time system for validating sensor health has been developed in support of the reusable launch vehicle program. This system was designed for use in a propulsion testbed as part of an overall effort to improve the safety, diagnostic capability, and cost of operation of the testbed. The sensor validation system was designed and developed at the NASA Lewis Research Center and integrated into a propulsion checkout and control system as part of an industry-NASA partnership, led by Rockwell International for the Marshall Space Flight Center. The system includes modules for sensor validation, signal reconstruction, and feature detection and was designed to maximize portability to other applications. Review of test data from initial integration testing verified real-time operation and showed the system to perform correctly on both hard and soft sensor failure test cases. This paper discusses the design of the sensor validation and supporting modules developed at LeRC and reviews results obtained from initial test cases.
The validation of science virtual test to assess 7th grade students’ critical thinking on matter and heat topic (SVT-MH)

NASA Astrophysics Data System (ADS)

Sya’bandari, Y.; Firman, H.; Rusyati, L.

2018-05-01

This descriptive study profiles the validation of the SVT-MH, a science virtual test for measuring junior high school students’ critical thinking on the matter and heat topic. The subjects were 7th grade junior high school students (13 years old), with validation performed by science teacher and expert reviewers. The instruments used to obtain the data were an expert judgment rubric (content, media, education) and a readability test rubric. The SVT-MH was validated in four steps: analysis of core competence and basic competence based on Curriculum 2013, expert judgment (content, media, education), a readability test, and trial tests (a limited and a larger trial). The validation resulted in 30 items representing 8 elements and 21 sub-elements for measuring students’ critical thinking, based on Inch, on the matter and heat topic. Cronbach’s alpha (α) is 0.642, which means that the instrument is sufficient to measure students’ critical thinking on the matter and heat topic.
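
Cronbach's alpha, the reliability coefficient quoted in this record, compares the sum of the individual item variances with the variance of the respondents' total scores. The sketch below shows the standard formula; the function name and the scores are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a set of test items.

    `item_scores` is a list with one entry per item; each entry lists
    that item's scores for the same respondents in the same order.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals))
    """
    k = len(item_scores)
    totals = [sum(per_respondent) for per_respondent in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical right (1) / wrong (0) scores: 3 items, 6 respondents.
scores = [
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 0, 1],
]
print(f"alpha = {cronbach_alpha(scores):.3f}")
```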
Development and validation of climate change system thinking instrument (CCSTI) for measuring system thinking on climate change content

NASA Astrophysics Data System (ADS)

Meilinda; Rustaman, N. Y.; Firman, H.; Tjasyono, B.

2018-05-01

The Climate Change System Thinking Instrument (CCSTI) was developed to measure system thinking ability in the context of climate change. The CCSTI was developed in four phases: instrument draft development, followed by validation and evaluation consisting of a readable material test, expert validation, and a field test. The field test results were analyzed using Cronbach’s alpha. The draft instrument was tested on 80 randomly selected college students majoring in Biology Education, Physics Education, and Chemistry Education. The Content Validation Index was 0.86, which means that the CCSTI is categorized as very appropriate with respect to the question indicators, and Cronbach’s alpha was about 0.605, which falls in the undesirable to minimally acceptable range. Of the 45 system thinking questions, 37 were valid, spread across four indicators of system thinking: system thinking phase I (pre-requirement), system thinking phase II (basic), system thinking phase III (intermediate), and system thinking phase IV (coherent expert).
Validity and test-retest reliability in assessing current body size with figure drawings in Chinese adolescents.

PubMed

Lo, Wing-Sze; Ho, Sai-Yin; Wong, Bonny Yee-Man; Mak, Kwok-Kei; Lam, Tai-Hing

2011-06-01

The reliability and validity of Stunkard's Figure Rating Scale (FRS) as a measure of current body size (CBS) was established in Western adolescent girls but not in non-Western population. We examined the validity and test-retest reliability of Stunkard's FRS in assessing CBS among Chinese adolescents. Methods. In a school-based survey in Hong Kong, 5666 adolescents (boys: 45.1%; mean age 14.7 years) provided data on self-reported height and weight, CBS, perceived weight status, and health-related quality of life using the Medical Outcomes Study Short-Form version 2 (SF-12v2). Height and weight were also objectively measured. Spearman's correlation was used to assess construct validity, concurrent validity and test-retest reliability. Convergent and discriminant validity were good: CBS correlated strongly with weight and self-reported/measured BMI, but only weakly with SF-12v2. CBS correlated strongly with perceived weight status, showing concurrent validity. Spearman's correlation (r) for CBS was 0.78 for girls and 0.72 for boys indicating good test-retest reliability. Validity and reliability results did not differ significantly between senior and junior grade adolescents. Our findings support the use of Stunkard's FRS to measure body size among Chinese adolescents.
Validation of the Spanish Addiction Severity Index Multimedia Version (S-ASI-MV).

PubMed

Butler, Stephen F; Redondo, José Pedro; Fernandez, Kathrine C; Villapiano, Albert

2009-01-01

This study aimed to develop and test the reliability and validity of a Spanish adaptation of the ASI-MV, a computer administered version of the Addiction Severity Index, called the S-ASI-MV. Participants were 185 native Spanish-speaking adult clients from substance abuse treatment facilities serving Spanish-speaking clients in Florida, New Mexico, California, and Puerto Rico. Participants were administered the S-ASI-MV as well as Spanish versions of the general health subscale of the SF-36, the work and family unit subscales of the Social Adjustment Scale Self-Report, the Michigan Alcohol Screening Test, the alcohol and drug subscales of the Personality Assessment Inventory, and the Hopkins Symptom Checklist-90. Three-to-five-day test-retest reliability was examined along with criterion validity, convergent/discriminant validity, and factorial validity. Measurement invariance between the English and Spanish versions of the ASI-MV was also examined. The S-ASI-MV demonstrated good test-retest reliability (ICCs for composite scores between .59 and .93), criterion validity (rs for composite scores between .66 and .87), and convergent/discriminant validity. Factorial validity and measurement invariance were demonstrated. These results compared favorably with those reported for the original interviewer version of the ASI and the English version of the ASI-MV.
Developing Statistical Physics Course Handout on Distribution Function Materials Based on Science, Technology, Engineering, and Mathematics

NASA Astrophysics Data System (ADS)

Riandry, M. A.; Ismet, I.; Akhsan, H.

2017-09-01

This study aims to produce a valid and practical statistical physics course handout on distribution function materials based on STEM. The Rowntree development model was used to produce this handout. The model consists of three stages: planning, development, and evaluation. In this study, the evaluation stage used Tessmer formative evaluation, which consists of 5 stages: self-evaluation, expert review, one-to-one evaluation, small group evaluation, and field test. However, testing of the handout was limited to the validity and practicality aspects, so the field test stage was not implemented. Data were collected through walkthroughs and questionnaires. The subjects of this study were 6th and 8th semester students of the Physics Education Study Program of Sriwijaya University in academic year 2016/2017. The average result of the expert review was 87.31% (very valid category). The one-to-one evaluation yielded an average result of 89.42%, and the small group evaluation a result of 85.92%. Across the one-to-one and small group evaluation stages, the average student response to the handout was 87.67% (very practical category). Based on these results, it can be concluded that the handout is valid and practical.
Validation of NHB 8060.1C, Test 18 Arc Tracking, September 30, 1991

NASA Technical Reports Server (NTRS)

Linley, Larry

2005-01-01

A test project was conducted to validate Test 18 of NASA Handbook (NHB) 8060.1C and, if necessary, identify and recommend improvements in the procedures or criteria of the test. The NHB 8060.1C, Test 18 test system was modified to produce better discrimination of test results. Changes, and their effects on test results, in the graphite immersion-depth, test timing sequence, and atmospheric conditions were investigated for the wire-insulation constructions tested. Based on the test results, the graphite immersion-depths (between 0.8 mm and 1.6 mm), the timing sequence, and the change in the test conditions from ambient to three environments common in manned spaceflight did not significantly affect test results. The criteria used in Test 18 of NHB 8060.1C were found to be appropriate for qualifying arc-tracking and arc-propagation characteristics of wire-insulation materials. Using the Test 18 criteria, Kapton and ETFE were considered inappropriate for use, while PTFE was considered appropriate. Recommendations from this test project for Test 18 of NHB 8060.1C include changing the experimental setup and configurational tests and performing qualification testing in air rather than in the three environments common in manned spaceflight.
Virtual test: A student-centered software to measure student's critical thinking on human disease

NASA Astrophysics Data System (ADS)

Rusyati, Lilit; Firman, Harry

2016-02-01

The study "Virtual Test: A Student-Centered Software to Measure Student's Critical Thinking on Human Disease" is descriptive research. Its background is the importance of computer-based tests that use the elements and sub-elements of critical thinking. The aim of the study is to develop multiple-choice items for measuring critical thinking, delivered through student-centered software. The instruments used to collect data were (1) a construct validity sheet completed by expert judges (lecturer and medical doctor) and a professional judge (science teacher); and (2) a test legibility sheet completed by science teachers and junior high school students. Participants consisted of a science teacher, lecturer, and medical doctor as validators, and students as respondents. The study describes the characteristics of the virtual test used to measure students' critical thinking on human disease and analyzes the results of the legibility test by students and science teachers, the expert judgment by science teachers and medical doctor, and the trial of the virtual test at junior high school. In general, the analysis showed that the multiple-choice items for measuring critical thinking were built from the eight elements and 26 sub-elements developed by Inch et al., were completed with relevant information, and had validity and reliability of more than "enough". Furthermore, the specific characteristics of the multiple-choice items are information in the form of science comics, tables, figures, articles, and videos; correct language structure; cited sources; and questions that guide students to think critically and logically.
Validation of the Social Appearance Anxiety Scale: factor, convergent, and divergent validity.

PubMed

Levinson, Cheri A; Rodebaugh, Thomas L

2011-09-01

The Social Appearance Anxiety Scale (SAAS) was created to assess fear of overall appearance evaluation. Initial psychometric work indicated that the measure had a single-factor structure and exhibited excellent internal consistency, test-retest reliability, and convergent validity. In the current study, the authors further examined the factor, convergent, and divergent validity of the SAAS in two samples of undergraduates. In Study 1 (N = 323), the authors tested the factor structure, convergent, and divergent validity of the SAAS with measures of the Big Five personality traits, negative affect, fear of negative evaluation, and social interaction anxiety. In Study 2 (N = 118), participants completed a body evaluation that included measurements of height, weight, and body fat content. The SAAS exhibited excellent convergent and divergent validity with self-report measures (i.e., self-esteem, trait anxiety, ethnic identity, and sympathy), predicted state anxiety experienced during the body evaluation, and predicted body fat content. In both studies, results confirmed a single-factor structure as the best fit to the data. These results lend additional support for the use of the SAAS as a valid measure of social appearance anxiety.
Performance Tested Method multiple laboratory validation study of ELISA-based assays for the detection of peanuts in food.

PubMed

Park, Douglas L; Coates, Scott; Brewer, Vickery A; Garber, Eric A E; Abouzied, Mohamed; Johnson, Kurt; Ritter, Bruce; McKenzie, Deborah

2005-01-01

Performance Tested Method multiple laboratory validations for the detection of peanut protein in 4 different food matrixes were conducted under the auspices of the AOAC Research Institute. In this blind study, 3 commercially available ELISA test kits were validated: Neogen Veratox for Peanut, R-Biopharm RIDASCREEN FAST Peanut, and Tepnel BioKits for Peanut Assay. The food matrixes used were breakfast cereal, cookies, ice cream, and milk chocolate spiked at 0 and 5 ppm peanut. Analyses of the samples were conducted by laboratories representing industry and international and U.S. governmental agencies. All 3 commercial test kits successfully identified spiked and peanut-free samples. The validation study required 60 analyses on test samples at the target level 5 µg peanut/g food and 60 analyses at a peanut-free level, which was designed to ensure that the lower 95% confidence limit for the sensitivity and specificity would not be <90%. The probability that a test sample contains an allergen given a prevalence rate of 5% and a positive test result using a single test kit analysis with 95% sensitivity and 95% specificity, which was demonstrated for these test kits, would be 50%. When 2 test kits are run simultaneously on all samples, the probability becomes 95%. It is therefore recommended that all field samples be analyzed with at least 2 of the validated kits.
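
The 50% and 95% probabilities quoted in this record follow from Bayes' rule. The sketch below reproduces them under one stated assumption that the record does not spell out: the two kits err independently of each other given the true status of the sample, and both return a positive result. The function name is hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, specificity, n_kits=1):
    """Probability that a sample truly contains the allergen when all
    n_kits independent test kits return a positive result.

    P(allergen | all positive) = TP / (TP + FP), with
    TP = prevalence * sensitivity ** n_kits
    FP = (1 - prevalence) * (1 - specificity) ** n_kits
    """
    true_positive = prevalence * sensitivity**n_kits
    false_positive = (1 - prevalence) * (1 - specificity) ** n_kits
    return true_positive / (true_positive + false_positive)


# Values quoted in the abstract: 5% prevalence, 95% sensitivity and specificity.
one_kit = positive_predictive_value(0.05, 0.95, 0.95, n_kits=1)
two_kits = positive_predictive_value(0.05, 0.95, 0.95, n_kits=2)
print(f"one kit:  {one_kit:.2f}")   # approximately 0.50
print(f"two kits: {two_kits:.2f}")  # approximately 0.95
```

With one kit, the true positives (0.05 × 0.95) equal the false positives (0.95 × 0.05), which gives one half; requiring a second positive shrinks the false positives by a further factor of 0.05 while the true positives shrink only by 0.95.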
Validation of new psychosocial factors questionnaires: a Colombian national study.

PubMed

Villalobos, Gloria H; Vargas, Angélica M; Rondón, Martin A; Felknor, Sarah A

2013-01-01

The study of workers' health problems possibly associated with stressful conditions requires valid and reliable tools for monitoring risk factors. The present study validates two questionnaires to assess psychosocial risk factors for stress-related illnesses within a sample of Colombian workers. The validation process was based on a representative sample survey of 2,360 Colombian employees, aged 18-70 years. Worker response rate was 90%; 46% of the responders were women. Internal consistency was calculated, construct validity was tested with factor analysis and concurrent validity was tested with Spearman correlations. The questionnaires demonstrated adequate reliability (0.88-0.95). Factor analysis confirmed the dimensions proposed in the measurement model. Concurrent validity resulted in significant correlations with stress and health symptoms. "Work and Non-work Psychosocial Factors Questionnaires" were found to be valid and reliable for the assessment of workers' psychosocial factors, and they provide information for research and intervention. Copyright © 2012 Wiley Periodicals, Inc.
Computer simulation of Cerebral Arteriovenous Malformation-validation analysis of hemodynamics parameters.

PubMed

Kumar, Y Kiran; Mehta, Shashi Bhushan; Ramachandra, Manjunath

2017-01-01

The purpose of this work is to provide some validation methods for evaluating the hemodynamic assessment of Cerebral Arteriovenous Malformation (CAVM). This article emphasizes the importance of validating noninvasive measurements for CAVM patients, which are designed using lumped models for complex vessel structure. The validation of the hemodynamics assessment is based on invasive clinical measurements and cross-validation techniques with the Philips proprietary validated software packages Qflow and 2D Perfusion. The modeling results are validated for 30 CAVM patients at 150 vessel locations. Mean flow, diameter, and pressure were compared between the modeling results and the clinical/cross-validation measurements, using an independent two-tailed Student t test. Exponential regression analysis was used to assess the relationship between blood flow, vessel diameter, and pressure. Univariate analyses of the relationships between vessel diameter, vessel cross-sectional area, AVM volume, AVM pressure, and AVM flow were performed with linear or exponential regression. Modeling results were compared with clinical measurements from vessel locations of cerebral regions. The model is also cross-validated with the Philips proprietary validated software packages Qflow and 2D Perfusion. Our results show that the modeling results and clinical results nearly match, with a small deviation. In this article, we have validated our modeling results with clinical measurements. A new approach to cross-validation is proposed, demonstrating the accuracy of our results against a validated product in a clinical environment.
The bogus taste test: Validity as a measure of laboratory food intake.

PubMed

Robinson, Eric; Haynes, Ashleigh; Hardman, Charlotte A; Kemps, Eva; Higgs, Suzanne; Jones, Andrew

2017-09-01

Because overconsumption of food contributes to ill health, understanding what affects how much people eat is of importance. The 'bogus' taste test is a measure widely used in eating behaviour research to identify factors that may have a causal effect on food intake. However, there has been no examination of the validity of the bogus taste test as a measure of food intake. We conducted a participant level analysis of 31 published laboratory studies that used the taste test to measure food intake. We assessed whether the taste test was sensitive to experimental manipulations hypothesized to increase or decrease food intake. We examined construct validity by testing whether participant sex, hunger and liking of taste test food were associated with the amount of food consumed in the taste test. In addition, we examined whether BMI (body mass index), trait measures of dietary restraint and over-eating in response to palatable food cues were associated with food consumption. Results indicated that the taste test was sensitive to experimental manipulations hypothesized to increase or decrease food intake. Factors that were reliably associated with increased consumption during the taste test were being male, having higher baseline hunger, liking of the taste test food and a greater tendency to overeat in response to palatable food cues, whereas trait dietary restraint and BMI were not. These results indicate that the bogus taste test is likely to be a valid measure of food intake and can be used to identify factors that have a causal effect on food intake. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Measuring the Sensitivity and Construct Validity of 6 Utility Instruments in 7 Disease Areas.

PubMed

Richardson, Jeff; Iezzi, Angelo; Khan, Munir A; Chen, Gang; Maxwell, Aimee

2016-02-01

Health services that affect quality of life (QoL) are increasingly evaluated using cost utility analyses (CUA). These commonly employ one of a small number of multiattribute utility instruments (MAUI) to assess the effects of the health service on utility. However, the MAUI differ significantly, and the choice of instrument may alter the outcome of an evaluation. The present article has 2 objectives: 1) to compare the results of 3 measures of the sensitivity of 6 MAUI and the results of 6 tests of construct validity in 7 disease areas and 2) to rank the MAUI by each of the test results in each disease area and by an overall composite index constructed from the tests. Patients and the general public were administered a battery of instruments, which included the 6 MAUI, disease-specific QoL instruments (DSI), and 6 other comparator instruments. In each disease area, instrument sensitivity was measured 3 ways: by the unadjusted mean difference in utility between public and patient groups, by the value of the effect size, and by the correlation between MAUI and DSI scores. Content and convergent validity were tested by comparison of MAUI utilities and scores from the 6 comparator instruments. These included 2 measures of health state preferences, measures of subjective well-being and capabilities, and generic measures of physical and mental QoL derived from the SF-36. The apparent sensitivity of instruments varied significantly with the measurement method and by disease area. Validation test results varied with the comparator instruments. Notwithstanding this variability, the 15D, AQoL-8D, and the SF-6D generally achieved better test results than the QWB and EQ-5D-5L. © The Author(s) 2015.
Radiant Energy Measurements from a Scaled Jet Engine Axisymmetric Exhaust Nozzle for a Baseline Code Validation Case

NASA Technical Reports Server (NTRS)

Baumeister, Joseph F.

1994-01-01

A non-flowing, electrically heated test rig was developed to verify computer codes that calculate radiant energy propagation from nozzle geometries that represent aircraft propulsion nozzle systems. Since there are a variety of analysis tools used to evaluate thermal radiation propagation from partially enclosed nozzle surfaces, an experimental benchmark test case was developed for code comparison. This paper briefly describes the nozzle test rig and the developed analytical nozzle geometry used to compare the experimental and predicted thermal radiation results. A major objective of this effort was to make available the experimental results and the analytical model in a format to facilitate conversion to existing computer code formats. For code validation purposes this nozzle geometry represents one validation case for one set of analysis conditions. Since each computer code has advantages and disadvantages based on scope, requirements, and desired accuracy, the usefulness of this single nozzle baseline validation case can be limited for some code comparisons.
The predictive validity of the BioMedical Admissions Test for pre-clinical examination performance.

PubMed

Emery, Joanne L; Bell, John F

2009-06-01

Some medical courses in the UK have many more applicants than places and almost all applicants have the highest possible previous and predicted examination grades. The BioMedical Admissions Test (BMAT) was designed to assist in the student selection process specifically for a number of 'traditional' medical courses with clear pre-clinical and clinical phases and a strong focus on science teaching in the early years. It is intended to supplement the information provided by examination results, interviews and personal statements. This paper reports on the predictive validity of the BMAT and its predecessor, the Medical and Veterinary Admissions Test. Results from the earliest 4 years of the test (2000-2003) were matched to the pre-clinical examination results of those accepted onto the medical course at the University of Cambridge. Correlation and logistic regression analyses were performed for each cohort. Section 2 of the test ('Scientific Knowledge') correlated more strongly with examination marks than did Section 1 ('Aptitude and Skills'). It also had a stronger relationship with the probability of achieving the highest examination class. The BMAT and its predecessor demonstrate predictive validity for the pre-clinical years of the medical course at the University of Cambridge. The test identifies important differences in skills and knowledge between candidates, not shown by their previous attainment, which predict their examination performance. It is thus a valid source of additional admissions information for medical courses with a strong scientific emphasis when previous attainment is very high.
Validity and reliability of a self-report instrument to assess social support and physical environmental correlates of physical activity in adolescents

PubMed Central

2012-01-01

Background: The purpose of this study was to examine the internal consistency, test-retest reliability, construct validity and predictive validity of a new German self-report instrument to assess the influence of social support and the physical environment on physical activity in adolescents. Methods: Based on theoretical consideration, the short scales on social support and physical environment were developed and cross-validated in two independent study samples of 9 to 17 year-old girls and boys. The longitudinal sample of Study I (n = 196) was recruited from a German comprehensive school, and subjects in this study completed the questionnaire twice with a between-test interval of seven days. Cronbach’s alphas were computed to determine the internal consistency of the factors. Test-retest reliability of the latent factors was assessed using intra-class coefficients. Factorial validity of the scales was assessed using principal components analysis. Construct validity was determined using a cross-validation technique by performing confirmatory factor analysis with the independent nationwide cross-sectional sample of Study II (n = 430). Correlations between factors and three measures of physical activity (objectively measured moderate-to-vigorous physical activity (MVPA), self-reported habitual MVPA and self-reported recent MVPA) were calculated to determine the predictive validity of the instrument. Results: Construct validity of the social support scale (two factors: parental support and peer support) and the physical environment scale (four factors: convenience, public recreation facilities, safety and private sport providers) was shown. Both scales had moderate test-retest reliability. The factors of the social support scale also had good internal consistency and predictive validity. Internal consistency and predictive validity of the physical environment scale were low to acceptable.
Conclusions: The results of this study indicate moderate to good reliability and construct validity of the social support scale and physical environment scale. Predictive validity was only confirmed for the social support scale but not for the physical environment scale. Hence, it remains unclear if a person’s physical environment has a direct or an indirect effect on physical activity behavior or a moderation function. PMID:22928865
Development and Validation of a Mobile Device-based External Ventricular Drain Simulator.

PubMed

Morone, Peter J; Bekelis, Kimon; Root, Brandon K; Singer, Robert J

2017-10-01

Multiple external ventricular drain (EVD) simulators have been created, yet their cost, bulky size, and nonreusable components limit their accessibility to residency programs. To create and validate an animated EVD simulator that is accessible on a mobile device. We developed a mobile-based EVD simulator that is compatible with iOS (Apple Inc., Cupertino, California) and Android-based devices (Google, Mountain View, California) and can be downloaded from the Apple App and Google Play Store. Our simulator consists of a learn mode, which teaches users the procedure, and a test mode, which assesses users' procedural knowledge. Twenty-eight participants, who were divided into expert and novice categories, completed the simulator in test mode and answered a postmodule survey. This was graded using a 5-point Likert scale, with 5 representing the highest score. Using the survey results, we assessed the module's face and content validity, whereas construct validity was evaluated by comparing the expert and novice test scores. Participants rated individual survey questions pertaining to face and content validity a median score of 4 out of 5. When comparing test scores, generated by the participants completing the test mode, the experts scored higher than the novices (mean, 71.5; 95% confidence interval, 69.2 to 73.8 vs mean, 48; 95% confidence interval, 44.2 to 51.6; P < .001). We created a mobile-based EVD simulator that is inexpensive, reusable, and accessible. Our results demonstrate that this simulator is face, content, and construct valid. Copyright © 2017 by the Congress of Neurological Surgeons
Differential validity of the Defense Mechanism Manual for the TAT between Asian Americans and Whites. Thematic Apperception Test.

PubMed

Hibbard, S; Tang, P C; Latko, R; Park, J H; Munn, S; Bolz, S; Somerville, A

2000-12-01

Thematic Apperception Test (Murray, 1943) responses of 69 Asian American (hereafter, Asian) and 83 White students were coded for defenses according to the Defense Mechanism Manual (Cramer, 1991b) and studied for differential validity in predicting paper-and-pencil measures of relevant constructs. Three tests for differential validity were used: (a) differences between validity coefficients, (b) interactions between predictor and ethnicity in criterion prediction, and (c) differences between groups in mean prediction errors using a common regression equation. Modest differential validity was found. It was surprising that the DMM scales were slightly stronger predictors of their criteria among Asians than among Whites and when a common predictor was used, desirable criteria were overpredicted for Asians, whereas undesirable ones were overpredicted for Whites. The results were not affected by acculturation level or English vocabulary among the Asians.

Validation testing of shallow notched round-bar screening test specimens. [for the space shuttle main engine]

NASA Technical Reports Server (NTRS)

Vroman, G. A.

1975-01-01

The capability of shallow-notched, round-bar, tensile specimens for screening critical environments as they affect the material fracture properties of the space shuttle main engine was tested and analyzed. Specimens containing a 0.050-inch-deep circumferential sharp notch were cyclically loaded in a 5000-psi hydrogen environment at temperatures of +70 and -15 F. Replication of test results and a marked change in cyclic life because of temperature variation demonstrated the validity of the specimen type to be utilized for screening tests.
Estimation of AUC or Partial AUC under Test-Result-Dependent Sampling.

PubMed

Wang, Xiaofei; Ma, Junling; George, Stephen; Zhou, Haibo

2012-01-01

The area under the ROC curve (AUC) and partial area under the ROC curve (pAUC) are summary measures used to assess the accuracy of a biomarker in discriminating true disease status. The standard sampling approach used in biomarker validation studies is often inefficient and costly, especially when ascertaining the true disease status is costly and invasive. To improve efficiency and reduce the cost of biomarker validation studies, we consider a test-result-dependent sampling (TDS) scheme, in which subject selection for determining the disease state is dependent on the result of a biomarker assay. We first estimate the test-result distribution using data arising from the TDS design. With the estimated empirical test-result distribution, we propose consistent nonparametric estimators for AUC and pAUC and establish the asymptotic properties of the proposed estimators. Simulation studies show that the proposed estimators have good finite sample properties and that the TDS design yields more efficient AUC and pAUC estimates than a simple random sampling (SRS) design. A data example based on an ongoing cancer clinical trial is provided to illustrate the TDS design and the proposed estimators. This work can find broad applications in design and analysis of biomarker validation studies.
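As a point of reference for the summary measure discussed in this abstract, the empirical AUC under ordinary simple random sampling can be sketched as the Mann-Whitney proportion of (diseased, non-diseased) pairs that the biomarker ranks correctly. This is the standard textbook estimator, not the TDS-adjusted estimator the authors propose; the function name and the marker values below are illustrative assumptions, not data from the study.

```python
from itertools import product


def empirical_auc(diseased, healthy):
    """Mann-Whitney estimate of AUC: P(marker_diseased > marker_healthy),
    counting ties as half a correctly ordered pair."""
    if not diseased or not healthy:
        raise ValueError("both groups must be non-empty")
    score = 0.0
    for d, h in product(diseased, healthy):
        if d > h:
            score += 1.0   # pair ordered correctly
        elif d == h:
            score += 0.5   # tie counts as half
    return score / (len(diseased) * len(healthy))


# Hypothetical biomarker values, invented purely for illustration.
diseased = [2.1, 3.4, 1.9, 4.0]
healthy = [1.0, 2.1, 1.5]
auc = empirical_auc(diseased, healthy)
print(f"empirical AUC = {auc:.3f}")
```

Under a test-result-dependent design the verified subjects are not a simple random sample, so this unweighted estimator would generally need the kind of correction the abstract describes.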
Factorial validity and measurement equivalence of the Client Assessment of Treatment Scale for psychiatric inpatient care - a study in three European countries.

PubMed

Richardson, Michelle; Katsakou, Christina; Torres-González, Francisco; Onchev, George; Kallert, Thomas; Priebe, Stefan

2011-06-30

Patients' views of inpatient care need to be assessed for research and routine evaluation. For this a valid instrument is required. The Client Assessment of Treatment Scale (CAT) has been used in large scale international studies, but its psychometric properties have not been well established. The structural validity of the CAT was tested among involuntary inpatients with psychosis. Data from locations in three separate European countries (England, Spain and Bulgaria) were collected. The factorial validity was initially tested using single sample confirmatory factor analyses in each country. Subsequent multi-sample analyses were used to test for invariance of the factor loadings, and factor variances across the countries. Results provide good initial support for the factorial validity and invariance of the CAT scores. Future research is needed to cross-validate these findings and to generalise them to other countries, treatment settings, and patient populations. Copyright © 2011 Elsevier Ltd. All rights reserved.
EUCLID/NISP GRISM qualification model AIT/AIV campaign: optical, mechanical, thermal and vibration tests

NASA Astrophysics Data System (ADS)

Caillat, A.; Costille, A.; Pascal, S.; Rossin, C.; Vives, S.; Foulon, B.; Sanchez, P.

2017-09-01

Dark matter and dark energy mysteries will be explored by the Euclid ESA M-class space mission which will be launched in 2020. Millions of galaxies will be surveyed through visible imagery and NIR imagery and spectroscopy in order to map in three dimensions the Universe at different evolution stages over the past 10 billion years. The massive NIR spectroscopic survey will be done efficiently by the NISP instrument thanks to the use of grisms (for "Grating pRISMs") developed under the responsibility of the LAM. In this paper, we present the verification philosophy applied to test and validate each grism before the delivery to the project. The test sequence covers a large set of verifications: optical tests to validate efficiency and WFE of the component, mechanical tests to validate the robustness to vibration, thermal tests to validate its behavior in cryogenic environment and a complete metrology of the assembled component. We show the test results obtained on the first grism Engineering and Qualification Model (EQM) which will be delivered to the NISP project in fall 2016.
Reliability and validity of the Children's Fear Survey Schedule-Dental Subscale for Arabic-speaking children: a cross-sectional study.

PubMed

El-Housseiny, Azza A; Alsadat, Farah A; Alamoudi, Najlaa M; El Derwi, Douaa A; Farsi, Najat M; Attar, Moaz H; Andijani, Basil M

2016-04-14

Early recognition of dental fear is essential for the effective delivery of dental care. This study aimed to test the reliability and validity of the Arabic version of the Children's Fear Survey Schedule-Dental Subscale (CFSS-DS). A school-based sample of 1546 children was randomly recruited. The Arabic version of the CFSS-DS was completed by children during class time. The scale was tested for internal consistency and test-retest reliability. To test criterion validity, children's behavior was assessed using the Frankl scale during dental examination, and results were compared with children's CFSS-DS scores. To test the scale's construct validity, scores on "fear of going to the dentist soon" were correlated with CFSS-DS scores. Factor analysis was also used. The Arabic version of the CFSS-DS showed high reliability regarding both test-retest reliability (intraclass correlation = 0.83, p < 0.001) and internal consistency (Cronbach's α = 0.88). It showed good criterion validity: children with negative behavior had significantly higher fear scores (t = 13.67, p < 0.001). It also showed moderate construct validity (Spearman's rho correlation, r = 0.53, p < 0.001). Factor analysis identified the following factors: "fear of invasive dental procedures," "fear of less invasive dental procedures" and "fear of strangers." The Arabic version of the CFSS-DS is a reliable and valid measure of dental fear in Arabic-speaking children. Pediatric dentists and researchers may use this validated version of the CFSS-DS to measure dental fear in Arabic-speaking children.
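Internal consistency figures such as the Cronbach's α reported above follow from a short formula: α = k/(k-1) × (1 - Σ item variances / variance of total scores), where k is the number of items. The sketch below is a minimal illustration of that formula; the function name and the tiny response matrix are hypothetical and unrelated to the CFSS-DS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for items and for total scores.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must answer every item")
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers from four respondents to a three-item scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values close to 1 indicate that the items move together across respondents; the formula is undefined when every respondent has the same total score.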
A Concurrent Test of Accuracy-of-Classification for the Strong Vocational Interest and Kuder Occupational Interest Survey

ERIC Educational Resources Information Center

Zytowski, Donald G.

1972-01-01

Owing to the uncertainty concerning the concurrent validity of the SVIB and the KOIS, a test of accuracy of classification of men in the occupations common to both inventories was undertaken. The results suggest that neither show any less validity than had been shown in separate studies previously. (Author)
Authentication of Electromagnetic Interference Removal in Johnson Noise Thermometry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Britton Jr, Charles L.; Roberts, Michael

This report summarizes the testing performed offsite at the TVA Kingston Fossil Plant (KFP). This location is selected as a valid offsite test facility because the environment is very similar to the expected industrial nuclear power plant environment. This report will discuss the EMI discovered in the environment, the removal technique validity, and results from the measurements.
Automated Vision Test Development and Validation

DTIC Science & Technology

2016-11-01

This report is published in the interest of...produce software for desktop displays; and to evaluate features such as user interfaces, threshold algorithms, validity of results, and screening...cost of performing full threshold testing on over 30% of normal subjects, which is quite time consuming. This effort was accomplished using desktop
The Impact of Model Parameterization and Estimation Methods on Tests of Measurement Invariance with Ordered Polytomous Data

ERIC Educational Resources Information Center

Koziol, Natalie A.; Bovaird, James A.

2018-01-01

Evaluations of measurement invariance provide essential construct validity evidence--a prerequisite for seeking meaning in psychological and educational research and ensuring fair testing procedures in high-stakes settings. However, the quality of such evidence is partly dependent on the validity of the resulting statistical conclusions. Type I or…
Reliability and validity of Kano Test for Social Nicotine Dependence (KTSND), and development of its revised scale assessing the psychosocial acceptability of smoking among university students.

PubMed

Kitada, Masako; Musashi, Manabu; Kano, Masato

2011-08-01

To examine the reliability and validity of the Kano Test for Social Nicotine Dependence (KTSND), a scale assessing the psychosocial acceptability of smoking, and to develop a new version if the validity or reliability of the KTSND was not acceptable. We carried out a self-administered cross-sectional survey of undergraduate university students. The participants completed the KTSND, supplemented by three questions on attitudes toward tobacco control policies and smoking status. Among daily smokers, we examined the relationship between the KTSND and the Fagerström Test for Nicotine Dependence (FTND). In each study, we examined test-retest reliability and construct validity, discriminant and convergent validity, and factor validity. Although the KTSND had high internal consistency (Cronbach's alpha 0.82) and high test-retest reliability (r=0.72), the results of factor analysis were unacceptable: we expected three factors to be extracted, but only two, "Overestimate of smoking usefulness" and "Allege smoking as a taste and/or culture", were extracted. Using the Kano's Test for Assessing Acceptability of Smoking (KTAAS), a new version of the KTSND in which one question was replaced with another, a third factor, "Neglect of harm of tobacco smoking", was extracted in addition to the two above. The KTAAS also had both high internal consistency (Cronbach's alpha 0.82) and test-retest reliability (r=0.66). Overall, KTSND and KTAAS scores differed according to smoking status, and nonsmokers' scores were the lowest. The KTSND was a popular questionnaire in Japan; however, its validity as assessed by factor analysis was not acceptable, whereas the KTAAS had sufficient reliability and validity and might assess cognitions and attitudes affirming or accepting tobacco smoking among university students.
Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project.

PubMed

Singh, Amika S; Vik, Froydis N; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Verloigne, Maïté; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; Martens, Marloes; Brug, Johannes

2011-12-09

Insight into children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items.
Validation of a laboratory and hospital information system in a medical laboratory accredited according to ISO 15189

PubMed Central

Biljak, Vanja Radisic; Ozvald, Ivan; Radeljak, Andrea; Majdenic, Kresimir; Lasic, Branka; Siftar, Zoran; Lovrencic, Marijana Vucic; Flegar-Mestric, Zlata

2012-01-01

Introduction: The aim of the study was to present a protocol for laboratory information system (LIS) and hospital information system (HIS) validation at the Institute of Clinical Chemistry and Laboratory Medicine of the Merkur University Hospital, Zagreb, Croatia. Materials and methods: Validity of data traceability was checked by entering all test requests for a virtual patient into HIS/LIS and printing corresponding barcoded labels that provided laboratory analyzers with the information on requested tests. The original printouts of the test results from laboratory analyzer(s) were compared with the data obtained from LIS and entered into the provided template. Transfer of data from LIS to HIS was examined by requesting all tests in HIS and creating real data in a finding generated in LIS. Data obtained from LIS and HIS were entered into a corresponding template. The main outcome measure was the accuracy of data transfer from laboratory analyzers to LIS and from LIS to HIS, expressed as a percentage (%). Results: The accuracy of data transfer from laboratory analyzers to LIS was 99.5% and of that from LIS to HIS 100%. Conclusion: We presented our established validation protocol for laboratory information system and demonstrated that the system meets its intended purpose. PMID:22384522
Ares I Scale Model Acoustic Test Liftoff Acoustic Results and Comparisons

NASA Technical Reports Server (NTRS)

Counter, Doug; Houston, Janice

2011-01-01

Conclusions: Ares I-X flight data validated the ASMAT LOA results. Ares I Liftoff acoustic environments were verified with scale model test results. Results showed that data book environments were under-conservative for Frustum (Zone 5). Recommendations: Data book environments can be updated with scale model test and flight data. Subscale acoustic model testing useful for future vehicle environment assessments.
Convergent and diagnostic validity of STAVUX, a word and pseudoword spelling test for adults.

PubMed

Östberg, Per; Backlund, Charlotte; Lindström, Emma

2016-10-01

Few comprehensive spelling tests are available in Swedish, and none have been validated in adults with reading and writing disorders. The recently developed STAVUX test includes word and pseudoword spelling subtests with high internal consistency and adult norms stratified by education. This study evaluated the convergent and diagnostic validity of STAVUX in adults with dyslexia. Forty-six adults, 23 with dyslexia and 23 controls, took STAVUX together with a standard word-decoding test and a self-rated measure of spelling skills. STAVUX subtest scores showed moderate to strong correlations with word-decoding scores and predicted self-rated spelling skills. Word and pseudoword subtest scores both predicted dyslexia status. Receiver-operating characteristic (ROC) analysis showed excellent diagnostic discriminability. Sensitivity was 91% and specificity 96%. In conclusion, the results of this study support the convergent and diagnostic validity of STAVUX.
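The sensitivity and specificity reported above can be related back to the group sizes with a small worked calculation. With 23 participants per group, the only whole-number counts that round to 91% and 96% are 21 of 23 and 22 of 23; these counts are inferred here from the rounded percentages and are not stated in the abstract. The helper names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Share of people with the condition whom the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of people without the condition whom the test clears."""
    return true_neg / (true_neg + false_pos)


# Inferred counts (assumption): 21 of 23 adults with dyslexia detected,
# 22 of 23 controls correctly classified.
sens = sensitivity(true_pos=21, false_neg=2)
spec = specificity(true_neg=22, false_pos=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```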
Development of Decision Support Formulas for the Prediction of Bladder Outlet Obstruction and Prostatic Surgery in Patients With Lower Urinary Tract Symptom/Benign Prostatic Hyperplasia: Part II, External Validation and Usability Testing of a Smartphone App.

PubMed

Choo, Min Soo; Jeong, Seong Jin; Cho, Sung Yong; Yoo, Changwon; Jeong, Chang Wook; Ku, Ja Hyeon; Oh, Seung-June

2017-04-01

We aimed to externally validate the prediction model we developed for having bladder outlet obstruction (BOO) and requiring prostatic surgery using 2 independent data sets from tertiary referral centers, and also aimed to validate a mobile app for using this model through usability testing. Formulas and nomograms predicting whether a subject has BOO and needs prostatic surgery were validated with an external validation cohort from Seoul National University Bundang Hospital and Seoul Metropolitan Government-Seoul National University Boramae Medical Center between January 2004 and April 2015. A smartphone-based app was developed, and 8 young urologists were enrolled for usability testing to identify any human factor issues of the app. A total of 642 patients were included in the external validation cohort. No significant differences were found in the baseline characteristics of major parameters between the original (n=1,179) and the external validation cohort, except for the maximal flow rate. Predictions of requiring prostatic surgery in the validation cohort showed a sensitivity of 80.6%, a specificity of 73.2%, a positive predictive value of 49.7%, a negative predictive value of 92.0%, and an area under the receiver operating characteristic curve of 0.84. The calibration plot indicated that the predictions have good correspondence. The decision curve also showed a high net benefit. Similar evaluation results using the external validation cohort were seen in the predictions of having BOO. Overall results of the usability test demonstrated that the app was user-friendly with no major human factor issues. External validation of this newly developed prediction model demonstrated a moderate level of discrimination, adequate calibration, and high net benefit gains for predicting both having BOO and requiring prostatic surgery. In addition, the smartphone app implementing the prediction model was user-friendly with no major human factor issues.
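The four accuracy figures reported for prostatic surgery are linked through the proportion of patients who actually required surgery, which the abstract does not state. As a rough consistency check, the sketch below back-solves that proportion from sensitivity, specificity, and positive predictive value using Bayes' rule, then recomputes the negative predictive value. The function names are illustrative, and the implied proportion (about 25%) is a derived assumption, not a reported result.

```python
def implied_prevalence(sens, spec, ppv):
    """Solve PPV = sens*p / (sens*p + (1 - spec)*(1 - p)) for prevalence p."""
    false_pos_rate = 1 - spec
    return ppv * false_pos_rate / (sens * (1 - ppv) + ppv * false_pos_rate)


def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    tp = sens * prevalence
    fp = (1 - spec) * (1 - prevalence)
    tn = spec * (1 - prevalence)
    fn = (1 - sens) * prevalence
    return tp / (tp + fp), tn / (tn + fn)


# Figures quoted in the abstract for predicting prostatic surgery.
sens, spec, ppv_reported = 0.806, 0.732, 0.497
prev = implied_prevalence(sens, spec, ppv_reported)
ppv, npv = predictive_values(sens, spec, prev)
print(f"implied prevalence = {prev:.1%}, PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

The recomputed negative predictive value agrees with the reported 92.0% to within rounding, which suggests the four figures are mutually consistent.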
Development of Modal Test Techniques for Validation of a Solar Sail Design

NASA Technical Reports Server (NTRS)

Gaspar, James L.; Mann, Troy; Behun, Vaughn; Wilkie, W. Keats; Pappa, Richard

2004-01-01

This paper focuses on the development of modal test techniques for validation of a solar sail gossamer space structure design. The major focus is on validating and comparing the capabilities of various excitation techniques for modal testing solar sail components. One triangular shaped quadrant of a solar sail membrane was tested in a 1 Torr vacuum environment using various excitation techniques, including magnetic excitation and surface-bonded piezoelectric patch actuators. Results from modal tests performed on the sail using piezoelectric patches at different positions are discussed. The excitation methods were evaluated for their applicability to in-vacuum ground testing and to the development of on-orbit flight test techniques. The solar sail membrane was tested in the horizontal configuration at various tension levels to assess the variation in frequency with tension in a vacuum environment. A segment of a solar sail mast prototype was also tested in ambient atmospheric conditions using various excitation techniques, and these methods are also assessed for their ground test capabilities and on-orbit flight testing.
Developing the Persian version of the homophone meaning generation test

PubMed Central

Ebrahimipour, Mona; Motamed, Mohammad Reza; Ashayeri, Hassan; Modarresi, Yahya; Kamali, Mohammad

2016-01-01

Background: Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian version of the HMGT. Methods: The first phase involved the adaptation of the HMGT to the Persian language. The second phase concerned the psychometric testing. The word-finding performance was assessed in 90 Persian-speaking healthy individuals (20-50 years old; 45 males and 45 females) through three naming tasks: Semantic Fluency, Phonemic Fluency, and Homophone Meaning Generation Test. The participants had no history of neurological or psychiatric diseases, alcohol abuse, severe depression, or history of speech, language, or learning problems. Results: The internal consistency coefficient was larger than 0.8 for all the items with a total Cronbach’s alpha of 0.80. Interrater and intrarater reliability were also excellent. The validity of all items was above 0.77, and the content validity index (0.99) was appropriate. The Persian HMGT had strong convergent validity with semantic and phonemic switching and adequate divergent validity with semantic and phonemic clustering. Conclusion: The Persian version of the Homophone Meaning Generation Test is an appropriate, valid, and reliable test to evaluate the ability to switch between verbal concepts in the assessment of word-finding performance. PMID:27390705
Validation of the OECD reproduction test guideline with the New Zealand mudsnail Potamopyrgus antipodarum using trenbolone and prochloraz.

PubMed

Geiß, Cornelia; Ruppert, Katharina; Askem, Clare; Barroso, Carlos; Faber, Daniel; Ducrot, Virginie; Holbech, Henrik; Hutchinson, Thomas H; Kajankari, Paula; Kinnberg, Karin Lund; Lagadic, Laurent; Matthiessen, Peter; Morris, Steve; Neiman, Maurine; Penttinen, Olli-Pekka; Sanchez-Marin, Paula; Teigeler, Matthias; Weltje, Lennart; Oehlmann, Jörg

2017-04-01

The Organisation for Economic Cooperation and Development (OECD) provides several standard test methods for the environmental hazard assessment of chemicals, mainly based on primary producers, arthropods, and fish. In April 2016, two new test guidelines with two mollusc species representing different reproductive strategies were approved by OECD member countries. One test guideline describes a 28-day reproduction test with the parthenogenetic New Zealand mudsnail Potamopyrgus antipodarum. The main endpoint of the test is reproduction, reflected by the embryo number in the brood pouch per female. The development of a new OECD test guideline involves several phases including inter-laboratory validation studies to demonstrate the robustness of the proposed test design and the reproducibility of the test results. Therefore, a ring test of the reproduction test with P. antipodarum was conducted including eight laboratories with the test substances trenbolone and prochloraz and results are presented here. Most laboratories could meet test validity criteria, thus demonstrating the robustness of the proposed test protocol. Trenbolone did not have an effect on the reproduction of the snails at the tested concentration range (nominal: 10-1000 ng/L). For prochloraz, laboratories produced similar EC 10 and NOEC values, showing the inter-laboratory reproducibility of results. The average EC 10 and NOEC values for reproduction (with coefficient of variation) were 26.2 µg/L (61.7%) and 29.7 µg/L (32.9%), respectively. This ring test shows that the mudsnail reproduction test is a well-suited tool for use in the chronic aquatic hazard and risk assessment of chemicals.
Development of an Agility Test for Badminton Players and Assessment of Its Validity and Test-Retest Reliability.

PubMed

Loureiro, Luiz de França Bahia; de Freitas, Paulo Barbosa

2016-04-01

Badminton requires open and fast actions toward the shuttlecock, but there is no specific agility test for badminton players with specific movements. To develop an agility test that simultaneously assesses perception and motor capacity and examine the test's concurrent and construct validity and its test-retest reliability. The Badcamp agility test consists of running as fast as possible to 6 targets placed on the corners and middle points of a rectangular area (5.6 × 4.2 m) from the start position located in the center of it, following visual stimuli presented in a luminous panel. The authors recruited 43 badminton players (17-32 y old) to evaluate concurrent (with shuttle-run agility test--SRAT) and construct validity and test-retest reliability. Results revealed that Badcamp presents concurrent and construct validity, as its performance is strongly related to SRAT (ρ = 0.83, P < .001), with performance of experts being better than nonexpert players (P < .01). In addition, Badcamp is reliable, as no difference (P = .07) and a high intraclass correlation (ICC = .93) were found in the performance of the players on 2 different occasions. The findings indicate that Badcamp is an effective, valid, and reliable tool to measure agility, allowing coaches and athletic trainers to evaluate players' athletic condition and training effectiveness and possibly detect talented individuals in this sport.
Validating the Adolescent Form of the Substance Abuse Subtle Screening Inventory.

ERIC Educational Resources Information Center

Risberg, Richard A.; And Others

1995-01-01

Tests validity of the Substance Abuse Subtle Screening Inventory (SASSI) in detecting chemical dependency in adolescents (n=107), when compared to the Minnesota Multiphasic Personality Inventory (MMPI) results. Further validation for the SASSI was obtained. Treatment implications and suggestions for further research are provided. (SNR)

Comprehension of Written Grammar Test: Reliability and Known-Groups Validity Study With Hearing and Deaf and Hard-of-Hearing Students.

PubMed

Cannon, Joanna E; Hubley, Anita M; Millhoff, Courtney; Mazlouman, Shahla

2016-01-01

The aim of the current study was to gather validation evidence for the Comprehension of Written Grammar (CWG; Easterbrooks, 2010) receptive test of 26 grammatical structures of English print for use with children who are deaf and hard of hearing (DHH). Reliability and validity data were collected for 98 participants (49 DHH and 49 hearing) in Grades 2-6. The objectives were to: (a) examine 4-week test-retest reliability data; and (b) provide evidence of known-groups validity by examining expected differences between the groups on the CWG vocabulary pretest and main test, as well as selected structures. Results indicated excellent test-retest reliability estimates for CWG test scores. DHH participants performed statistically significantly lower on the CWG vocabulary pretest and main test than the hearing participants. Significantly lower performance by DHH participants on most expected grammatical structures (e.g., basic sentence patterns, auxiliary "be" singular/plural forms, tense, comparatives, and complementation) also provided known groups evidence. Overall, the findings of this study showed strong evidence of the reliability of scores and known group-based validity of inferences made from the CWG. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Development and validation of an energy-balance knowledge test for fourth- and fifth-grade students.

PubMed

Chen, Senlin; Zhu, Xihe; Kang, Minsoo

2017-05-01

A valid test measuring children's energy-balance (EB) knowledge is lacking in research. This study developed and validated the energy-balance knowledge test (EBKT) for fourth and fifth grade students. The original EBKT contained 25 items but was reduced to 23 items based on pilot results and intensive expert panel discussion. De-identified data were collected from 468 fourth and fifth grade students enrolled in four schools to examine the psychometric properties of the EBKT items. The Rasch model analysis was conducted using the Winsteps 3.65.0 software. Differential item functioning (DIF) analysis flagged 1 item (item #4) functioning differently between boys and girls, which was deleted. The final 22-item EBKT showed desirable model-data fit indices. The item difficulties varied widely, ranging from -3.58 logit (item #10, the easiest) to 1.70 logit (item #3, the hardest). The average person ability on the test was 0.28 logit (SD = .78). Additional analyses supported known-group difference validity of the EBKT scores in capturing gender- and grade-based ability differences. The test was overall valid but could be further improved by expanding test items to discern various ability levels. For lack of a better test, researchers and practitioners may use the EBKT to assess fourth- and fifth-grade students' EB knowledge.
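To make the logit values above more concrete, the sketch below converts them into response probabilities. It assumes the dichotomous Rasch model (items scored right or wrong), which the abstract does not state explicitly; under that model the probability of a correct answer is the logistic function of person ability minus item difficulty. The function and constant names are illustrative.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Values in logits, taken from the abstract.
MEAN_ABILITY = 0.28   # average person ability
EASIEST_ITEM = -3.58  # item #10
HARDEST_ITEM = 1.70   # item #3

p_easy = rasch_probability(MEAN_ABILITY, EASIEST_ITEM)
p_hard = rasch_probability(MEAN_ABILITY, HARDEST_ITEM)
print(f"easiest item: {p_easy:.2f}, hardest item: {p_hard:.2f}")
```

Under this assumption, a student of average ability would answer the easiest item correctly about 98% of the time and the hardest item about 19% of the time.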
A critical analysis of test-retest reliability in instrument validation studies of cancer patients under palliative care: a systematic review

PubMed Central

2014-01-01

Background Patient-reported outcome validation needs to achieve validity and reliability standards. Among reliability analysis parameters, test-retest reliability is an important psychometric property. Retested patients must be in a clinically stable condition. This is particularly problematic in palliative care (PC) settings because advanced cancer patients are prone to a faster rate of clinical deterioration. The aim of this study was to evaluate the methods by which multi-symptom and health-related quality of life (HRQoL) instruments based on patient-reported outcomes (PROs) have been validated in oncological PC settings with regard to test-retest reliability. Methods A systematic search of PubMed (1966 to June 2013), EMBASE (1980 to June 2013), PsychInfo (1806 to June 2013), CINAHL (1980 to June 2013), and SCIELO (1998 to June 2013), and specific PRO databases was performed. Studies were included if they described a set of validation studies for an instrument developed to measure multi-symptom or multidimensional HRQoL in advanced cancer patients under PC. The COSMIN checklist was used to rate the methodological quality of the study designs. Results We identified 89 validation studies from 746 potentially relevant articles. From those 89 articles, 31 measured test-retest reliability and were included in this review. Upon critical analysis of the overall quality of the criteria used to determine the test-retest reliability, 6 (19.4%), 17 (54.8%), and 8 (25.8%) of these articles were rated as good, fair, or poor, respectively, and no article was classified as excellent. Multi-symptom instruments were retested over a shortened interval when compared to the HRQoL instruments (median values 24 hours and 168 hours, respectively; p = 0.001).
Validation studies that included objective confirmation of clinical stability in their design yielded better results for the test-retest analysis with regard to both pain and global HRQoL scores (p < 0.05). The quality of the statistical analysis and its description were of great concern. Conclusion Test-retest reliability has been infrequently and poorly evaluated. The confirmation of clinical stability was an important factor in our analysis, and we suggest that special attention be focused on clinical stability when designing a PRO validation study that includes advanced cancer patients under PC. PMID:24447633
Validation of Milliflex® Quantum for Bioburden Testing of Pharmaceutical Products.

PubMed

Gordon, Oliver; Goverde, Marcel; Staerk, Alexandra; Roesti, David

2017-01-01

This article reports the validation strategy used to demonstrate that the Milliflex ® Quantum yielded non-inferior results to the traditional bioburden method. It was validated according to USP <1223>, European Pharmacopoeia 5.1.6, and Parenteral Drug Association Technical Report No. 33 and comprised the validation parameters robustness, ruggedness, repeatability, specificity, limit of detection and quantification, accuracy, precision, linearity, range, and equivalence in routine operation. For the validation, a combination of pharmacopeial ATCC strains as well as a broad selection of in-house isolates were used. In-house isolates were used in stressed state. Results were statistically evaluated regarding the pharmacopeial acceptance criterion of ≥70% recovery compared to the traditional method. Post-hoc test power calculations verified the appropriateness of the used sample size to detect such a difference. Furthermore, equivalence tests verified non-inferiority of the rapid method as compared to the traditional method. In conclusion, the rapid bioburden on basis of the Milliflex ® Quantum was successfully validated as alternative method to the traditional bioburden test. LAY ABSTRACT: Pharmaceutical drug products must fulfill specified quality criteria regarding their microbial content in order to ensure patient safety. Drugs that are delivered into the body via injection, infusion, or implantation must be sterile (i.e., devoid of living microorganisms). Bioburden testing measures the levels of microbes present in the bulk solution of a drug before sterilization, and thus it provides important information for manufacturing a safe product. In general, bioburden testing has to be performed using the methods described in the pharmacopoeias (membrane filtration or plate count). These methods are well established and validated regarding their effectiveness; however, the incubation time required to visually identify microbial colonies is long. 
Thus, alternative methods that detect microbial contamination faster will improve control over the manufacturing process and speed up product release. Before alternative methods may be used, they must undergo a side-by-side comparison with pharmacopeial methods. In this comparison, referred to as validation, it must be shown in a statistically verified manner that the effectiveness of the alternative method is at least equivalent to that of the pharmacopeial methods. Here we describe the successful validation of an alternative bioburden testing method based on fluorescent staining of growing microorganisms applying the Milliflex ® Quantum system by MilliporeSigma. © PDA, Inc. 2017.
Brazilian Center for the Validation of Alternative Methods (BraCVAM) and the process of validation in Brazil.

PubMed

Presgrave, Octavio; Moura, Wlamir; Caldeira, Cristiane; Pereira, Elisabete; Bôas, Maria H Villas; Eskes, Chantra

2016-03-01

The need for the creation of a Brazilian centre for the validation of alternative methods was recognised in 2008, and members of academia, industry and existing international validation centres immediately engaged with the idea. In 2012, co-operation between the Oswaldo Cruz Foundation (FIOCRUZ) and the Brazilian Health Surveillance Agency (ANVISA) instigated the establishment of the Brazilian Center for the Validation of Alternative Methods (BraCVAM), which was officially launched in 2013. The Brazilian validation process follows OECD Guidance Document No. 34, where BraCVAM functions as the focal point to identify and/or receive requests from parties interested in submitting tests for validation. BraCVAM then informs the Brazilian National Network on Alternative Methods (RENaMA) of promising assays, which helps with prioritisation and contributes to the validation studies of selected assays. A Validation Management Group supervises the validation study, and the results obtained are peer-reviewed by an ad hoc Scientific Review Committee, organised under the auspices of BraCVAM. Based on the peer-review outcome, BraCVAM will prepare recommendations on the validated test method, which will be sent to the National Council for the Control of Animal Experimentation (CONCEA). CONCEA is in charge of the regulatory adoption of all validated test methods in Brazil, following an open public consultation. 2016 FRAME.
The development and psychometric testing of East Asian Acculturation Scale among Asian immigrant women in Taiwan.

PubMed

Kuo, Shu-Fen; Chang, Wen-Yin; Chang, Lu-I; Chou, Yu-Hua; Chen, Ching-Min

2013-01-01

This is a report of the development and psychometric testing of the East Asian Acculturation Measure-Chinese version (EAAM-C) scale. An instrument validation design with a cross-sectional survey was conducted. The process was carried out in two phases. In Phase 1, Barry's East Asian Acculturation Measure was translated and back translated to evaluate its content validity, face validity, and feasibility. In Phase 2, the 16-item EAAM-C was pilot-tested among 485 female immigrants for test-retest reliability, internal consistency, theoretically-supported construct validity and concurrent validity. The pilot work and the survey results indicated the tool possessed adequate content and face validity. Cronbach's alpha for the EAAM-C was 0.72, and 0.76-0.79 for its subscales, and the test-retest correlation (at 3 weeks) was 0.75. After dropping one item, four theoretically-supported factors, which explained 61.82% of the variance, were extracted using exploratory factor analysis: assimilation, integration, separation, and marginalization. Based on the underlying four-factor theoretical structure of the EAAM, the EAAM-C was further examined with confirmatory factor analysis. The analysis revealed that the four-factor model was an acceptable fit for the data, demonstrating adequate construct validity. These factors were inter-correlated, and showed statistically significant correlation with the Chinese Health Questionnaire, indicating adequate concurrent validity. The scale shows acceptable validity and consistency, and suggests that immigrant acculturation is a complex construct. This quick evaluation instrument can be applied to assess clients' acculturation and in further developing certain interventions to improve their health.
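Internal consistency is reported above as Cronbach's alpha. As a minimal sketch, the standard formula is alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The response matrix below is hypothetical and far smaller than any real survey; it is not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 2 items (assumed values).
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(responses), 2))
```

Alpha approaches 1 when items rise and fall together across respondents, and drops toward 0 when item scores vary independently of one another.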
Development and validation of the short-form Adolescent Health Promotion Scale.

PubMed

Chen, Mei-Yen; Lai, Li-Ju; Chen, Hsiu-Chih; Gaete, Jorge

2014-10-26

Health-promoting lifestyle choices of adolescents are closely related to current and subsequent health status. However, parsimonious yet reliable and valid screening tools are scarce. The original 40-item adolescent health promotion (AHP) scale was developed by our research team and has been applied to measure adolescent health-promoting behaviors worldwide. The aim of our study was to examine the psychometric properties of a newly developed short-form version of the AHP (AHP-SF) including tests of its reliability and validity. The study was conducted in nine middle and high schools in southern Taiwan. Participants were 814 adolescents randomly divided into two subgroups with equal size and homogeneity of baseline characteristics. The first subsample (calibration sample) was used to modify and shorten the factorial model while the second subsample (validation sample) was utilized to validate the result obtained from the first one. The psychometric testing of the AHP-SF included internal reliability of McDonald's omega and Cronbach's alpha, convergent validity, discriminant validity, and construct validity with confirmatory factor analysis (CFA). The results of the CFA supported a six-factor model and 21 items were retained in the AHP-SF with acceptable model fit. For the discriminant validity test, results indicated that adolescents with lower AHP-SF scores were more likely to be overweight or obese, skip breakfast, and spend more time watching TV and playing computer games. The AHP-SF also showed excellent internal consistency with a McDonald's omega of 0.904 (Cronbach's alpha 0.905) in the calibration group. The current findings suggest that the AHP-SF is a valid and reliable instrument for the evaluation of adolescent health-promoting behaviors. Primary health care providers and clinicians can use the AHP-SF to assess these behaviors and evaluate the outcome of health promotion programs in the adolescent population.
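The calibration/validation design described above relies on randomly dividing the 814 participants into two equal subsamples. A minimal sketch of such a split follows; the seed and the use of integer IDs are assumptions made for reproducibility of the example, and the sketch does not check homogeneity of baseline characteristics, which the study also required.

```python
import random


def split_halves(participant_ids, seed=0):
    """Shuffle IDs with a fixed seed and cut the list at its midpoint."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# 814 participants, identified here by hypothetical integer IDs 0..813.
calibration, validation = split_halves(range(814))
print(len(calibration), len(validation))
```

Fitting the shortened model on one half and confirming it on the other guards against a factor structure that merely reflects sampling noise in a single group.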
Recovery Act. Development and Validation of an Advanced Stimulation Prediction Model for Enhanced Geothermal Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gutierrez, Marte

2013-12-31

This research project aims to develop and validate an advanced computer model that can be used in the planning and design of stimulation techniques to create engineered reservoirs for Enhanced Geothermal Systems. The specific objectives of the proposal are to: develop a true three-dimensional hydro-thermal fracturing simulator that is particularly suited for EGS reservoir creation; perform laboratory scale model tests of hydraulic fracturing and proppant flow/transport using a polyaxial loading device, and use the laboratory results to test and validate the 3D simulator; perform discrete element/particulate modeling of proppant transport in hydraulic fractures, and use the results to improve understanding of proppant flow and transport; test and validate the 3D hydro-thermal fracturing simulator against case histories of EGS energy production; and develop a plan to commercialize the 3D fracturing and proppant flow/transport simulator. The project is expected to yield several specific results and benefits. Major technical products from the proposal include: a true-3D hydro-thermal fracturing computer code that is particularly suited to EGS; documented results of scale model tests on hydro-thermal fracturing and fracture propping in an analogue crystalline rock; documented procedures and results of discrete element/particulate modeling of flow and transport of proppants for EGS applications; and a database of monitoring data, with a focus on Acoustic Emissions (AE) from lab scale modeling and field case histories of EGS reservoir creation.
Portuguese version of a stress and well-being evaluation tool (ASSET) at the workplace: validation of the psychometric properties

PubMed Central

Moreira, Sérgio; Carreiras, Joana; Cooper, Cary; Smeed, Matthew; Reis, Maria de Fátima; Pereira Miguel, José

2018-01-01

Objective The main objective of this work was to translate the English version of ASSET (A Shortened Stress Evaluation Tool) into Portuguese and to validate its psychometric properties. Additionally, this work tested the convergent validity of the instrument. Methods The translation and retroversion were conducted by experts and submitted to the authors for approval. Within an observational, cross-sectional study of mental health at the workplace, ASSET together with other scales was applied to a sample of 405 participants. The psychometric validity of the subscales was studied using confirmatory factorial analysis. Results The factorial structure of ASSET is globally supported by the results, with the Perceptions of Your Job and Attitudes Towards your Organisation subscales requiring slight adjustments in the item structure and the Your Health subscales replicating the original structure. The convergent validity also supports the ASSET, showing that all subscales are significantly correlated with variables used to test convergence. Conclusions Globally, the results constitute an important contribution to ASSET and open the possibility of its usage among Portuguese-speaking countries. The results provide evidence of the validity of the instrument and, in particular, of the mental and physical health subscales. PMID:29440211
Extended version of the "Sniffin' Sticks" identification test: test-retest reliability and validity.

PubMed

Sorokowska, A; Albrecht, E; Haehner, A; Hummel, T

2015-03-30

The extended, 32-item version of the Sniffin' Sticks identification test was developed in order to create a precise tool enabling repeated, longitudinal testing of individual olfactory subfunctions. Odors of the previous test version had to be changed for technical reasons, and the odor identification test needed re-investigation in terms of reliability, validity, and normative values. In our study we investigated olfactory abilities of a group of 100 patients with olfactory dysfunction and 100 controls. We reconfirmed the high test-retest reliability of the extended version of the Sniffin' Sticks identification test and high correlations between the new and the original part of this tool. In addition, we confirmed the validity of the test as it discriminated clearly between controls and patients with olfactory loss. The additional set of 16 odor identification sticks can be either included in the current olfactory test, thus creating a more detailed diagnosis tool, or it can be used separately, making it possible to follow olfactory function over time. Additionally, the normative values presented in our paper might provide useful guidelines for interpretation of the extended identification test results. The revised version of the Sniffin' Sticks 32-item odor identification test is a reliable and valid tool for the assessment of olfactory function. Copyright © 2015 Elsevier B.V. All rights reserved.
Development and validation of a nutrition knowledge questionnaire for a Canadian population.

PubMed

Bradette-Laplante, Maude; Carbonneau, Élise; Provencher, Véronique; Bégin, Catherine; Robitaille, Julie; Desroches, Sophie; Vohl, Marie-Claude; Corneau, Louise; Lemieux, Simone

2017-05-01

The present study aimed to develop and validate a nutrition knowledge questionnaire in a sample of French Canadians from the province of Quebec, taking into account dietary guidelines. A thirty-eight-item questionnaire was developed by the research team and evaluated for content validity by an expert panel, and then administered to respondents. Face validity and construct validity were measured in a pre-test. Exploratory factor analysis and covariance structure analysis were performed to verify the structure of the questionnaire and identify problematic items. Internal consistency and test-retest reliability were evaluated through a validation study. Online survey. Six nutrition and psychology experts, fifteen registered dietitians (RD) and 180 lay people participated. Content validity evaluation resulted in the removal of two items and reformulation of one item. Following face validity, one item was reformulated. Construct validity was found to be adequate, with higher scores for RD v. non-RD (21·5 (sd 2·1) v. 15·7 (sd 3·0) out of 24, P<0·001). Exploratory factor analysis revealed that the questionnaire contained only one factor. Covariance structure analysis led to removal of sixteen items. Internal consistency for the overall questionnaire was adequate (Cronbach's α=0·73). Assessment of test-retest reliability resulted in significant associations for the total knowledge score (r=0·59, P<0·001). This nutrition knowledge questionnaire was found to be a suitable instrument which can be used to measure levels of nutrition knowledge in a Canadian population. It could also serve as a model for the development of similar instruments in other populations.
Evaluation of the Effect of the Volume Throughput and Maximum Flux of Low-Surface-Tension Fluids on Bacterial Penetration of 0.2 Micron-Rated Filters during Process-Specific Filter Validation Testing.

PubMed

Folmsbee, Martha

2015-01-01

Approximately 97% of filter validation tests result in the demonstration of absolute retention of the test bacteria, and thus sterile filter validation failure is rare. However, while Brevundimonas diminuta (B. diminuta) penetration of sterilizing-grade filters is rarely detected, the observation that some fluids (such as vaccines and liposomal fluids) may lead to an increased incidence of bacterial penetration of sterilizing-grade filters by B. diminuta has been reported. The goal of the following analysis was to identify important drivers of filter validation failure in these rare cases. The identification of these drivers will hopefully serve the purpose of assisting in the design of commercial sterile filtration processes with a low risk of filter validation failure for vaccine, liposomal, and related fluids. Filter validation data for low-surface-tension fluids was collected and evaluated with regard to the effect of bacterial load (CFU/cm(2)), bacterial load rate (CFU/min/cm(2)), volume throughput (mL/cm(2)), and maximum filter flux (mL/min/cm(2)) on bacterial penetration. The data set (∼1162 individual filtrations) included all instances of process-specific filter validation failures performed at Pall Corporation, including those using other filter media, but did not include all successful retentive filter validation bacterial challenges. It was neither practical nor necessary to include all filter validation successes worldwide (Pall Corporation) to achieve the goals of this analysis. The percentage of failed filtration events for the selected total master data set was 27% (310/1162). Because it is heavily weighted with penetration events, this percentage is considerably higher than the actual rate of failed filter validations, but, as such, facilitated a close examination of the conditions that lead to filter validation failure. 
In agreement with our previous reports, two of the significant drivers of bacterial penetration identified were the total bacterial load and the bacterial load rate. In addition to these parameters, another three possible drivers of failure were also identified: volume throughput, maximum filter flux, and pressure. Of the data for which volume throughput information was available, 24% (249/1038) of the filtrations resulted in penetration. However, for the volume throughput range of 680-2260 mL/cm(2), only 9 out of 205 bacterial challenges (∼4%) resulted in penetration. Of the data for which flux information was available, 22% (212/946) resulted in bacterial penetration. However, in the maximum filter flux range from 7 to 18 mL/min/cm(2), only one out of 121 filtrations (0.6%) resulted in penetration. A slight increase in filter failure was observed in filter bacterial challenges with a differential pressure greater than 30 psid. When designing a commercial process for the sterile filtration of a low-surface-tension fluid (or any other potentially high-risk fluid), targeting the volume throughput range of 680-2260 mL/cm(2) or flux range of 7-18 mL/min/cm(2), and maintaining the differential pressure below 30 psid, could significantly decrease the risk of validation filter failure. However, it is important to keep in mind that these are general trends described in this study and some test fluids may not conform to the general trends described here. Ultimately, it is important to evaluate both filterability and bacterial retention of the test fluid under proposed process conditions prior to finalizing the manufacturing process to ensure successful process-specific filter validation of low-surface-tension fluids. An overwhelming majority of process-specific filter validation (qualification) tests result in the demonstration of absolute retention of test bacteria by sterilizing-grade membrane filters. As such, process-specific filter validation failure is rare. 
However, while bacterial penetration of sterilizing-grade filters during process-specific filter validation is rarely detected, some fluids (such as vaccines and liposomal fluids) have been associated with an increased incidence of bacterial penetration. The goal of the following analysis was to identify important drivers of process-specific filter validation failure. The identification of these drivers will possibly serve to assist in the design of commercial sterile filtration processes with a low risk of filter validation failure. Filter validation data for low-surface-tension fluids was collected and evaluated with regard to bacterial concentration and rates, as well as filtered fluid volume and rate (Pall Corporation). The master data set (∼1160 individual filtrations) included all recorded instances of process-specific filter validation failures but did not include all successful filter validation bacterial challenge tests. This allowed for a close examination of the conditions that lead to process-specific filter validation failure. As previously reported, two significant drivers of bacterial penetration were identified: the total bacterial load (the total number of bacteria per filter) and the bacterial load rate (the rate at which bacteria were applied to the filter). In addition to these parameters, another three possible drivers of failure were also identified: volumetric throughput, filter flux, and pressure. When designing a commercial process for the sterile filtration of a low-surface-tension fluid (or any other penetrative-risk fluid), targeting the identified bacterial challenge loads, volume throughput, and corresponding flux rates could decrease, and possibly eliminate, the risk of validation filter failure. However, it is important to keep in mind that these are general trends described in this study and some test fluids may not conform to the general trends described here. 
Ultimately, it is important to evaluate both filterability and bacterial retention of the test fluid under proposed process conditions prior to finalizing the manufacturing process to ensure successful filter validation of low-surface-tension fluids. © PDA, Inc. 2015.
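The penetration percentages quoted in the abstract follow directly from the reported counts. The short sketch below recomputes them; the dictionary labels are descriptive names chosen here, while the counts are the ones given in the abstract.

```python
# (failures, total) pairs as reported in the abstract.
reported_counts = {
    "all filtrations": (310, 1162),
    "volume throughput known": (249, 1038),
    "throughput 680-2260 mL/cm^2": (9, 205),
    "flux known": (212, 946),
    "flux 7-18 mL/min/cm^2": (1, 121),
}


def penetration_rate(failures, total):
    """Share of filtrations with bacterial penetration, in percent."""
    return 100.0 * failures / total


for label, (failures, total) in reported_counts.items():
    print(f"{label}: {penetration_rate(failures, total):.1f}%")
```

The first four ratios agree with the rounded figures in the text (27%, 24%, about 4%, and 22%). The last ratio, 1 out of 121, works out to roughly 0.8%, slightly above the 0.6% quoted in the abstract.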
Developing a model of competence in the operating theatre: psychometric validation of the perceived perioperative competence scale-revised.

PubMed

Gillespie, Brigid M; Polit, Denise F; Hamlin, Lois; Chaboyer, Wendy

2012-01-01

This paper describes the development and validation of the Revised Perioperative Competence Scale (PPCS-R). There is a lack of psychometrically sound, tested self-assessment tools to measure nurses' perceived competence in the operating room. Content validity was established by a panel of international experts and the original 98-item scale was pilot tested with 345 nurses in Queensland, Australia. Following the removal of several items, a national sample that included all 3209 nurses who were members of the Australian College of Operating Room Nurses was surveyed using the 94-item version. Psychometric testing assessed content validity using exploratory factor analysis, internal consistency using Cronbach's alpha, and construct validity using the "known groups" technique. During item reduction, several preliminary factor analyses were performed on two random halves of the sample (n=550). Usable data for psychometric assessment were obtained from 1122 nurses. The original 94-item scale was reduced to 40 items. The final factor analysis using the entire sample resulted in a 40-item, six-factor solution. Cronbach's alpha for the 40-item scale was .96. Construct validation demonstrated significant differences (p<.0001) in perceived competence scores relative to years of operating room experience and receipt of specialty education. On the basis of these results, the psychometric properties of the PPCS-R were considered encouraging. Further testing of the tool in different samples of operating room nurses is necessary to enable cross-cultural comparisons. Copyright © 2011 Elsevier Ltd. All rights reserved.
Control and Non-Payload Communications (CNPC) Prototype Radio Validation Flight Test Report

NASA Technical Reports Server (NTRS)

Shalkhauser, Kurt A.; Ishac, Joseph A.; Iannicca, Dennis C.; Bretmersky, Steven C.; Smith, Albert E.

2017-01-01

This report provides an overview and results from the unmanned aircraft (UA) Control and Non-Payload Communications (CNPC) Generation 5 prototype radio validation flight test campaign. The radios used in the test campaign were developed under cooperative agreement NNC11AA01A between the NASA Glenn Research Center and Rockwell Collins, Inc., of Cedar Rapids, Iowa. Measurement results are presented for flight tests over hilly terrain, open water, and urban landscape, utilizing radio sets installed into a NASA aircraft and ground stations. Signal strength and frame loss measurement data are analyzed relative to time and aircraft position, specifically addressing the impact of line-of-sight terrain obstructions on CNPC data flow. Both the radio and flight test system are described.
Criterion-Related Validity of the Distance- and Time-Based Walk/Run Field Tests for Estimating Cardiorespiratory Fitness: A Systematic Review and Meta-Analysis.

PubMed

Mayorga-Vega, Daniel; Bocanegra-Parrilla, Raúl; Ornelas, Martha; Viciana, Jesús

2016-01-01

The main purpose of the present meta-analysis was to examine the criterion-related validity of the distance- and time-based walk/run tests for estimating cardiorespiratory fitness among apparently healthy children and adults. Relevant studies were searched from seven electronic bibliographic databases up to August 2015 and through other sources. The Hunter-Schmidt's psychometric meta-analysis approach was conducted to estimate the population criterion-related validity of the following walk/run tests: 5,000 m, 3 miles, 2 miles, 3,000 m, 1.5 miles, 1 mile, 1,000 m, ½ mile, 600 m, 600 yd, ¼ mile, 15 min, 12 min, 9 min, and 6 min. From the 123 included studies, a total of 200 correlation values were analyzed. The overall results showed that the criterion-related validity of the walk/run tests for estimating maximum oxygen uptake ranged from low to moderate (rp = 0.42-0.79), with the 1.5 mile (rp = 0.79, 0.73-0.85) and 12 min walk/run tests (rp = 0.78, 0.72-0.83) having the higher criterion-related validity for distance- and time-based field tests, respectively. The present meta-analysis also showed that sex, age and maximum oxygen uptake level do not seem to affect the criterion-related validity of the walk/run tests. When the evaluation of an individual's maximum oxygen uptake attained during a laboratory test is not feasible, the 1.5 mile and 12 min walk/run tests represent useful alternatives for estimating cardiorespiratory fitness. As in the assessment with any physical fitness field test, evaluators must be aware that the performance score of the walk/run field tests is simply an estimation and not a direct measure of cardiorespiratory fitness.
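The pooled validity coefficients above come from a Hunter-Schmidt meta-analysis. As a minimal sketch of the core "bare-bones" step of that approach, the code below computes a sample-size-weighted mean correlation and compares the observed variance of the correlations with the variance expected from sampling error alone. It omits the artifact corrections (for example, for measurement unreliability) that a full psychometric meta-analysis would apply, and the study list is hypothetical, not taken from the 123 included studies.

```python
def bare_bones_meta(studies):
    """Bare-bones Hunter-Schmidt summary for (sample_size, correlation) pairs."""
    total_n = sum(n for n, _ in studies)
    # Sample-size-weighted mean correlation.
    mean_r = sum(n * r for n, r in studies) / total_n
    # Sample-size-weighted observed variance of the correlations.
    observed_var = sum(n * (r - mean_r) ** 2 for n, r in studies) / total_n
    # Variance expected from sampling error, using the average sample size.
    mean_n = total_n / len(studies)
    sampling_var = (1 - mean_r ** 2) ** 2 / (mean_n - 1)
    # Variance left over after removing sampling error (floored at zero).
    residual_var = max(observed_var - sampling_var, 0.0)
    return mean_r, observed_var, sampling_var, residual_var


# Hypothetical studies: (sample size, correlation with measured VO2max).
example_studies = [(50, 0.60), (150, 0.80)]
mean_r, observed_var, sampling_var, residual_var = bare_bones_meta(example_studies)
print(round(mean_r, 2), round(observed_var, 4))
```

Weighting by sample size means larger studies pull the pooled estimate toward their own correlation; in the hypothetical data the study with 150 participants moves the mean to 0.75 rather than the unweighted 0.70.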
Validation of a Scalable Solar Sailcraft

NASA Technical Reports Server (NTRS)

Murphy, D. M.

2006-01-01

The NASA In-Space Propulsion (ISP) program sponsored intensive solar sail technology and systems design, development, and hardware demonstration activities over the past 3 years. Efforts to validate a scalable solar sail system by functional demonstration in relevant environments, together with test-analysis correlation activities on a scalable solar sail system have recently been successfully completed. A review of the program, with descriptions of the design, results of testing, and analytical model validations of component and assembly functional, strength, stiffness, shape, and dynamic behavior are discussed. The scaled performance of the validated system is projected to demonstrate the applicability to flight demonstration and important NASA road-map missions.
Validity and reliability of portfolio assessment of student competence in two dental school populations: a four-year study.

PubMed

Gadbury-Amyot, Cynthia C; McCracken, Michael S; Woldt, Janet L; Brennan, Robert L

2014-05-01

The purpose of this study was to empirically investigate the validity and reliability of portfolio assessment in two U.S. dental schools using a unified framework for validity. In the process of validation, it is not the test that is validated but rather the claims (interpretations and uses) about test scores that are validated. Kane's argument-based validation framework provided the structure for reporting results where validity claims are followed by evidence to support the argument. This multivariate generalizability theory study found that the greatest source of variance was attributable to faculty raters, suggesting that portfolio assessment would benefit from two raters' evaluating each portfolio independently. The results are generally supportive of holistic scoring, but analytical scoring deserves further research. Correlational analyses between student portfolios and traditional measures of student competence and readiness for licensure resulted in significant correlations between portfolios and National Board Dental Examination Part I (r=0.323, p<0.01) and Part II scores (r=0.268, p<0.05) and small and non-significant correlations with grade point average and scores on the Western Regional Examining Board (WREB) exam. It is incumbent upon the users of portfolio assessment to determine if the claims and evidence arguments set forth in this study support the proposed claims for and decisions about portfolio assessment in their respective institutions.
Test-retest reliability, smallest real difference and concurrent validity of six different balance tests on young people with mild to moderate intellectual disability.

PubMed

Blomqvist, Sven; Wester, Anita; Sundelin, Gunnevi; Rehn, Börje

2012-12-01

Some studies have reported that people with intellectual disability may have reduced balance ability compared with the population in general. However, none of these studies involved adolescents, and the reliability and validity of balance tests in this population are not known. The purpose of this study was to examine the reliability of six different balance tests and to investigate their concurrent validity. Test-retest reliability assessment. All subjects were recruited from a special school for people with intellectual disability in Bollnäs, Sweden. Eighty-nine adolescents (35 females and 54 males) with mild to moderate intellectual disability with a mean age of 18 years (range 16 to 20 years). All subjects followed the same test protocol on two occasions within an 11-day period. Balance test performances. Intraclass correlation coefficients greater than 0.80 were achieved for four of the balance tests: Extended Timed Up and Go Test, Modified Functional Reach Test, One-leg Stance Test and Force Platform Test. The smallest real differences ranged from 12% to 40%; less than 20% is considered to be low. Concurrent validity among these balance tests varied between no and low correlation. The results indicate that these tests could be used to evaluate changes in balance ability over time in people with mild to moderate intellectual disability. The low concurrent validity illustrates the importance of knowing more about the influence of various sensory subsystems that are significant for balance among adolescents with intellectual disability. Copyright © 2011 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
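The abstract does not say how the smallest real difference was computed. A common convention, assumed here, derives it from the standard error of measurement: SEM = SD × √(1 − ICC) and SRD = 1.96 × √2 × SEM, expressed as a percentage of the mean score. The function name and the numbers below are hypothetical.

```python
from math import sqrt


def smallest_real_difference(sd, icc, mean_score):
    """Return (SEM, SRD, SRD as % of the mean) under the assumed convention."""
    sem = sd * sqrt(1.0 - icc)          # standard error of measurement
    srd = 1.96 * sqrt(2.0) * sem        # 95% bound for a real test-retest change
    srd_percent = 100.0 * srd / mean_score
    return sem, srd, srd_percent


# Hypothetical values: SD = 10, ICC = 0.84, mean score = 50.
sem, srd, srd_percent = smallest_real_difference(10.0, 0.84, 50.0)
print(round(sem, 2), round(srd, 2), round(srd_percent, 1))
```

Under this convention, a change between two test occasions smaller than the SRD cannot be distinguished from measurement noise.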
Psychometric instrumentation: reliability and validity of instruments used for clinical practice, evidence-based practice projects and research studies.

PubMed

Mayo, Ann M

2015-01-01

It is important for CNSs and other APNs to consider the reliability and validity of instruments chosen for clinical practice, evidence-based practice projects, or research studies. Psychometric testing uses specific research methods to evaluate the amount of error associated with any particular instrument. Reliability estimates explain more about how well the instrument is designed, whereas validity estimates explain more about scores that are produced by the instrument. An instrument may be architecturally sound overall (reliable), but the same instrument may not be valid. For example, if a specific group does not understand certain well-constructed items, then the instrument does not produce valid scores when used with that group. Many instrument developers may conduct reliability testing only once, yet continue validity testing in different populations over many years. All CNSs should be advocating for the use of reliable instruments that produce valid results. Clinical nurse specialists may find themselves in situations where reliability and validity estimates for some instruments that are being utilized are unknown. In such cases, CNSs should engage key stakeholders to sponsor nursing researchers to pursue this most important work.
Perception of competence in middle school physical education: instrument development and validation.

PubMed

Scrabis-Fletcher, Kristin; Silverman, Stephen

2010-03-01

Perception of Competence (POC) has been studied extensively in physical activity (PA) research with similar instruments adapted for physical education (PE) research. Such instruments do not account for the unique PE learning environment. Therefore, an instrument was developed and the scores validated to measure POC in middle school PE. A multiphase design was used consisting of an intensive theoretical review, elicitation study, prepilot study, pilot study, content validation study, and final validation study (N=1281). Data analysis included a multistep iterative process to identify the best model fit. A three-factor model for POC was tested and resulted in root mean square error of approximation = .09, root mean square residual = .07, goodness of fit index = .90, and adjusted goodness of fit index = .86 values in the acceptable range (Hu & Bentler, 1999). A two-factor model was also tested and resulted in a good fit (two-factor fit indexes values = .05, .03, .98, .97, respectively). The results of this study suggest that an instrument using a three- or two-factor model provides reliable and valid scores of POC measurement in middle school PE.

Validity and reliability of an online visual-spatial working memory task for self-reliant administration in school-aged children.

PubMed

Van de Weijer-Bergsma, Eva; Kroesbergen, Evelyn H; Prast, Emilie J; Van Luit, Johannes E H

2015-09-01

Working memory is an important predictor of academic performance, and of math performance in particular. Most working memory tasks depend on one-to-one administration by a testing assistant, which makes the use of such tasks in large-scale studies time-consuming and costly. Therefore, an online, self-reliant visual-spatial working memory task (the Lion game) was developed for primary school children (6-12 years of age). In two studies, the validity and reliability of the Lion game were investigated. The results from Study 1 (n = 442) indicated satisfactory six-week test-retest reliability, excellent internal consistency, and good concurrent and predictive validity. The results from Study 2 (n = 5,059) confirmed the results on the internal consistency and predictive validity of the Lion game. In addition, multilevel analysis revealed that classroom membership influenced Lion game scores. We concluded that the Lion game is a valid and reliable instrument for the online computerized and self-reliant measurement of visual-spatial working memory (i.e., updating).
Validity of Highlighting on Text Comprehension

NASA Astrophysics Data System (ADS)

So, Joey C. Y.; Chan, Alan H. S.

2009-10-01

In this study, 38 university students were tested with a Chinese reading task on an LED display under different task conditions for determining the effects of the highlighting and its validity on comprehension performance on light-emitting diodes (LED) display for Chinese reading. Four levels of validity (0%, 33%, 67% and 100%) and a control condition with no highlighting were tested. Each subject was required to perform the five experimental conditions in which different passages were read and comprehended. The results showed that the condition with 100% validity of highlighting was found to have better comprehension performance than other validity levels and conditions with no highlighting. The comprehension score of the condition without highlighting effect was comparatively lower than those highlighting conditions with distracters, though not significant.
The Validity and Reliability Test of the Indonesian Version of Gastroesophageal Reflux Disease Quality of Life (GERD-QOL) Questionnaire.

PubMed

Siahaan, Laura A; Syam, Ari F; Simadibrata, Marcellus; Setiati, Siti

2017-01-01

To obtain a valid and reliable GERD-QOL questionnaire for Indonesian application. At the initial stage, the GERD-QOL questionnaire was first translated into Indonesian language and the translated questionnaire was subsequently translated back into the original language (back-to-back translation). The results were evaluated by the researcher team and therefore, an Indonesian version of GERD-QOL questionnaire was developed. Ninety-one patients who had been clinically diagnosed with GERD based on the Montreal criteria were interviewed using the Indonesian version of GERD-QOL questionnaire and the SF-36 questionnaire. Validity was evaluated using construct validity and external validity, and reliability was tested by internal consistency and test-retest. The Indonesian version of GERD-QOL questionnaire had a good internal consistency reliability with a Cronbach Alpha of 0.687-0.842 and a good test-retest reliability with an intra-class correlation coefficient of 0.756-0.936 (p<0.05). The questionnaire had also been demonstrated to have a good validity with a proven high correlation to each question of SF-36 (p<0.05). The Indonesian version of GERD-QOL questionnaire has been proven valid and reliable to evaluate the quality of life of GERD patients.
[Reliability and validity of the Chinese version on Alcohol Use Disorders Identification Test].

PubMed

Zhang, C; Yang, G P; Li, Z; Li, X N; Li, Y; Hu, J; Zhang, F Y; Zhang, X J

2017-08-10

Objective: To assess the reliability and validity of the Chinese version of the Alcohol Use Disorders Identification Test (AUDIT) among medical students in China and to provide guidance on the correct application of the recommended scales. Methods: An E-questionnaire was developed and sent to medical students in five different colleges. All students volunteered to take the test. Cronbach's α and split-half reliability were calculated to evaluate the reliability of AUDIT, while content, construct, discriminant and convergent validity were assessed to measure the validity of the scales. Results: The overall Cronbach's α of AUDIT was 0.782 and the split-half reliability was 0.711. Data showed that the domain Cronbach's α and split-half reliability were 0.796 and 0.794 for hazardous alcohol use, 0.561 and 0.623 for dependence symptoms, and 0.647 and 0.640 for harmful alcohol use. Results also showed that the item-level content validity indices (I-CVI) were from 0.83 to 1.00, the content validity index of scale level (S-CVI/UA) was 0.90, content validity index of average scale level (S-CVI/Ave) was 0.99 and the content validity ratios (CVR) were from 0.80 to 1.00. The simplified version of AUDIT supported a presupposed three-factor structure which could explain 61.175% of the total variance revealed through exploratory factor analysis. AUDIT seemed to have good convergent and discriminant validity, with the success rate of calibration experiment as 100%. Conclusion: AUDIT showed good reliability and validity among medical students in China and is thus worth promoting for use.
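The two reliability statistics named in this abstract can be sketched as follows, assuming the textbook definitions: Cronbach's α = k/(k − 1) × (1 − Σ item variances / variance of total scores), and a split-half coefficient obtained by correlating odd-item and even-item half scores and applying the Spearman-Brown correction. The abstract does not say how the halves were formed, so the odd/even split is an assumption, and the response data are hypothetical.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation, written out to avoid version-specific helpers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_reliability(rows):
    """Odd/even split (an assumption) with Spearman-Brown correction."""
    odd_half = [sum(row[0::2]) for row in rows]
    even_half = [sum(row[1::2]) for row in rows]
    r = pearson(odd_half, even_half)
    return 2 * r / (1 + r)


# Hypothetical responses: three respondents, two items.
responses = [[2, 1], [1, 2], [3, 3]]
print(round(cronbach_alpha(responses), 3), round(split_half_reliability(responses), 3))
```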
NDARC - NASA Design and Analysis of Rotorcraft Validation and Demonstration

NASA Technical Reports Server (NTRS)

Johnson, Wayne

2010-01-01

Validation and demonstration results from the development of the conceptual design tool NDARC (NASA Design and Analysis of Rotorcraft) are presented. The principal tasks of NDARC are to design a rotorcraft to satisfy specified design conditions and missions, and then analyze the performance of the aircraft for a set of off-design missions and point operating conditions. The aircraft chosen as NDARC development test cases are the UH-60A single main-rotor and tail-rotor helicopter, the CH-47D tandem helicopter, the XH-59A coaxial lift-offset helicopter, and the XV-15 tiltrotor. These aircraft were selected because flight performance data, a weight statement, detailed geometry information, and a correlated comprehensive analysis model are available for each. Validation consists of developing the NDARC models for these aircraft by using geometry and weight information, airframe wind tunnel test data, engine decks, rotor performance tests, and comprehensive analysis results; and then comparing the NDARC results for aircraft and component performance with flight test data. Based on the calibrated models, the capability of the code to size rotorcraft is explored.
Development and validation of a brief screening instrument for psychosocial risk associated with genetic testing: a pan-Canadian cohort study

PubMed Central

Esplen, Mary Jane; Cappelli, Mario; Wong, Jiahui; Bottorff, Joan L; Hunter, Jon; Carroll, June; Dorval, Michel; Wilson, Brenda; Allanson, Judith; Semotiuk, Kara; Aronson, Melyssa; Bordeleau, Louise; Charlemagne, Nicole; Meschino, Wendy

2013-01-01

Objectives To develop a brief, reliable and valid instrument to screen psychosocial risk among those who are undergoing genetic testing for Adult-Onset Hereditary Disease (AOHD). Design A prospective two-phase cohort study. Setting 5 genetic testing centres for AOHD, such as cancer, Huntington's disease or haemochromatosis, in ambulatory clinics of tertiary hospitals across Canada. Participants 141 individuals undergoing genetic testing were approached and consented to the instrument development phase of the study (Phase I). The Genetic Psychosocial Risk Instrument (GPRI) developed in Phase I was tested in Phase II for item refinement and validation. A separate cohort of 722 individuals consented to the study, 712 completed the baseline package and 463 completed all follow-up assessments. Most participants were female, at the mid-life stage. Individuals in advanced stages of the illness or with cognitive impairment or a language barrier were excluded. Interventions Phase I: GPRI items were generated from (1) a review of the literature, (2) input from genetic counsellors and (3) phase I participants. Phase II: further item refinement and validation were conducted with a second cohort of participants who completed the GPRI at baseline and were followed for psychological distress 1-month postgenetic testing results. Primary and secondary outcome measures GPRI, Hamilton Depression Rating Scale (HAM-D), Hamilton Anxiety Rating Scale (HAM-A), Brief Symptom Inventory (BSI) and Impact of Event Scale (IES). Results The final 20-item GPRI had a high reliability—Cronbach's α at 0.81. The construct validity was supported by high correlations between GPRI and BSI and IES. The predictive value was demonstrated by a receiver operating characteristic curve of 0.78 plotting GPRI against follow-up assessments using HAM-D and HAM-A. 
Conclusions With a cut-off score of 50, GPRI identified 84% of participants who displayed distress postgenetic testing results, supporting its potential usefulness in a clinical setting. PMID:23485718
TU-D-201-05: Validation of Treatment Planning Dose Calculations: Experience Working with MPPG 5.a

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xue, J; Park, J; Kim, L

2016-06-15

Purpose: Newly published medical physics practice guideline (MPPG 5.a.) has set the minimum requirements for commissioning and QA of treatment planning dose calculations. We present our experience in the validation of a commercial treatment planning system based on MPPG 5.a. Methods: In addition to tests traditionally performed to commission a model-based dose calculation algorithm, extensive tests were carried out at short and extended SSDs, various depths, oblique gantry angles and off-axis conditions to verify the robustness and limitations of a dose calculation algorithm. A comparison between measured and calculated dose was performed based on validation tests and evaluation criteria recommended by MPPG 5.a. An ion chamber was used for the measurement of dose at points of interest, and diodes were used for photon IMRT/VMAT validations. Dose profiles were measured with a three-dimensional scanning system and calculated in the TPS using a virtual water phantom. Results: Calculated and measured absolute dose profiles were compared at each specified SSD and depth for open fields. The disagreement is easily identifiable with the difference curve. Subtle discrepancy has revealed the limitation of the measurement, e.g., a spike at the high dose region and an asymmetrical penumbra observed on the tests with an oblique MLC beam. The excellent results we had (> 98% pass rate on 3%/3mm gamma index) on the end-to-end tests for both IMRT and VMAT are attributed to the quality beam data and the good understanding of the modeling. The limitation of the model and the uncertainty of measurement were considered when comparing the results. Conclusion: The extensive tests recommended by the MPPG encourage us to understand the accuracy and limitations of a dose algorithm as well as the uncertainty of measurement. Our experience has shown how the suggested tests can be performed effectively to validate dose calculation models.
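To make the "3%/3mm gamma index" pass rate concrete, here is a simplified one-dimensional sketch. For each reference point it searches all evaluated points for the minimum of √((Δposition / 3 mm)² + (Δdose / dose criterion)²) and counts the point as passing when that minimum is at most 1. It assumes global normalisation (3% of the maximum reference dose), a shared position grid, and no interpolation or low-dose threshold; clinical tools work on 2D or 3D distributions and differ in these details. The function name and the profiles are hypothetical.

```python
from math import sqrt


def gamma_pass_rate(positions_mm, reference_dose, evaluated_dose,
                    dose_tolerance=0.03, distance_mm=3.0):
    """Percentage of reference points with gamma <= 1 (1D brute-force sketch)."""
    # Global dose criterion: a fraction of the maximum reference dose (assumed).
    dose_criterion = dose_tolerance * max(reference_dose)
    passed = 0
    for ref_pos, ref_dose in zip(positions_mm, reference_dose):
        gamma = min(
            sqrt(((eval_pos - ref_pos) / distance_mm) ** 2
                 + ((eval_dose - ref_dose) / dose_criterion) ** 2)
            for eval_pos, eval_dose in zip(positions_mm, evaluated_dose)
        )
        if gamma <= 1.0:
            passed += 1
    return 100.0 * passed / len(reference_dose)


# Hypothetical profiles sampled every 10 mm; the last evaluated point is 10% high.
positions = [0.0, 10.0, 20.0, 30.0]
measured = [100.0, 100.0, 100.0, 100.0]
calculated = [100.0, 100.0, 100.0, 110.0]
print(gamma_pass_rate(positions, measured, calculated))
```

Which distribution is treated as the reference (measured or calculated) is a choice made here for illustration only.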
Tissue Preservation Assessment Preliminary Results

NASA Technical Reports Server (NTRS)

Globus, Ruth; Costes, Sylvain

2017-01-01

Pre-flight ground-based testing was done to prepare for the first Rodent Research mission validation flight, RR1 (Choi et al, 2016 PlosOne). We purified RNA and measured RIN values to assess quality of the samples. For protein, we measured liver enzyme activities. We tested protocol and methods of preservation to date. Here we present an overview of results related to tissue preservation from the RR1 validation mission and a summary of findings to date from investigators who received RR1 tissues via the Biospecimen Sharing Program.
Improving the Validity and Reliability of a Health Promotion Survey for Physical Therapists

PubMed Central

Stephens, Jaca L.; Lowman, John D.; Graham, Cecilia L.; Morris, David M.; Kohler, Connie L.; Waugh, Jonathan B.

2013-01-01

Purpose Physical therapists (PTs) have a unique opportunity to intervene in the area of health promotion. However, no instrument has been validated to measure PTs’ views on health promotion in physical therapy practice. The purpose of this study was to evaluate the content validity and test-retest reliability of a health promotion survey designed for PTs. Methods An expert panel of PTs assessed the content validity of “The Role of Health Promotion in Physical Therapy Survey” and provided suggestions for revision. Item content validity was assessed using the content validity ratio (CVR) as well as the modified kappa statistic. Therapists then participated in the test-retest reliability assessment of the revised health promotion survey, which was assessed using a weighted kappa statistic. Results Based on feedback from the expert panelists, significant revisions were made to the original survey. The expert panel reached at least a majority consensus agreement for all items in the revised survey and the survey-CVR improved from 0.44 to 0.66. Only one item on the revised survey had substantial test-retest agreement, with 55% of the items having moderate agreement and 43% poor agreement. Conclusions All items on the revised health promotion survey demonstrated at least fair validity, but few items had reasonable test-retest reliability. Further modifications should be made to strengthen the validity and improve the reliability of this survey. PMID:23754935
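The content validity ratio mentioned above is usually computed with Lawshe's formula, CVR = (n_e − N/2) / (N/2), where n_e is the number of panelists rating an item as essential and N is the panel size. The sketch below assumes that formula and takes the survey-level value as the mean of the item CVRs; the abstract does not confirm either detail, and the panel counts are hypothetical.

```python
def content_validity_ratio(essential_count, panel_size):
    """Lawshe's CVR for one item: ranges from -1 (none) to +1 (all essential)."""
    half = panel_size / 2.0
    return (essential_count - half) / half


def mean_cvr(essential_counts, panel_size):
    """Average item CVR across a survey (assumed definition of a survey-level CVR)."""
    ratios = [content_validity_ratio(c, panel_size) for c in essential_counts]
    return sum(ratios) / len(ratios)


# Hypothetical panel of 10 experts rating three items.
print(content_validity_ratio(9, 10))   # 9 of 10 rate the item essential
print(mean_cvr([9, 7, 5], 10))
```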
Reliability and validity of pendulum test measures of spasticity obtained with the Polhemus tracking system from patients with chronic stroke

PubMed Central

Bohannon, Richard W; Harrison, Steven; Kinsella-Shaw, Jeffrey

2009-01-01

Background Spasticity is a common impairment accompanying stroke. Spasticity of the quadriceps femoris muscle can be quantified using the pendulum test. The measurement properties of pendular kinematics captured using a magnetic tracking system have not been studied among patients who have experienced a stroke. Therefore, this study describes the test-retest reliability and known groups and convergent validity of the pendulum test measures obtained with the Polhemus tracking system. Methods Eight patients with chronic stroke underwent pendulum tests with their affected and unaffected lower limbs, with and without the addition of a 2.2 kg cuff weight at the ankle, using the Polhemus magnetic tracking system. Also measured bilaterally were knee resting angles, Ashworth scores (grades 0–4) of quadriceps femoris muscles, patellar tendon (knee jerk) reflexes (grades 0–4), and isometric knee extension force. Results Three measures obtained from pendular traces of the affected side were reliable (intraclass correlation coefficient ≥ .844). Known groups validity was confirmed by demonstration of a significant difference in the measurements between sides. Convergent validity was supported by correlations ≥ .57 between pendulum test measures and other measures reflective of spasticity. Conclusion Pendulum test measures obtained with the Polhemus tracking system from the affected side of patients with stroke have good test-retest reliability and both known groups and convergent validity. PMID:19642989
Six Years of Comprehensive, Clinical, Performance-Based Assessment Using Standardized Patients at the Southern Illinois University School of Medicine.

ERIC Educational Resources Information Center

Vu, Nu Viet; And Others

1992-01-01

The use of a performance-based assessment of senior medical students' clinical skills utilizing standardized patients was evaluated, with 6,804 student-patient encounters involving 405 students over 6 years. Results provide evidence for test security, content validity, construct validity, reliability, and test ability to discriminate a wide range…
Testing a Multi-Stage Screening System: Predicting Performance on Australia's National Achievement Test Using Teachers' Ratings of Academic and Social Behaviors

ERIC Educational Resources Information Center

Kettler, Ryan J.; Elliott, Stephen N.; Davies, Michael; Griffin, Patrick

2012-01-01

This study addresses the predictive validity of results from a screening system of academic enablers, with a sample of Australian elementary school students, when the criterion variable is end-of-year achievement. The investigation included (a) comparing the predictive validity of a brief criterion-referenced nomination system with more…
Validation of a laboratory and hospital information system in a medical laboratory accredited according to ISO 15189.

PubMed

Biljak, Vanja Radisic; Ozvald, Ivan; Radeljak, Andrea; Majdenic, Kresimir; Lasic, Branka; Siftar, Zoran; Lovrencic, Marijana Vucic; Flegar-Mestric, Zlata

2012-01-01

The aim of the study was to present a protocol for laboratory information system (LIS) and hospital information system (HIS) validation at the Institute of Clinical Chemistry and Laboratory Medicine of the Merkur University Hospital, Zagreb, Croatia. Validity of data traceability was checked by entering all test requests for virtual patient into HIS/LIS and printing corresponding barcoded labels that provided laboratory analyzers with the information on requested tests. The original printouts of the test results from laboratory analyzer(s) were compared with the data obtained from LIS and entered into the provided template. Transfer of data from LIS to HIS was examined by requesting all tests in HIS and creating real data in a finding generated in LIS. Data obtained from LIS and HIS were entered into a corresponding template. The main outcome measure was the accuracy of transfer obtained from laboratory analyzers and results transferred from LIS and HIS expressed as percentage (%). The accuracy of data transfer from laboratory analyzers to LIS was 99.5% and of that from LIS to HIS 100%. We presented our established validation protocol for laboratory information system and demonstrated that a system meets its intended purpose.
Comparative study between EDXRF and ASTM E572 methods using two-way ANOVA

NASA Astrophysics Data System (ADS)

Krummenauer, A.; Veit, H. M.; Zoppas-Ferreira, J.

2018-03-01

Comparison with a reference method is one of the necessary requirements for the validation of non-standard methods. This comparison was made using the experiment planning technique with two-way ANOVA. In the ANOVA, the results obtained using the EDXRF method, which was to be validated, were compared with the results obtained using the ASTM E572-13 standard test method. Fisher's tests (F-tests) were used for the comparative study of the elements molybdenum, niobium, copper, nickel, manganese, chromium and vanadium. None of the F-tests for these elements rejected the null hypothesis (H0). As a result, there is no significant difference between the methods compared. Therefore, according to this study, it is concluded that the EDXRF method met this method comparison requirement.
Reliability and Validity of Ten Consumer Activity Trackers Depend on Walking Speed.

PubMed

Fokkema, Tryntsje; Kooiman, Thea J M; Krijnen, Wim P; VAN DER Schans, Cees P; DE Groot, Martijn

2017-04-01

To examine the test-retest reliability and validity of ten activity trackers for step counting at three different walking speeds. Thirty-one healthy participants walked twice on a treadmill for 30 min while wearing 10 activity trackers (Polar Loop, Garmin Vivosmart, Fitbit Charge HR, Apple Watch Sport, Pebble Smartwatch, Samsung Gear S, Misfit Flash, Jawbone Up Move, Flyfit, and Moves). Participants walked at three walking speeds for 10 min each: slow (3.2 km·h⁻¹), average (4.8 km·h⁻¹), and vigorous (6.4 km·h⁻¹). To measure test-retest reliability, intraclass correlations (ICC) were determined between the first and second treadmill test. Validity was determined by comparing the trackers with the gold standard (hand counting), using mean differences, mean absolute percentage errors, and ICC. Statistical differences were calculated by paired-sample t tests, Wilcoxon signed-rank tests, and by constructing Bland-Altman plots. Test-retest reliability varied with ICC ranging from -0.02 to 0.97. Validity varied between trackers and different walking speeds with mean differences between the gold standard and activity trackers ranging from 0.0 to 26.4%. Most trackers showed relatively low ICC and broad limits of agreement of the Bland-Altman plots at the different speeds. For the slow walking speed, the Garmin Vivosmart and Fitbit Charge HR showed the most accurate results. The Garmin Vivosmart and Apple Watch Sport demonstrated the best accuracy at an average walking speed. For vigorous walking, the Apple Watch Sport, Pebble Smartwatch, and Samsung Gear S exhibited the most accurate results. Test-retest reliability and validity of activity trackers depend on walking speed. In general, consumer activity trackers perform better at an average and vigorous walking speed than at a slower walking speed.
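Two of the agreement measures named in this abstract can be sketched briefly: the mean absolute percentage error against the hand count, and Bland-Altman limits of agreement, taken here as the mean difference ± 1.96 times the sample standard deviation of the differences. These are the usual definitions and are assumed rather than taken from the paper; the step counts are hypothetical.

```python
from statistics import mean, stdev


def mean_absolute_percentage_error(tracker_steps, counted_steps):
    """Average of |tracker - reference| / reference, as a percentage."""
    return 100.0 * mean(
        abs(t - c) / c for t, c in zip(tracker_steps, counted_steps)
    )


def bland_altman_limits(tracker_steps, counted_steps):
    """Return (bias, lower limit, upper limit) of agreement."""
    differences = [t - c for t, c in zip(tracker_steps, counted_steps)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias, bias - half_width, bias + half_width


# Hypothetical step counts for three participants.
tracker = [95, 105, 100]
hand_count = [100, 100, 100]
print(round(mean_absolute_percentage_error(tracker, hand_count), 2))
print(bland_altman_limits(tracker, hand_count))
```

A tracker can show a bias near zero and still have wide limits of agreement, which is why the abstract reports both mean differences and the Bland-Altman limits.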
An initial investigation into the validity of a computer-based auditory processing assessment (Feather Squadron).

PubMed

Barker, Matthew D; Purdy, Suzanne C

2016-01-01

This research investigates a novel method for identifying and measuring school-aged children with poor auditory processing through a tablet computer. Feasibility and test-retest reliability are investigated by examining the percentage of Group 1 participants able to complete the tasks and developmental effects on performance. Concurrent validity was investigated against traditional tests of auditory processing using Group 2. There were 847 students aged 5 to 13 years in group 1, and 46 aged 5 to 14 years in group 2. Some tasks could not be completed by the youngest participants. Significant correlations were found between results of most auditory processing areas assessed by the Feather Squadron test and traditional auditory processing tests. Test-retest comparisons indicated good reliability for most of the Feather Squadron assessments and some of the traditional tests. The results indicate the Feather Squadron assessment is a time-efficient, feasible, concurrently valid, and reliable approach for measuring auditory processing in school-aged children. Clinically, this may be a useful option for audiologists when performing auditory processing assessments as it is a relatively fast, engaging, and easy way to assess auditory processing abilities. Research is needed to investigate further the construct validity of this new assessment by examining the association between performance on Feather Squadron and objective evoked potential, lesion studies, and/or functional imaging measures of auditory function.
K(3)EDTA Vacuum Tubes Validation for Routine Hematological Testing.

PubMed

Lima-Oliveira, Gabriel; Lippi, Giuseppe; Salvagno, Gian Luca; Montagnana, Martina; Poli, Giovanni; Solero, Giovanni Pietro; Picheth, Geraldo; Guidi, Gian Cesare

2012-01-01

Background and Objective. Some in vitro diagnostic devices (e.g., blood collection vacuum tubes and syringes for blood analyses) are not validated before the quality laboratory managers decide to start using or to change the brand. Frequently, the laboratory or hospital managers select the vacuum tubes for blood collection based on cost considerations or on relevance of a brand. The aim of this study was to validate two dry K(3)EDTA vacuum tubes of different brands for routine hematological testing. Methods. Blood specimens from 100 volunteers in two different K(3)EDTA vacuum tubes were collected by a single, expert phlebotomist. The routine hematological testing was done on Advia 2120i hematology system. The significance of the differences between samples was assessed by paired Student's t-test after checking for normality. The level of statistical significance was set at P < 0.05. Results and Conclusions. The tubes of the different brands evaluated can represent a clinically relevant source of variation only for mean platelet volume (MPV) and platelet distribution width (PDW). Basically, our validation will permit the laboratory or hospital managers to select among the validated brands of vacuum tubes according to their own technical or economic reasons for routine hematological tests.
K3EDTA Vacuum Tubes Validation for Routine Hematological Testing

PubMed Central

Lima-Oliveira, Gabriel; Lippi, Giuseppe; Salvagno, Gian Luca; Montagnana, Martina; Poli, Giovanni; Solero, Giovanni Pietro; Picheth, Geraldo; Guidi, Gian Cesare

2012-01-01

Background and Objective. Some in vitro diagnostic devices (e.g., blood collection vacuum tubes and syringes for blood analyses) are not validated before the laboratory quality managers decide to start using them or to change the brand. Frequently, the laboratory or hospital managers select the vacuum tubes for blood collection based on cost considerations or on relevance of a brand. The aim of this study was to validate two dry K3EDTA vacuum tubes of different brands for routine hematological testing. Methods. Blood specimens from 100 volunteers in two different K3EDTA vacuum tubes were collected by a single, expert phlebotomist. The routine hematological testing was done on the Advia 2120i hematology system. The significance of the differences between samples was assessed by paired Student's t-test after checking for normality. The level of statistical significance was set at P < 0.05. Results and Conclusions. The tubes of the different brands evaluated can represent a clinically relevant source of variation only for mean platelet volume (MPV) and platelet distribution width (PDW). Basically, our validation will permit laboratory or hospital managers to select among the validated brands of vacuum tubes according to their own technical or economic reasons for routine hematological tests. PMID:22888448
Contemporary Test Validity in Theory and Practice: A Primer for Discipline-Based Education Researchers.

PubMed

Reeves, Todd D; Marbach-Ad, Gili

2016-01-01

Most discipline-based education researchers (DBERs) were formally trained in the methods of scientific disciplines such as biology, chemistry, and physics, rather than social science disciplines such as psychology and education. As a result, DBERs may have never taken specific courses in the social science research methodology--either quantitative or qualitative--on which their scholarship often relies so heavily. One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process. The instructional piece explores the concepts of test validity, the validation process, validity evidence, and key threats to validity. The essay also includes an in-depth example of a validity argument and validation approach for a test of student argument analysis. In addition to DBERs, this essay should benefit practitioners (e.g., lab directors, faculty members) in the development, evaluation, and/or selection of instruments for their work assessing students or evaluating pedagogical innovations. © 2016 T. D. Reeves and G. Marbach-Ad. CBE—Life Sciences Education © 2016 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
Assessing Discriminative Performance at External Validation of Clinical Prediction Models.

PubMed

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W

2016-01-01

External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.
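For readers unfamiliar with the c-statistic discussed in the abstract above, the sketch below computes it for a binary outcome as the share of event/non-event pairs in which the event received the higher predicted risk, with ties counted as one half. The function name and the toy outcomes and predictions are illustrative assumptions, not material from the study, and the permutation test itself is not reproduced here.

```python
def c_statistic(outcomes, predictions):
    """Concordance (c) statistic for binary outcomes coded 0/1."""
    events = [p for y, p in zip(outcomes, predictions) if y == 1]
    non_events = [p for y, p in zip(outcomes, predictions) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0   # correctly ordered pair
            elif risk_event == risk_non_event:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical validation data: observed outcomes and predicted risks.
observed = [1, 0, 1, 0, 0, 1]
predicted = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
print(round(c_statistic(observed, predicted), 3))
```

Because the statistic depends only on how cases and non-cases are ranked, a more homogeneous validation population can lower it even when the regression coefficients are correct, which is the case-mix effect the abstract warns about.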

Methods to validate the accuracy of an indirect calorimeter in the in-vitro setting.

PubMed

Oshima, Taku; Ragusa, Marco; Graf, Séverine; Dupertuis, Yves Marc; Heidegger, Claudia-Paula; Pichard, Claude

2017-12-01

The international ICALIC initiative aims at developing a new indirect calorimeter according to the needs of the clinicians and researchers in the field of clinical nutrition and metabolism. The project initially focuses on validating the calorimeter for use in mechanically ventilated acutely ill adult patients. However, standard methods to validate the accuracy of calorimeters have not yet been established. This paper describes the procedures for the in-vitro tests to validate the accuracy of the new indirect calorimeter, and defines the ranges for the parameters to be evaluated in each test to optimize the validation for clinical and research calorimetry measurements. Two in-vitro tests have been defined to validate the accuracy of the gas analyzers and the overall function of the new calorimeter. 1) Gas composition analysis allows validating the accuracy of O2 and CO2 analyzers. Reference gas of known O2 (or CO2) concentration is diluted by pure nitrogen gas to achieve predefined O2 (or CO2) concentration, to be measured by the indirect calorimeter. O2 and CO2 concentrations to be tested were determined according to their expected ranges of concentrations during calorimetry measurements. 2) Gas exchange simulator analysis validates O2 consumption (VO2) and CO2 production (VCO2) measurements. CO2 gas injection into artificial breath gas provided by the mechanical ventilator simulates VCO2. Resulting dilution of O2 concentration in the expiratory air is analyzed by the calorimeter as VO2. CO2 gas of identical concentration to the fraction of inspired O2 (FiO2) is used to simulate identical VO2 and VCO2. Indirect calorimetry results from publications were analyzed to determine the VO2 and VCO2 values to be tested for the validation. O2 concentration in respiratory air is highest at inspiration, and can decrease to 15% during expiration. CO2 concentration can be as high as 5% in expired air. To validate analyzers for measurements of FiO2 up to 70%, ranges of O2 and CO2 concentrations to be tested were defined as 15-70% and 0.5-5.0%, respectively. The mean VO2 in 426 adult mechanically ventilated patients was 270 ml/min, with 2 standard deviation (SD) ranges of 150-391 ml/min. Thus, VO2 and VCO2 to be simulated for the validation were defined as 150, 250, and 400 ml/min. The procedures for the in-vitro tests of the new indirect calorimeter and the ranges for the parameters to be evaluated in each test have been defined to optimize the validation of accuracy for clinical and research indirect calorimetry measurements. The combined methods will be used to validate the accuracy of the new indirect calorimeter developed by the ICALIC initiative, and should become the standard method to validate the accuracy of any future indirect calorimeters. Copyright © 2017 European Society for Clinical Nutrition and Metabolism. Published by Elsevier Ltd. All rights reserved.
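The dilution step and the 2 SD range quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below assumes ideal mixing of a reference gas stream with pure nitrogen by flow ratio; the function names, the 70% reference gas, and the 1.0 L/min flow are hypothetical choices for illustration and do not describe the ICALIC test bench.

```python
def diluted_fraction(ref_fraction, ref_flow, n2_flow):
    """Gas fraction after mixing a reference stream with pure nitrogen."""
    total_flow = ref_flow + n2_flow
    if total_flow <= 0:
        raise ValueError("total flow must be positive")
    return ref_fraction * ref_flow / total_flow


def n2_flow_for_target(ref_fraction, ref_flow, target_fraction):
    """Nitrogen flow needed to dilute the reference gas to a target fraction."""
    if not 0 < target_fraction <= ref_fraction:
        raise ValueError("target must be positive and no higher than the reference")
    return ref_flow * (ref_fraction / target_fraction - 1.0)


# Hypothetical setup: 70% O2 reference gas at 1.0 L/min, diluted to 15% O2.
n2_flow = n2_flow_for_target(0.70, 1.0, 0.15)
print(round(n2_flow, 3), round(diluted_fraction(0.70, 1.0, n2_flow), 3))

# Consistency check on the reported VO2 range: if 150-391 ml/min spans
# mean +/- 2 SD, the implied SD is a quarter of the width of that range.
mean_vo2 = 270.0
implied_sd = (391.0 - 150.0) / 4.0
lower, upper = mean_vo2 - 2 * implied_sd, mean_vo2 + 2 * implied_sd
print(implied_sd, lower, upper)
```

The implied SD of about 60 ml/min reproduces the quoted range to within rounding, which is consistent with the choice of 150, 250, and 400 ml/min as test points covering that interval.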
Development and Initial Validation of Perceived Competence and Satisfaction Measures for Racquet Sports.

ERIC Educational Resources Information Center

Aguilar, Teresita E.; Petrakis, Elizabeth

1989-01-01

The development and initial validation of the Racquet Sports Competence-Satisfaction Scale for measuring perceived competence and satisfaction in badminton, racquetball, and tennis is described. Results of a review panel and two field tests (with 168 and 208 university students) support the validity of the competence and satisfaction measures.…
Assessment of Irrational Beliefs: The Question of Discriminant Validity.

ERIC Educational Resources Information Center

Smith, Timothy W.; Zurawski, Raymond M.

1983-01-01

Evaluated discriminant validity in frequently used measures of irrational beliefs relative to measures of trait anxiety in college students (N=142). Results showed discriminant validity in the Rational Behavior Inventory but not in the Irrational Beliefs Test and correlated cognitive rather than somatic aspects of trait anxiety with both measures.…
Associations among Classroom Emotional Processes, Student Interest, and Engagement: A Convergent Validity Test

ERIC Educational Resources Information Center

Mazer, Joseph P.

2017-01-01

The results of this study compile convergent validity evidence for the Student Interest Scale and Student Engagement Scale through associations among emotional support, emotion work, student interest, and engagement. Confirmatory factor analysis indicates that the factor structures of the measures are stable, reliable, and valid. The results…
The Validity of the Comparative Interrupted Time Series Design for Evaluating the Effect of School-Level Interventions.

PubMed

Jacob, Robin; Somers, Marie-Andree; Zhu, Pei; Bloom, Howard

2016-06-01

In this article, we examine whether a well-executed comparative interrupted time series (CITS) design can produce valid inferences about the effectiveness of a school-level intervention. This article also explores the trade-off between bias reduction and precision loss across different methods of selecting comparison groups for the CITS design and assesses whether choosing matched comparison schools based only on preintervention test scores is sufficient to produce internally valid impact estimates. We conduct a validation study of the CITS design based on the federal Reading First program as implemented in one state using results from a regression discontinuity design as a causal benchmark. Our results contribute to the growing base of evidence regarding the validity of nonexperimental designs. We demonstrate that the CITS design can, in our example, produce internally valid estimates of program impacts when multiple years of preintervention outcome data (test scores in the present case) are available and when a set of reasonable criteria are used to select comparison organizations (schools in the present case). © The Author(s) 2016.
Digital Divide Measurement in Lembata Regency Using SIBIS

NASA Astrophysics Data System (ADS)

Gabriel, Cecilia Dai Payon Binti; Setyohadi, Djoko Budiyanto; Suyoto

2018-02-01

Along with technological development in Indonesia, a digital divide has emerged in various regions that lag behind in knowledge of how to use, access, and utilize ICT to collect information from the internet. One such region is Lembata Regency in East Nusa Tenggara, where the digital divide among the people should be measured. The purpose of this study was to determine the level of digital divide among the people of Lembata Regency. To determine the level of digital divide, we used the SIBIS GPS (General Population Survey) method, which consists of several indicators or aspects, i.e., internet usage behavior, internet utilization, and e-government. We also performed two tests, i.e., a validity test and a reliability test, to obtain the value of the digital divide measurement index among the people of Lembata Regency. The results of the validity test, processed using the SPSS program, are categorized as valid for each variable indicator, and the reliability test results show reliable status. According to the test results on digital discrepancy among Lembata people, the internet usage attitude indicator is categorized as low at 63.1%, the internet usage function indicator is categorized as low at 64%, and the digital discrepancy of the e-government indicator is categorized as medium at 40.4%. Therefore, the results of this study can serve as a consideration for the government of Lembata Regency in improving ICT services in e-government and in distributing ICT access and ability equally to the people.
Testing Math or Testing Language? The Construct Validity of the KeyMath-Revised for Children With Intellectual Disability and Language Difficulties.

PubMed

Rhodes, Katherine T; Branum-Martin, Lee; Morris, Robin D; Romski, MaryAnn; Sevcik, Rose A

2015-11-01

Although it is often assumed that mathematics ability alone predicts mathematics test performance, linguistic demands may also predict achievement. This study examined the role of language in mathematics assessment performance for children with intellectual disability (ID) at less severe levels, on the KeyMath-Revised Inventory (KM-R) with a sample of 264 children, in grades 2-5. Using confirmatory factor analysis, the hypothesis that the KM-R would demonstrate discriminant validity with measures of language abilities in a two-factor model was compared to two plausible alternative models. Results indicated that KM-R did not have discriminant validity with measures of children's language abilities and was a multidimensional test of both mathematics and language abilities for this population of test users. Implications are considered for test development, interpretation, and intervention.
Development and Validation of Methodology to Model Flow in Ventilation Systems Commonly Found in Nuclear Facilities. Phase I

DOE Office of Scientific and Technical Information (OSTI.GOV)

Strons, Philip; Bailey, James L.; Davis, John

2016-03-01

In this work, we apply CFD to model airflow and particulate transport. This modeling is then compared to field validation studies to both inform and validate the modeling assumptions. Based on the results of field tests, modeling assumptions and boundary conditions are refined and the process is repeated until the results are found to be reliable with a high level of confidence.
Recent Advances in Simulation of Eddy Current Testing of Tubes and Experimental Validations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reboud, C.; Premel, D.; Lesselier, D.

2007-03-21

Eddy current testing (ECT) is widely used in the iron and steel industry for the inspection of tubes during manufacturing. A collaboration between CEA and the Vallourec Research Center led to the development of new numerical functionalities dedicated to the simulation of ECT of non-magnetic tubes by external probes. The achievement of experimental validations led us to the integration of these models into the CIVA platform. The modeling approach and validation results are discussed here. A new numerical scheme is also proposed in order to improve the accuracy of the model.
Recent Advances in Simulation of Eddy Current Testing of Tubes and Experimental Validations

NASA Astrophysics Data System (ADS)

Reboud, C.; Prémel, D.; Lesselier, D.; Bisiaux, B.

2007-03-01

Eddy current testing (ECT) is widely used in the iron and steel industry for the inspection of tubes during manufacturing. A collaboration between CEA and the Vallourec Research Center led to the development of new numerical functionalities dedicated to the simulation of ECT of non-magnetic tubes by external probes. The achievement of experimental validations led us to the integration of these models into the CIVA platform. The modeling approach and validation results are discussed here. A new numerical scheme is also proposed in order to improve the accuracy of the model.
Validation and Verification of Composite Pressure Vessel Design

NASA Technical Reports Server (NTRS)

Kreger, Stephen T.; Ortyl, Nicholas; Grant, Joseph; Taylor, F. Tad

2006-01-01

Ten composite pressure vessels were instrumented with fiber Bragg grating sensors and pressure tested through burst. This paper and presentation will discuss the testing methodology and the test results, compare the test results to the analytical model, and also compare the fiber Bragg grating sensor data with that obtained from foil strain gages.
Measuring the needs of mental health patients in Greece: reliability and validity of the Greek version of the Camberwell assessment of need.

PubMed

Stefanatou, Pentagiotissa; Giannouli, Eleni; Konstantakopoulos, George; Vitoratou, Silia; Mavreas, Venetsanos

2014-11-01

Evaluation of mental health services based on patients' needs assessments has never taken place in Greece, although it is a crucial factor for the efficient use of their limited resources. To examine the inter-rater and test-retest reliability and the concurrent/convergent validity of the Greek research version of the Camberwell Assessment of Need-Research (CAN-R). A total of 53 schizophrenic patient-staff pairs were interviewed twice to test the inter-rater and test-retest reliability of the Greek version of the CAN-R. The World Health Organization Quality of Life-Brief Form (WHOQOL-BREF) and World Health Organization Disability Assessment Schedule-2.0 (WHODAS-2.0) were administered to the patients to examine concurrent validity. The inter-rater and test-retest reliability of patient and staff interviews for the 22 individual items and the eight summary scores of the instrument's four sections were good to excellent. Significant correlations emerged between CAN scores and the WHOQOL-BREF and WHODAS-2.0 domains for both patient and staff ratings, indicating good concurrent validity. Our results suggest that the Greek version of the CAN-R is a reliable instrument for assessing mental health patients' needs. Moreover, it is the first CAN-R validity study with satisfactory results using WHOQOL-BREF and WHODAS-2.0 as criterion variables. © The Author(s) 2013.
Anatomy of a physics test: Validation of the physics items on the Texas Assessment of Knowledge and Skills

NASA Astrophysics Data System (ADS)

Marshall, Jill A.; Hagedorn, Eric A.; O'Connor, Jerry

2009-06-01

We report the results of an analysis of the Texas Assessment of Knowledge and Skills (TAKS) designed to determine whether the TAKS is a valid indicator of whether students know and can do physics at the level necessary for success in future coursework, STEM careers, and life in a technological society. We categorized science items from the 2003 and 2004 10th and 11th grade TAKS by content area(s) covered, knowledge and skills required to select the correct answer, and overall quality. We also analyzed a 5000 student sample of item-level results from the 2004 11th grade exam, performing full-information factor analysis, calculating classical test indices, and determining each item's response curve using item response theory. Triangulation of our results revealed strengths and weaknesses of the different methods of analysis. The TAKS was found to be only weakly indicative of physics preparation and we make recommendations for increasing the validity of standardized physics testing.
Monitoring sedation status over time in ICU patients: reliability and validity of the Richmond Agitation-Sedation Scale (RASS).

PubMed

Ely, E Wesley; Truman, Brenda; Shintani, Ayumi; Thomason, Jason W W; Wheeler, Arthur P; Gordon, Sharon; Francis, Joseph; Speroff, Theodore; Gautam, Shiva; Margolin, Richard; Sessler, Curtis N; Dittus, Robert S; Bernard, Gordon R

2003-06-11

Goal-directed delivery of sedative and analgesic medications is recommended as standard care in intensive care units (ICUs) because of the impact these medications have on ventilator weaning and ICU length of stay, but few of the available sedation scales have been appropriately tested for reliability and validity. To test the reliability and validity of the Richmond Agitation-Sedation Scale (RASS). Prospective cohort study. Adult medical and coronary ICUs of a university-based medical center. Thirty-eight medical ICU patients enrolled for reliability testing (46% receiving mechanical ventilation) from July 21, 1999, to September 7, 1999, and an independent cohort of 275 patients receiving mechanical ventilation were enrolled for validity testing from February 1, 2000, to May 3, 2001. Interrater reliability of the RASS, Glasgow Coma Scale (GCS), and Ramsay Scale (RS); validity of the RASS correlated with reference standard ratings, assessments of content of consciousness, GCS scores, doses of sedatives and analgesics, and bispectral electroencephalography. In 290-paired observations by nurses, results of both the RASS and RS demonstrated excellent interrater reliability (weighted kappa, 0.91 and 0.94, respectively), which were both superior to the GCS (weighted kappa, 0.64; P<.001 for both comparisons). Criterion validity was tested in 411-paired observations in the first 96 patients of the validation cohort, in whom the RASS showed significant differences between levels of consciousness (P<.001 for all) and correctly identified fluctuations within patients over time (P<.001). 
In addition, 5 methods were used to test the construct validity of the RASS, including correlation with an attention screening examination (r = 0.78, P<.001), GCS scores (r = 0.91, P<.001), quantity of different psychoactive medication dosages 8 hours prior to assessment (eg, lorazepam: r = - 0.31, P<.001), successful extubation (P =.07), and bispectral electroencephalography (r = 0.63, P<.001). Face validity was demonstrated via a survey of 26 critical care nurses, the results of which showed that 92% agreed or strongly agreed with the RASS scoring scheme, and 81% agreed or strongly agreed that the instrument provided a consensus for goal-directed delivery of medications. The RASS demonstrated excellent interrater reliability and criterion, construct, and face validity. This is the first sedation scale to be validated for its ability to detect changes in sedation status over consecutive days of ICU care, against constructs of level of consciousness and delirium, and correlated with the administered dose of sedative and analgesic medications.
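The interrater reliability figures in the abstract above are weighted kappa values. As a minimal sketch of how such a coefficient can be computed for two raters on an ordinal scale, the code below uses disagreement weights that grow with the distance between categories. The abstract does not say whether linear or quadratic weights were used, so both are offered as an assumption, and the nurse ratings shown are invented for illustration.

```python
def weighted_kappa(rater_a, rater_b, categories, quadratic=False):
    """Cohen's weighted kappa for two raters on an ordered category list."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    index = {category: i for i, category in enumerate(categories)}
    k, n = len(categories), len(rater_a)
    # Observed proportion of paired ratings falling in each cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 2 if quadratic else 1
    observed_disagreement = expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) ** power
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical RASS scores (-5 to +4) given by two nurses to six patients.
rass_levels = list(range(-5, 5))
nurse_1 = [-2, -1, 0, 0, 1, -3]
nurse_2 = [-2, -1, 0, 1, 1, -3]
print(round(weighted_kappa(nurse_1, nurse_2, rass_levels), 3))
```

Weighting matters on a scale such as the RASS because a one-level disagreement is penalized less than a disagreement of several levels, which an unweighted kappa would treat identically.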
Validation of the Physics Analysis used to Characterize the AGR-1 TRISO Fuel Irradiation Test

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sterbentz, James W.; Harp, Jason M.; Demkowicz, Paul A.

2015-05-01

The results of a detailed physics depletion calculation used to characterize the AGR-1 TRISO-coated particle fuel test irradiated in the Advanced Test Reactor (ATR) at the Idaho National Laboratory are compared to measured data for the purpose of validation. The particle fuel was irradiated for 13 ATR power cycles over three calendar years. The physics analysis predicts compact burnups ranging from 11.30-19.56% FIMA and cumulative neutron fast fluence from 2.21-4.39E+25 n/m^2 under simulated high-temperature gas-cooled reactor conditions in the ATR. The physics depletion calculation can provide a full characterization of all 72 irradiated TRISO-coated particle compacts during and post-irradiation, so validation of this physics calculation was a top priority. The validation of the physics analysis was done through comparisons with available measured experimental data which included: 1) high-resolution gamma scans for compact activity and burnup, 2) mass spectrometry for compact burnup, 3) flux wires for cumulative fast fluence, and 4) mass spectrometry for individual actinide and fission product concentrations. The measured data are generally in very good agreement with the calculated results, and therefore provide an adequate validation of the physics analysis and the results used to characterize the irradiated AGR-1 TRISO fuel.
Validation of a virtual reality-based simulator for shoulder arthroscopy.

PubMed

Rahm, Stefan; Germann, Marco; Hingsammer, Andreas; Wieser, Karl; Gerber, Christian

2016-05-01

This study aimed to determine the face and construct validity of a new virtual reality-based shoulder arthroscopy simulator which uses passive haptic feedback. Fifty-one participants including 25 novices (<20 shoulder arthroscopies) and 26 experts (>100 shoulder arthroscopies) completed two tests: for assessment of face validity, a questionnaire was filled out concerning quality of simulated reality and training potential using a 7-point Likert scale (range 1-7). Construct validity was tested by comparing simulator metrics (operation time in seconds, camera and grasper pathway in centimetres, and grasper openings) between novices' and experts' test results. Overall simulated reality was rated high with a median value of 5.5 (range 2.8-7) points. Training capacity scored a median value of 5.8 (range 3-7) points. Experts were significantly faster in the diagnostic test with a median of 91 (range 37-208) s than novices with 1177 (range 81-383) s (p < 0.0001) and in the therapeutic test 102 (range 58-283) s versus 229 (range 114-399) s (p < 0.0001). Similar results were seen in the other metric values except in the camera pathway in the therapeutic test. The tested simulator achieved high scores in terms of realism and training capability. It reliably discriminated between novices and experts. Further improvements of the simulator, especially in the field of therapeutic arthroscopy, might improve its value as a training and assessment tool for shoulder arthroscopy skills. Level of evidence: II.
European Portuguese adaptation and validation of dilemmas used to assess moral decision-making.

PubMed

Fernandes, Carina; Gonçalves, Ana Ribeiro; Pasion, Rita; Ferreira-Santos, Fernando; Paiva, Tiago Oliveira; Melo E Castro, Joana; Barbosa, Fernando; Martins, Isabel Pavão; Marques-Teixeira, João

2018-03-01

Objective: To adapt and validate a widely used set of moral dilemmas to European Portuguese, which can be applied to assess decision-making. Moreover, the classical formulation of the dilemmas was compared with a more focused moral probe. Finally, a shorter version of the moral scenarios was tested. Methods: The Portuguese version of the set of moral dilemmas was tested in 53 individuals from several regions of Portugal. In a second study, an alternative way of questioning on moral dilemmas was tested in 41 participants. Finally, the shorter version of the moral dilemmas was tested in 137 individuals. Results: Results evidenced no significant differences between English and Portuguese versions. Also, asking whether actions are "morally acceptable" elicited less utilitarian responses than the original question, although without reaching statistical significance. Finally, all tested versions of moral dilemmas exhibited the same pattern of responses, suggesting that the fundamental elements of moral decision-making were preserved. Conclusions: We found evidence of cross-cultural validity for moral dilemmas. However, the moral focus might affect utilitarian/deontological judgments.
Reverberation Chamber Uniformity Validation and Radiated Susceptibility Test Procedures for the NASA High Intensity Radiated Fields Laboratory

NASA Technical Reports Server (NTRS)

Koppen, Sandra V.; Nguyen, Truong X.; Mielnik, John J.

2010-01-01

The NASA Langley Research Center's High Intensity Radiated Fields Laboratory has developed a capability based on the RTCA/DO-160F Section 20 guidelines for radiated electromagnetic susceptibility testing in reverberation chambers. Phase 1 of the test procedure utilizes mode-tuned stirrer techniques and E-field probe measurements to validate chamber uniformity, determines chamber loading effects, and defines a radiated susceptibility test process. The test procedure is segmented into numbered operations that are largely software controlled. This document is intended as a laboratory test reference and includes diagrams of test setups, equipment lists, as well as test results and analysis. Phase 2 of development is discussed.
Development and Validation of a New Questionnaire Assessing Quality of Life in Adults with Hypopituitarism: Adult Hypopituitarism Questionnaire (AHQ)

PubMed Central

Ishii, Hitoshi; Shimatsu, Akira; Okimura, Yasuhiko; Tanaka, Toshiaki; Hizuka, Naomi; Kaji, Hidesuke; Hanew, Kunihiko; Oki, Yutaka; Yamashiro, Sayuri; Takano, Koji; Chihara, Kazuo

2012-01-01

Objective: To develop and validate the Adult Hypopituitarism Questionnaire (AHQ) as a disease-specific, self-administered questionnaire for evaluation of quality of life (QOL) in adult patients with hypopituitarism. Methods: We developed and validated this new questionnaire, using a standardized procedure which included item development, pilot-testing and psychometric validation. Of the patients who participated in psychometric validation, those whose clinical conditions were judged to be stable were asked to answer the survey questionnaire twice, in order to assess test-retest reliability. Results: Content validity of the initial questionnaire was evaluated via two pilot tests. After these tests, we made minor revisions and finalized the initial version of the questionnaire. The questionnaire was constructed with two domains, one psycho-social and the other physical. For psychometric assessment, analyses were performed on the responses of 192 adult patients with various types of hypopituitarism. The intraclass correlations of the respective domains were 0.91 and 0.95, and the Cronbach's alpha coefficients were 0.96 and 0.95, indicating adequate test-retest reliability and internal consistency for each domain. For known-group validity, patients with hypopituitarism due to hypothalamic disorder showed significantly lower scores in 11 out of 13 sub-domains compared to those who had hypopituitarism due to pituitary disorder. Regarding construct validity, the domain structure was found to be almost the same as that initially hypothesized. Exploratory factor analysis (n = 228) demonstrated that the two domains consisted of six and seven sub-domains, respectively. Conclusion: The AHQ showed good reliability and validity for evaluating QOL in adult patients with hypopituitarism. PMID:22984490
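The internal consistency figures in the abstract above are Cronbach's alpha coefficients. The sketch below shows the standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), applied to a small response matrix. The function name and the responses are hypothetical and are not AHQ data.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per respondent."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Transpose so that each tuple holds all respondents' scores for one item.
    items = list(zip(*responses))
    sum_item_variances = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical answers from four respondents to a three-item sub-domain.
answers = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(answers), 3))
```

Alpha rises when items move together across respondents, so values such as the 0.96 and 0.95 reported above indicate that the items within each domain are strongly intercorrelated.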
Test-retest reliability and construct validity of the ENERGY-parent questionnaire on parenting practices, energy balance-related behaviours and their potential behavioural determinants: the ENERGY-project.

PubMed

Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes

2012-08-13

Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.

Validity and Reliability of Baseline Testing in a Standardized Environment.

PubMed

Higgins, Kathryn L; Caze, Todd; Maerlender, Arthur

2017-08-11

The Immediate Postconcussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery commonly used to determine cognitive recovery from concussion based on comparing post-injury scores to baseline scores. This model is based on the premise that ImPACT baseline test scores are a valid and reliable measure of optimal cognitive function at baseline. Growing evidence suggests that this premise may not be accurate and a large contributor to invalid and unreliable baseline test scores may be the protocol and environment in which baseline tests are administered. This study examined the effects of a standardized environment and administration protocol on the reliability and performance validity of athletes' baseline test scores on ImPACT by comparing scores obtained in two different group-testing settings. Three hundred sixty-one Division 1 cohort-matched collegiate athletes' baseline data were assessed using a variety of indicators of potential performance invalidity; internal reliability was also examined. Thirty-one to thirty-nine percent of the baseline cases had at least one indicator of low performance validity, but there were no significant differences in validity indicators based on environment in which the testing was conducted. Internal consistency reliability scores were in the acceptable to good range, with no significant differences between administration conditions. These results suggest that athletes may be reliably performing at levels lower than their best effort would produce. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Recovery Act. Development and Validation of an Advanced Stimulation Prediction Model for Enhanced Geothermal System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gutierrez, Marte

The research project aims to develop and validate an advanced computer model that can be used in the planning and design of stimulation techniques to create engineered reservoirs for Enhanced Geothermal Systems. The specific objectives of the proposal are to: 1) Develop a true three-dimensional hydro-thermal fracturing simulator that is particularly suited for EGS reservoir creation. 2) Perform laboratory scale model tests of hydraulic fracturing and proppant flow/transport using a polyaxial loading device, and use the laboratory results to test and validate the 3D simulator. 3) Perform discrete element/particulate modeling of proppant transport in hydraulic fractures, and use the results to improve understanding of proppant flow and transport. 4) Test and validate the 3D hydro-thermal fracturing simulator against case histories of EGS energy production. 5) Develop a plan to commercialize the 3D fracturing and proppant flow/transport simulator. The project is expected to yield several specific results and benefits. Major technical products from the proposal include: 1) A true-3D hydro-thermal fracturing computer code that is particularly suited to EGS, 2) Documented results of scale model tests on hydro-thermal fracturing and fracture propping in an analogue crystalline rock, 3) Documented procedures and results of discrete element/particulate modeling of flow and transport of proppants for EGS applications, and 4) Database of monitoring data, with a focus on Acoustic Emissions (AE) from lab scale modeling and field case histories of EGS reservoir creation.
Validation in the Absence of Observed Events

DOE PAGES

Lathrop, John; Ezell, Barry

2015-07-22

Here our paper addresses the problem of validating models in the absence of observed events, in the area of Weapons of Mass Destruction terrorism risk assessment. We address that problem with a broadened definition of “Validation,” based on “backing up” to the reason why modelers and decision makers seek validation, and from that basis re-define validation as testing how well the model can advise decision makers in terrorism risk management decisions. We develop that into two conditions: Validation must be based on cues available in the observable world; and it must focus on what can be done to affect that observable world, i.e. risk management. That in turn leads to two foci: 1.) the risk generating process, 2.) best use of available data. Based on our experience with nine WMD terrorism risk assessment models, we then describe three best use of available data pitfalls: SME confidence bias, lack of SME cross-referencing, and problematic initiation rates. Those two foci and three pitfalls provide a basis from which we define validation in this context in terms of four tests -- Does the model: … capture initiation? … capture the sequence of events by which attack scenarios unfold? … consider unanticipated scenarios? … consider alternative causal chains? Finally, we corroborate our approach against three key validation tests from the DOD literature: Is the model a correct representation of the simuland? To what degree are the model results comparable to the real world? Over what range of inputs are the model results useful?
Validation in the Absence of Observed Events.

PubMed

Lathrop, John; Ezell, Barry

2016-04-01

This article addresses the problem of validating models in the absence of observed events, in the area of weapons of mass destruction terrorism risk assessment. We address that problem with a broadened definition of "validation," based on stepping "up" a level to considering the reason why decisionmakers seek validation, and from that basis redefine validation as testing how well the model can advise decisionmakers in terrorism risk management decisions. We develop that into two conditions: validation must be based on cues available in the observable world; and it must focus on what can be done to affect that observable world, i.e., risk management. That leads to two foci: (1) the real-world risk generating process, and (2) best use of available data. Based on our experience with nine WMD terrorism risk assessment models, we then describe three best use of available data pitfalls: SME confidence bias, lack of SME cross-referencing, and problematic initiation rates. Those two foci and three pitfalls provide a basis from which we define validation in this context in terms of four tests--Does the model: … capture initiation? … capture the sequence of events by which attack scenarios unfold? … consider unanticipated scenarios? … consider alternative causal chains? Finally, we corroborate our approach against three validation tests from the DOD literature: Is the model a correct representation of the process to be simulated? To what degree are the model results comparable to the real world? Over what range of inputs are the model results useful? © 2015 Society for Risk Analysis.
Investigation of different modeling approaches for computational fluid dynamics simulation of high-pressure rocket combustors

NASA Astrophysics Data System (ADS)

Ivancic, B.; Riedmann, H.; Frey, M.; Knab, O.; Karl, S.; Hannemann, K.

2016-07-01

The paper summarizes technical results and first highlights of the cooperation between DLR and Airbus Defence and Space (DS) within the work package "CFD Modeling of Combustion Chamber Processes" conducted in the frame of the Propulsion 2020 Project. Within the addressed work package, DLR Göttingen and Airbus DS Ottobrunn have identified several test cases where adequate test data are available and which can be used for proper validation of the computational fluid dynamics (CFD) tools. In this paper, the first test case, the Penn State chamber (RCM1), is discussed. Presenting the simulation results from three different tools, it is shown that the test case can be computed properly with steady-state Reynolds-averaged Navier-Stokes (RANS) approaches. The achieved simulation results reproduce the measured wall heat flux as an important validation parameter very well but also reveal some inconsistencies in the test data which are addressed in this paper.
Development and validation of a short version of the Partnership Self-Assessment Tool (PSAT) among professionals in Dutch disease-management partnerships

PubMed Central

2011-01-01

Background The extent to which partnership synergy is created within quality improvement programmes in the Netherlands is unknown. In this article, we describe the psychometric testing of the Partnership Self-Assessment Tool (PSAT) among professionals in twenty-two disease-management partnerships participating in quality improvement projects focused on chronic care in the Netherlands. Our objectives are to validate the PSAT in the Netherlands and to reduce the number of items of the original PSAT while maintaining validity and reliability. Methods The Dutch version of the PSAT was tested in twenty-two disease-management partnerships with 218 professionals. We tested the instrument by means of structural equation modelling, and examined its validity and reliability. Results After eliminating 14 items, the confirmatory factor analyses revealed good indices of fit with the resulting 15-item PSAT-Short version (PSAT-S). Internal consistency as represented by Cronbach's alpha ranged from acceptable (0.75) for the 'efficiency' subscale to excellent for the 'leadership' subscale (0.87). Convergent validity was provided with high correlations of the partnership dimensions and partnership synergy (ranged from 0.512 to 0.609) and high correlations with chronic illness care (ranged from 0.447 to 0.329). Conclusion The psychometric properties and convergent validity of the PSAT-S were satisfactory rendering it a valid and reliable instrument for assessing partnership synergy and its dimensions of partnership functioning. PMID:21714931
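The internal-consistency statistic reported in this abstract, Cronbach's alpha, can be illustrated with a short sketch. This is a minimal Python example using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name cronbach_alpha and the toy scores are hypothetical and are not data from the PSAT study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a subscale.

    items: one list of respondent scores per questionnaire item,
    all lists in the same respondent order (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent across all items of the subscale.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy data: two items answered by four respondents (not from the study).
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

Sample variances are used for both the items and the totals; using population variances throughout would give the same alpha, because the scaling factor cancels in the ratio.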
Assessing Perceptions AbouT Hazardous Substances (PATHS): The PATHS questionnaire

PubMed Central

Amlôt, Richard; Page, Lisa; Pearce, Julia; Wessely, Simon

2013-01-01

How people perceive the nature of a hazardous substance may determine how they respond when potentially exposed to it. We tested a new Perceptions AbouT Hazardous Substances (PATHS) questionnaire. In Study 1 (N = 21), we assessed the face validity of items concerning perceptions about eight properties of a hazardous substance. In Study 2 (N = 2030), we tested the factor structure, reliability and validity of the PATHS questionnaire across four qualitatively different substances. In Study 3 (N = 760), we tested the impact of information provision on Perceptions AbouT Hazardous Substances scores. Our results showed that our eight measures demonstrated good reliability and validity when used for non-contagious hazards. PMID:23104995
Validation of Physics Standardized Test Items

NASA Astrophysics Data System (ADS)

Marshall, Jill

2008-10-01

The Texas Physics Assessment Team (TPAT) examined the Texas Assessment of Knowledge and Skills (TAKS) to determine whether it is a valid indicator of physics preparation for future course work and employment, and of the knowledge and skills needed to act as an informed citizen in a technological society. We categorized science items from the 2003 and 2004 10th and 11th grade TAKS by content area(s) covered, knowledge and skills required to select the correct answer, and overall quality. We also analyzed a 5000 student sample of item-level results from the 2004 11th grade exam using standard statistical methods employed by test developers (factor analysis and Item Response Theory). Triangulation of our results revealed strengths and weaknesses of the different methods of analysis. The TAKS was found to be only weakly indicative of physics preparation and we make recommendations for increasing the validity of standardized physics testing.
Development and validation of a new screening questionnaire for dysphagia in early stages of Parkinson's disease.

PubMed

Simons, Janine A; Fietzek, Urban M; Waldmann, Annika; Warnecke, Tobias; Schuster, Tibor; Ceballos-Baumann, Andrés O

2014-09-01

Dysphagia in patients with Parkinson's disease (PD) significantly reduces quality of life and predicted lifetime. Current screening procedures are insufficiently evaluated. We aimed to develop and validate a patient-reported outcome questionnaire for early diagnosis of dysphagia in patients with PD. The two-phased project comprised the questionnaire, diagnostic scales construction (N = 105), and a validation study (N = 82). Data for the project were gathered from PD patients at a German Movement Disorder Center. For validation purposes, a clinical evaluation focusing on swallowing tests, tests of sensory reflexes, and fiberoptic endoscopic evaluation of swallowing (FEES) was performed that yielded a criteria sum score against which the results of the questionnaire were compared. Specificity and sensitivity were evaluated for the detection of noticeable dysphagia and for the risk of aspiration. The Munich Dysphagia Test - Parkinson's disease (MDT-PD) consists of 26 items that show high internal consistency (α = 0.91). For the validation study, 82 patients, aged 70.9 ± 8.7 (mean ± SD), with a median Hoehn & Yahr stage of 3, were assessed. 73% of patients had dysphagia with noticeable oropharyngeal symptoms (44%) or with penetration/aspiration (29%). The criteria sum score correlated positively with the screening result (r = 0.70, p < 0.001). The MDT-PD sum score classified not noticeable dysphagia vs. risk of aspiration (noticeable dysphagia) with a sensitivity of 90% (82%) and a specificity of 86% (71%), and yielded similar results in cross-validation, respectively. MDT-PD is a valid screening tool for early diagnosis of swallowing problems and aspiration risk, as well as initial graduation of dysphagia severity in PD patients. Copyright © 2014 Elsevier Ltd. All rights reserved.
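Sensitivity and specificity, as used in this abstract to compare a screening result with a clinical reference standard, can be sketched as follows. This is a minimal Python illustration; the function name sensitivity_specificity and the boolean-list layout are assumptions, and no patient data from the study are reproduced.

```python
def sensitivity_specificity(screen_positive, reference_positive):
    """Return (sensitivity, specificity) of a screening test.

    Both arguments are equal-length lists of booleans, one entry per patient:
    screen_positive is the questionnaire result, reference_positive is the
    clinical reference standard (hypothetical layout).
    """
    if len(screen_positive) != len(reference_positive):
        raise ValueError("both lists must have one entry per patient")
    pairs = list(zip(screen_positive, reference_positive))
    tp = sum(1 for s, r in pairs if s and r)          # true positives
    fn = sum(1 for s, r in pairs if not s and r)      # false negatives
    tn = sum(1 for s, r in pairs if not s and not r)  # true negatives
    fp = sum(1 for s, r in pairs if s and not r)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

Sensitivity is the share of reference-positive patients that the screen flags, and specificity is the share of reference-negative patients that the screen clears.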
Validation of the SimSET simulation package for modeling the Siemens Biograph mCT PET scanner

NASA Astrophysics Data System (ADS)

Poon, Jonathan K.; Dahlbom, Magnus L.; Casey, Michael E.; Qi, Jinyi; Cherry, Simon R.; Badawi, Ramsey D.

2015-02-01

Monte Carlo simulation provides a valuable tool in performance assessment and optimization of system design parameters for PET scanners. SimSET is a popular Monte Carlo simulation toolkit that features fast simulation time, as well as variance reduction tools to further enhance computational efficiency. However, SimSET has lacked the ability to simulate block detectors until its most recent release. Our goal is to validate new features of SimSET by developing a simulation model of the Siemens Biograph mCT PET scanner and comparing the results to a simulation model developed in the GATE simulation suite and to experimental results. We used the NEMA NU-2 2007 scatter fraction, count rates, and spatial resolution protocols to validate the SimSET simulation model and its new features. The SimSET model overestimated the experimental results of the count rate tests by 11-23% and the spatial resolution test by 13-28%, which is comparable to previous validation studies of other PET scanners in the literature. The difference between the SimSET and GATE simulation was approximately 4-8% for the count rate test and approximately 3-11% for the spatial resolution test. In terms of computational time, SimSET performed simulations approximately 11 times faster than GATE simulations. The new block detector model in SimSET offers a fast and reasonably accurate simulation toolkit for PET imaging applications.
Extinguishing agent for magnesium fire, phases 5 and 6

NASA Astrophysics Data System (ADS)

Beeson, H. D.; Tapscott, R. E.; Mason, B. E.

1987-07-01

This report documents the validation testing of the extinguishing system for metal fires developed as part of Phases 1 to 4. The results of this validation testing form the basis of information from which draft military specifications necessary to procure the agent and the agent delivery system may be developed. The developed system was tested against a variety of large-scale metal fire scenarios and the capabilities of the system were assessed. In addition the response of the system to storage and to changes in ambient conditions was tested. Results of this testing revealed that the developed system represented a reliable metal fire extinguishing system that could control and extinguish very large metal fires. The specifications developed for the agent and for the delivery system are discussed in detail.
Exploring the validity of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) with established emotions measures.

PubMed

Roberts, Richard D; Schulze, Ralf; O'Brien, Kristin; MacCann, Carolyn; Reid, John; Maul, Andy

2006-11-01

Emotions measures represent an important means of obtaining construct validity evidence for emotional intelligence (EI) tests because they have the same theoretical underpinnings. Additionally, the extent to which both emotions and EI measures relate to intelligence is poorly understood. The current study was designed to address these issues. Participants (N = 138) completed the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT), two emotions measures, as well as four intelligence tests. Results provide mixed support for the model hypothesized to underlie the MSCEIT, with emotions research and EI measures failing to load on the same factor. The emotions measures loaded on the same factor as intelligence measures. The validity of certain EI components (in particular, Emotion Perception), as currently assessed, appears equivocal. Copyright 2006 APA, all rights reserved.
Validation of Finite Element Crash Test Dummy Models for Predicting Orion Crew Member Injuries During a Simulated Vehicle Landing

NASA Technical Reports Server (NTRS)

Tabiei, Al; Lawrence, Charles; Fasanella, Edwin L.

2009-01-01

A series of crash tests were conducted with dummies during simulated Orion crew module landings at the Wright-Patterson Air Force Base. These tests consisted of several crew configurations with and without astronaut suits. Some test results were collected and are presented. In addition, finite element models of the tests were developed and are presented. The finite element models were validated using the experimental data, and the test responses were compared with the computed results. Occupant crash data, such as forces, moments, and accelerations, were collected from the simulations and compared with injury criteria to assess occupant survivability and injury. Some of the injury criteria published in the literature are summarized for completeness. These criteria were used to determine potential injury during crew impact events.
Validity and reliability of the Diagnostic Adaptive Behaviour Scale.

PubMed

Tassé, M J; Schalock, R L; Balboni, G; Spreat, S; Navas, P

2016-01-01

The Diagnostic Adaptive Behaviour Scale (DABS) is a new standardised adaptive behaviour measure that provides information for evaluating limitations in adaptive behaviour for the purpose of determining a diagnosis of intellectual disability. This article presents validity evidence and reliability data for the DABS. Validity evidence was based on comparing DABS scores with scores obtained on the Vineland Adaptive Behaviour Scale, second edition. The stability of the test scores was measured using a test and retest, and inter-rater reliability was assessed by computing the inter-respondent concordance. The DABS convergent validity coefficients ranged from 0.70 to 0.84, while the test-retest reliability coefficients ranged from 0.78 to 0.95, and the inter-rater concordance as measured by intraclass correlation coefficients ranged from 0.61 to 0.87. All obtained validity and reliability indicators were strong and comparable with the validity and reliability coefficients of the most commonly used adaptive behaviour instruments. These results and the advantages of the DABS for clinician and researcher use are discussed. © 2015 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
How Can Consumers Be Sure a Genetic Test Is Valid and Useful?

MedlinePlus

... does it take to get the results? Will health insurance cover the costs of genetic testing? What are the benefits of genetic testing? What are the risks and limitations of genetic testing? What is genetic ...
Criterion-Related Validity of Sit-and-Reach Tests for Estimating Hamstring and Lumbar Extensibility: a Meta-Analysis

PubMed Central

Mayorga-Vega, Daniel; Merino-Marban, Rafael; Viciana, Jesús

2014-01-01

The main purpose of the present meta-analysis was to examine the scientific literature on the criterion-related validity of sit-and-reach tests for estimating hamstring and lumbar extensibility. For this purpose relevant studies were searched from seven electronic databases dated up through December 2012. Primary outcomes of criterion-related validity were Pearson's zero-order correlation coefficients (r) between sit-and-reach tests and hamstrings and/or lumbar extensibility criterion measures. Then, from the included studies, the Hunter-Schmidt psychometric meta-analysis approach was used to estimate population criterion-related validity of sit-and-reach tests. Firstly, the corrected correlation mean (rp), unaffected by statistical artefacts (i.e., sampling error and measurement error), was calculated separately for each sit-and-reach test. Subsequently, the three potential moderator variables (sex of participants, age of participants, and level of hamstring extensibility) were examined by a partially hierarchical analysis. Of the 34 studies included in the present meta-analysis, 99 correlation values across eight sit-and-reach tests and 51 across seven sit-and-reach tests were retrieved for hamstring and lumbar extensibility, respectively. The overall results showed that all sit-and-reach tests had a moderate mean criterion-related validity for estimating hamstring extensibility (rp = 0.46-0.67), but they had a low mean for estimating lumbar extensibility (rp = 0.16-0.35). Generally, females, adults and participants with high levels of hamstring extensibility tended to have greater mean values of criterion-related validity for estimating hamstring extensibility. When the use of angular tests is limited such as in a school setting or in large scale studies, scientists and practitioners could use the sit-and-reach tests as a useful alternative for hamstring extensibility estimation, but not for estimating lumbar extensibility.
Key Points Overall sit-and-reach tests have a moderate mean criterion-related validity for estimating hamstring extensibility, but they have a low mean validity for estimating lumbar extensibility. Among all the sit-and-reach test protocols, the Classic sit-and-reach test seems to be the best option to estimate hamstring extensibility. End scores (e.g., the Classic sit-and-reach test) are a better indicator of hamstring extensibility than the modifications that incorporate fingers-to-box distance (e.g., the Modified sit-and-reach test). When angular tests such as straight leg raise or knee extension tests cannot be used, sit-and-reach tests seem to be a useful field test alternative to estimate hamstring extensibility, but not to estimate lumbar extensibility. PMID:24570599
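The idea of a mean correlation corrected for sampling error and measurement error can be sketched in two steps. This is a simplified Python illustration in the spirit of a Hunter-Schmidt analysis, not the procedure of this meta-analysis: the abstract does not give the exact corrections applied, so a sample-size-weighted mean and the classical correction for attenuation are assumed, and the function names and numbers are hypothetical.

```python
from math import sqrt


def weighted_mean_r(correlations, sample_sizes):
    """Sample-size-weighted mean correlation across studies."""
    if len(correlations) != len(sample_sizes) or not correlations:
        raise ValueError("need one sample size per correlation")
    total_n = sum(sample_sizes)
    return sum(r * n for r, n in zip(correlations, sample_sizes)) / total_n


def correct_for_attenuation(r, reliability_test, reliability_criterion):
    """Classical disattenuation: divide r by the square root of both reliabilities."""
    return r / sqrt(reliability_test * reliability_criterion)


# Hypothetical example: two studies, then an assumed test reliability of 0.81.
mean_r = weighted_mean_r([0.4, 0.6], [100, 300])
corrected = correct_for_attenuation(mean_r, 0.81, 1.0)
```

Weighting by sample size reduces the influence of sampling error from small studies, and dividing by the square root of the reliabilities compensates for measurement error in the test and the criterion.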
Revalidation of the NASA Ames 11-by 11-Foot Transonic Wind Tunnel with a Commercial Airplane Model

NASA Technical Reports Server (NTRS)

Kmak, Frank J.; Hudgins, M.; Hergert, D.; George, Michael W. (Technical Monitor)

2001-01-01

The 11-By 11-Foot Transonic leg of the Unitary Plan Wind Tunnel (UPWT) was modernized to improve tunnel performance, capability, productivity, and reliability. Wind tunnel tests to demonstrate the readiness of the tunnel for a return to production operations included an Integrated Systems Test (IST), calibration tests, and airplane validation tests. One of the two validation tests was a 0.037-scale Boeing 777 model that was previously tested in the 11-By 11-Foot tunnel in 1991. The objective of the validation tests was to compare pre-modernization and post-modernization results from the same airplane model in order to substantiate the operational readiness of the facility. Evaluations of within-test, test-to-test, and tunnel-to-tunnel data repeatability were made to study the effects of the tunnel modifications. Tunnel productivity was also evaluated to determine the readiness of the facility for production operations. The operation of the facility, including model installation, tunnel operations, and the performance of tunnel systems, was observed and facility deficiency findings generated. The data repeatability studies and tunnel-to-tunnel comparisons demonstrated outstanding data repeatability and a high overall level of data quality. Despite some operational and facility problems, the validation test was successful in demonstrating the readiness of the facility to perform production airplane wind tunnel tests.
The Dutch Review Process for Evaluating the Quality of Psychological Tests: History, Procedure, and Results

ERIC Educational Resources Information Center

Evers, Arne; Sijtsma, Klaas; Lucassen, Wouter; Meijer, Rob R.

2010-01-01

This article describes the 2009 revision of the Dutch Rating System for Test Quality and presents the results of test ratings from almost 30 years. The rating system evaluates the quality of a test on seven criteria: theoretical basis, quality of the testing materials, comprehensiveness of the manual, norms, reliability, construct validity, and…
Flight Test 4 Preliminary Results: NASA Ames SSI

NASA Technical Reports Server (NTRS)

Isaacson, Doug; Gong, Chester; Reardon, Scott; Santiago, Confesor

2016-01-01

Realization of the expected proliferation of Unmanned Aircraft System (UAS) operations in the National Airspace System (NAS) depends on the development and validation of performance standards for UAS Detect and Avoid (DAA) Systems. The RTCA Special Committee 228 is charged with leading the development of draft Minimum Operational Performance Standards (MOPS) for UAS DAA Systems. NASA, as a participating member of RTCA SC-228, is committed to supporting the development and validation of draft requirements as well as the safety substantiation and end-to-end assessment of DAA system performance. The Unmanned Aircraft System (UAS) Integration into the National Airspace System (NAS) Project conducted a flight test program, referred to as Flight Test 4, at Armstrong Flight Research Center from April to June 2016. Part of the test flights were dedicated to the NASA Ames-developed Detect and Avoid (DAA) System referred to as JADEM (Java Architecture for DAA Extensibility and Modeling). The encounter scenarios, which involved NASA's Ikhana UAS and a manned intruder aircraft, were designed to collect data on DAA system performance in real-world conditions and uncertainties with four different surveillance sensor systems. Flight Test 4 has four objectives: (1) validate DAA requirements in stressing cases that drive MOPS requirements, including: high-speed cooperative intruder, low-speed non-cooperative intruder, high vertical closure rate encounter, and Mode C/S-only intruder (i.e. without ADS-B), (2) validate TCAS/DAA alerting and guidance interoperability concept in the presence of realistic sensor, tracking and navigational errors and in multiple-intruder encounters against both cooperative and non-cooperative intruders, (3) validate Well Clear Recovery guidance in the presence of realistic sensor, tracking and navigational errors, and (4) validate DAA alerting and guidance requirements in the presence of realistic sensor, tracking and navigational errors.
The results will be presented at RTCA Special Committee 228 in support of final verification and validation of the DAA MOPS.
Concurrent and discriminant validity of the Star Excursion Balance Test for military personnel with lateral ankle sprain.

PubMed

Bastien, Maude; Moffet, Hélène; Bouyer, Laurent; Perron, Marc; Hébert, Luc J; Leblond, Jean

2014-02-01

The Star Excursion Balance Test (SEBT) has frequently been used to measure motor control and residual functional deficits at different stages of recovery from lateral ankle sprain (LAS) in various populations. However, the validity of the measure used to characterize performance--the maximal reach distance (MRD) measured by visual estimation--is still unknown. To evaluate the concurrent validity of the MRD in the SEBT estimated visually vs the MRD measured with a 3D motion-capture system and evaluate and compare the discriminant validity of 2 MRD-normalization methods (by height or by lower-limb length) in participants with or without LAS (n = 10 per group). There is a high concurrent validity and a good degree of accuracy between the visual estimation measurement and the MRD gold-standard measurement for both groups and under all conditions. The Cohen d ratios between groups and MANOVA products were higher when computed from MRD data normalized by height. The results support the concurrent validity of visual estimation of the MRD and the use of the SEBT to evaluate motor control. Moreover, normalization of MRD data by height appears to increase the discriminant validity of this test.
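The between-group effect size and the two normalization methods mentioned in this abstract can be sketched as follows. This is a minimal Python illustration assuming the common pooled-standard-deviation form of Cohen's d and normalization expressed as a percentage of a body measure; the function names and the example values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def normalize_reach(reach_cm, body_measure_cm):
    """Express each maximal reach distance as a percentage of a body measure
    (height or lower-limb length), one value per participant."""
    return [100.0 * r / b for r, b in zip(reach_cm, body_measure_cm)]


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical normalized scores for two small groups.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

Computing d once on height-normalized scores and once on limb-length-normalized scores would allow the kind of comparison between normalization methods that the abstract describes.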

Validity and reliability of a scale to measure genital body image.

PubMed

Zielinski, Ruth E; Kane-Low, Lisa; Miller, Janis M; Sampselle, Carolyn

2012-01-01

Women's body image dissatisfaction extends to body parts usually hidden from view--their genitals. Ability to measure genital body image is limited by lack of valid and reliable questionnaires. We subjected a previously developed questionnaire, the Genital Self Image Scale (GSIS) to psychometric testing using a variety of methods. Five experts determined the content validity of the scale. Then using four participant groups, factor analysis was performed to determine construct validity and to identify factors. Further construct validity was established using the contrasting groups approach. Internal consistency and test-retest reliability was determined. Twenty one of 29 items were considered content valid. Two items were added based on expert suggestions. Factor analysis was undertaken resulting in four factors, identified as Genital Confidence, Appeal, Function, and Comfort. The revised scale (GSIS-20) included 20 items explaining 59.4% of the variance. Women indicating an interest in genital cosmetic surgery exhibited significantly lower scores on the GSIS-20 than those who did not. The final 20 item scale exhibited internal reliability across all sample groups as well as test-retest reliability. The GSIS-20 provides a measure of genital body image demonstrating reliability and validity across several populations of women.
Development of self and peer performance assessment on iodometric titration experiment

NASA Astrophysics Data System (ADS)

Nahadi; Siswaningsih, W.; Kusumaningtyas, H.

2018-05-01

This study aims to describe the process of developing a reliable and valid assessment to measure students’ performance in iodometric titration and the effect of self and peer assessment on students’ performance. The self and peer instrument provides valuable feedback for improving student performance. The developed assessment contains a rubric and a task to facilitate self and peer assessment. The participants were 24 second-grade students at a vocational high school in Bandung. The participants were divided into two groups. The first 12 students were involved in the validity test of the developed assessment, while the remaining 12 students participated in the reliability test. Content validity was evaluated based on expert judgment. The content validity results show that the developed performance assessment instrument is categorized as valid on each task, with reliability classified as very good. Analysis of the impact of implementing self and peer assessment showed that the peer instrument supported the self assessment.
10 CFR 26.131 - Cutoff levels for validity screening and initial validity tests.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 10 Energy 1 2010-01-01 2010-01-01 false Cutoff levels for validity screening and initial validity tests. 26.131 Section 26.131 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.131 Cutoff levels for validity screening and initial validity tests. (a) Each...
10 CFR 26.131 - Cutoff levels for validity screening and initial validity tests.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 10 Energy 1 2011-01-01 2011-01-01 false Cutoff levels for validity screening and initial validity tests. 26.131 Section 26.131 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.131 Cutoff levels for validity screening and initial validity tests. (a) Each...
Preliminary psychometric properties of the chinese version of the work-related quality of life scale-2 in the nursing profession.

PubMed

Lin, Shike; Chaiear, Naesinee; Khiewyoo, Jiraporn; Wu, Bin; Johns, Nutjaree Pratheepawanit

2013-03-01

As quality of work-life (QWL) among nurses affects both patient care and institutional standards, assessment regarding QWL for the profession is important. Work-related Quality of Life Scale (WRQOLS) is a reliable QWL assessment tool for the nursing profession. To develop a Chinese version of the WRQOLS-2 and to examine its psychometric properties as an instrument to assess QWL for the nursing profession in China. Forward and back translating procedures were used to develop the Chinese version of WRQOLS-2. Six nursing experts participated in content validity evaluation and 352 registered nurses (RNs) participated in the tests. After a two-week interval, 70 of the RNs were retested. Structural validity was examined by principal components analysis and the Cronbach's alphas calculated. The respective independent sample t-test and intra-class correlation coefficient were used to analyze known-group validity and test-retest reliability. One item was rephrased for adaptation to Chinese organizational cultures. The content validity index of the scale was 0.98. Principal components analysis resulted in a seven-factor model, accounting for 62% of total variance, with Cronbach's alphas for subscales ranging from 0.71 to 0.88. Known-group validity was established in the assessment results of the participants in permanent employment vs. contract employment (t = 2.895, p < 0.01). Good test-retest reliability was observed (r = 0.88, p < 0.01). The translated Chinese version of the WRQOLS-2 has sufficient validity and reliability so that it can be used to evaluate the QWL among nurses in mainland China.
Development of the Patient Education Materials Assessment Tool (PEMAT): a new measure of understandability and actionability for print and audiovisual patient information.

PubMed

Shoemaker, Sarah J; Wolf, Michael S; Brach, Cindy

2014-09-01

To develop a reliable and valid instrument to assess the understandability and actionability of print and audiovisual materials. We compiled items from existing instruments/guides that the expert panel assessed for face/content validity. We completed four rounds of reliability testing, and produced evidence of construct validity with consumers and readability assessments. The experts deemed the PEMAT items face/content valid. Four rounds of reliability testing and refinement were conducted using raters untrained on the PEMAT. Agreement improved across rounds. The final PEMAT showed moderate agreement per Kappa (Average K=0.57) and strong agreement per Gwet's AC1 (Average=0.74). Internal consistency was strong (α=0.71; Average Item-Total Correlation=0.62). For construct validation with consumers (n=47), we found significant differences between actionable and poorly-actionable materials in comprehension scores (76% vs. 63%, p<0.05) and ratings (8.9 vs. 7.7, p<0.05). For understandability, there was a significant difference for only one of two topics on consumer numeric scores. For actionability, there were significant positive correlations between PEMAT scores and consumer-testing results, but no relationship for understandability. There were, however, strong, negative correlations between grade-level and both consumer-testing results and PEMAT scores. The PEMAT demonstrated strong internal consistency, reliability, and evidence of construct validity. The PEMAT can help professionals judge the quality of materials (available at: http://www.ahrq.gov/pemat). Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
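The abstract above reports rater agreement as Cohen's kappa. As an illustration only, the following is a minimal sketch of the standard two-rater kappa, (p_o - p_e) / (1 - p_e); the function name and the sample ratings are hypothetical, and the abstract does not state how its average kappa across raters and rounds was computed.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items.

    Hypothetical helper for illustration; not the PEMAT authors' code.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[lab] * counts_b[lab] for lab in labels) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Made-up ratings for four items.
a = ["agree", "agree", "disagree", "disagree"]
b = ["agree", "disagree", "disagree", "disagree"]
kappa = cohen_kappa(a, b)
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why it can be noticeably lower than the observed agreement when one label dominates.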
Assessing Computer Literacy: A Validated Instrument and Empirical Results.

ERIC Educational Resources Information Center

Gabriel, Roy M.

1985-01-01

Describes development of a comprehensive computer literacy assessment battery for K-12 curriculum based on objectives of a curriculum implemented in the Worldwide Department of Defense Dependents Schools system. Test development and field test data are discussed and a correlational analysis which assists in interpretation of test results is…
Reliability and factorial validity of flexibility tests for team sports.

PubMed

Sporis, Goran; Vucetic, Vlatko; Jovanovic, Mario; Jukic, Igor; Omrcen, Darija

2011-04-01

The main goal of this method paper was to evaluate the reliability and factorial validity of flexibility tests used in soccer, and to conduct a cross-validation study on 2 other team sports using handball and basketball players. The second aim was to compare the validity of the different tests and evaluate the flexibility of soccer players; the third was to determine the positional differences between attackers, defenders, and midfielders in all flexibility tests. One hundred and fifty (n = 150) elite male junior soccer players, members of the First Croatian Junior League Teams, and 60 (n = 60) handball and 60 (n = 60) basketball players, also members of the First Croatian Junior League Teams and tested for the purpose of cross-validation, volunteered to participate in the study. The SAR and V-SAR had the greatest AVR and ICC. The within-subjects variation ranged between 0.3 and 3.8%. The lowest value of CV was found between the LSPL and LSPR. Low to moderate statistically significant correlation coefficients were found among all the measured flexibility tests. It was observed that the greatest correlations existed between the SAR and V-SAR (r = 0.65) and between the LLSR and LLSL (r = 0.56). Statistically significant correlations were also observed between the BLPL and BLPR (r = 0.62). The principal components factor analysis of 9 flexibility tests resulted in the extraction of 3 significant components. The results of this study have the following implications for the assessment of flexibility in soccer: (a) all flexibility tests used in this study have acceptable between- and within-subjects reliability and can be used to estimate the flexibility of soccer players; (b) the LSPL and LSPR tests are the most reliable and valid flexibility tests for the estimation of flexibility of professional soccer players.
Fecal electrolyte testing for evaluation of unexplained diarrhea: Validation of body fluid test accuracy in the absence of a reference method.

PubMed

Voskoboev, Nikolay V; Cambern, Sarah J; Hanley, Matthew M; Giesen, Callen D; Schilling, Jason J; Jannetto, Paul J; Lieske, John C; Block, Darci R

2015-11-01

Validation of tests performed on body fluids other than blood or urine can be challenging due to the lack of a reference method to confirm accuracy. The aim of this study was to evaluate alternate assessments of accuracy that laboratories can rely on to validate body fluid tests in the absence of a reference method using the example of sodium (Na(+)), potassium (K(+)), and magnesium (Mg(2+)) testing in stool fluid. Validations of fecal Na(+), K(+), and Mg(2+) were performed on the Roche cobas 6000 c501 (Roche Diagnostics) using residual stool specimens submitted for clinical testing. Spiked recovery, mixing studies, and serial dilutions were performed and % recovery of each analyte was calculated to assess accuracy. Results were confirmed by comparison to a reference method (ICP-OES, PerkinElmer). Mean recoveries for fecal electrolytes were Na(+) upon spiking=92%, mixing=104%, and dilution=105%; K(+) upon spiking=94%, mixing=96%, and dilution=100%; and Mg(2+) upon spiking=93%, mixing=98%, and dilution=100%. When autoanalyzer results were compared to reference ICP-OES results, Na(+) had a slope=0.94, intercept=4.1, and R(2)=0.99; K(+) had a slope=0.99, intercept=0.7, and R(2)=0.99; and Mg(2+) had a slope=0.91, intercept=-4.6, and R(2)=0.91. Osmotic gaps calculated using both methods were highly correlated, with slope=0.95, intercept=4.5, and R(2)=0.97. Acid pretreatment increased magnesium recovery from a subset of clinical specimens. A combination of mixing, spiking, and dilution recovery experiments is an acceptable surrogate for assessing accuracy in body fluid validations in the absence of a reference method. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
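The abstract above summarizes accuracy as % recovery from spiking and dilution experiments without giving the formulas. The sketch below uses the conventional definitions as an assumption: spike recovery as the measured increase divided by the amount added, and dilution recovery as the dilution-corrected result divided by the undiluted result. Function names and the example concentrations are hypothetical and are not taken from the study.

```python
def spike_recovery_percent(measured_spiked, measured_unspiked, amount_added):
    """Percent of the added analyte that the assay actually recovers.

    Assumed conventional definition; all three values share one unit.
    """
    if amount_added <= 0:
        raise ValueError("amount_added must be positive")
    return 100.0 * (measured_spiked - measured_unspiked) / amount_added


def dilution_recovery_percent(measured_diluted, dilution_factor, measured_neat):
    """Percent agreement between a dilution-corrected and an undiluted result."""
    if measured_neat <= 0:
        raise ValueError("measured_neat must be positive")
    return 100.0 * (measured_diluted * dilution_factor) / measured_neat


# Made-up example: a specimen reading 20 units is spiked with 50 units
# and then reads 66 units, so 46 of the 50 added units were recovered.
spike = spike_recovery_percent(66.0, 20.0, 50.0)

# Made-up example: a 1:4 dilution reads 10 units; the neat specimen reads 40.
dilution = dilution_recovery_percent(10.0, 4, 40.0)
```

Recoveries close to 100% in both experiments suggest that the specimen matrix is not distorting the measurement, which is the argument the abstract makes for using them as a surrogate when no reference method is available.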
Validation of an advanced analytical procedure applied to the measurement of environmental radioactivity.

PubMed

Thanh, Tran Thien; Vuong, Le Quang; Ho, Phan Long; Chuong, Huynh Dinh; Nguyen, Vo Hoang; Tao, Chau Van

2018-04-01

In this work, an advanced analytical procedure was applied to calculate radioactivity in spiked water samples in close-geometry gamma spectroscopy. It included the MCNP-CP code in order to calculate the coincidence summing correction factor (CSF). The CSF results were validated by a deterministic method using the ETNA code for both p-type HPGe detectors. The two codes showed good agreement. Finally, the validity of the developed procedure was confirmed by a proficiency test to calculate the activities of various radionuclides. The results of the radioactivity measurement with both detectors using the advanced analytical procedure received "Accepted" status in the proficiency test. Copyright © 2018 Elsevier Ltd. All rights reserved.
Results of Fall 2001 Pilot: Methodology for Validation of Course Prerequisites.

ERIC Educational Resources Information Center

Serban, Andreea M.; Fleming, Steve

The purpose of this study was to test a methodology that will help Santa Barbara City College (SBCC), California, to validate the course prerequisites that fall under the category of highest level of scrutiny--data collection and analysis--as defined by the Chancellor's Office. This study gathered data for the validation of prerequisites for three…
The Validity and Reliability of the Mobbing Scale (MS)

ERIC Educational Resources Information Center

Yaman, Erkan

2009-01-01

The aim of this research is to develop the Mobbing Scale and examine its validity and reliability. The sample of the study consisted of 515 persons from Sakarya and Bursa. In this study, construct validity, internal consistency, test-retest reliability, and item analysis of the scale were examined. As a result of factor analysis for construct…
Validation to Portuguese of the Scale of Student Satisfaction and Self-Confidence in Learning

PubMed Central

Almeida, Rodrigo Guimarães dos Santos; Mazzo, Alessandra; Martins, José Carlos Amado; Baptista, Rui Carlos Negrão; Girão, Fernanda Berchelli; Mendes, Isabel Amélia Costa

2015-01-01

Objective: translate and validate to Portuguese the Scale of Student Satisfaction and Self-Confidence in Learning. Material and Methods: methodological translation and validation study of a research tool. After following all steps of the translation process, for the validation process, the event III Workshop Brazil - Portugal: Care Delivery to Critical Patients was created, promoted by one Brazilian and one Portuguese teaching institution. Results: 103 nurses participated. As to the validity and reliability of the scale, the correlation pattern between the variables, the sampling adequacy test (Kaiser-Meyer-Olkin) and the sphericity test (Bartlett) showed good results. In the exploratory factorial analysis (Varimax), item 9 behaved better in factor 1 (Satisfaction) than in factor 2 (Self-confidence in learning). The internal consistency (Cronbach's alpha) showed coefficients of 0.86 in factor 1 with six items and 0.77 for factor 2 with seven items. Conclusion: in Portuguese this tool was called: Escala de Satisfação de Estudantes e Autoconfiança na Aprendizagem. The results found good psychometric properties and a good potential use. The sampling size and specificity are limitations of this study, but future studies will contribute to consolidate the validity of the scale and strengthen its potential use. PMID:26625990
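Several records in this listing, including the one above, report internal consistency as Cronbach's alpha. The sketch below shows the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), on a made-up response matrix; the function name and data are hypothetical, and sample variances (n - 1 denominator) are an assumption, since the abstracts do not say which software or conventions were used.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Illustrative helper only; uses sample variances (n - 1 denominator).
    """
    n_respondents = len(responses)
    k = len(responses[0])
    if n_respondents < 2 or k < 2:
        raise ValueError("need at least two respondents and two items")
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up responses: three respondents, two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Alpha rises when items covary strongly relative to their individual variances, so it tends to increase with the number of items as well as with their intercorrelation.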
The sensitivity of laboratory tests assessing driving related skills to dose-related impairment of alcohol: A literature review.

PubMed

Jongen, S; Vuurman, E F P M; Ramaekers, J G; Vermeeren, A

2016-04-01

Laboratory tests assessing driving related skills can be useful as initial screening tools to assess potential drug induced impairment as part of a standardized behavioural assessment. Unfortunately, consensus about which laboratory tests should be included to reliably assess drug induced impairment has not yet been reached. The aim of the present review was to evaluate the sensitivity of laboratory tests to the dose dependent effects of alcohol, as a benchmark, on performance parameters. In total, 179 experimental studies were included. Results show that a cued go/no-go task and a divided attention test with primary tracking and secondary visual search were consistently sensitive to the impairing effects at medium and high blood alcohol concentrations. Driving performance assessed in a simulator was less sensitive to the effects of alcohol as compared to naturalistic, on-the-road driving. In conclusion, replication of the results of several potentially useful tests, and their predictive validity for actual driving impairment, deserve further research. In addition, driving simulators should be validated and compared head to head to naturalistic driving in order to increase construct validity. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Validity of the Eating Attitude Test among Exercisers.

PubMed

Lane, Helen J; Lane, Andrew M; Matheson, Hilary

2004-12-01

Theory testing and construct measurement are inextricably linked. To date, no published research has looked at the factorial validity of an existing eating attitude inventory for use with exercisers. The Eating Attitude Test (EAT) is a 26-item measure that yields a single index of disordered eating attitudes. The original factor analysis showed three interrelated factors: Dieting behavior (13 items), oral control (7 items), and bulimia nervosa-food preoccupation (6 items). The primary purpose of the study was to examine the factorial validity of the EAT among a sample of exercisers. The second purpose was to investigate relationships between eating attitudes scores and selected psychological constructs. In stage one, 598 regular exercisers completed the EAT. Confirmatory factor analysis (CFA) was used to test the single-factor model, a three-factor model, and a four-factor model, which distinguished bulimia from food pre-occupation. CFA of the single-factor model (RCFI = 0.66, RMSEA = 0.10) and the three-factor model (RCFI = 0.74; RMSEA = 0.09) showed poor model fit. There was marginal fit for the four-factor model (RCFI = 0.91, RMSEA = 0.06). Results indicated that five items showed poor factor loadings. After these five items were discarded, the three models were re-analyzed. CFA results indicated that the single-factor model (RCFI = 0.76, RMSEA = 0.10) and three-factor model (RCFI = 0.82, RMSEA = 0.08) showed poor fit. CFA results for the four-factor model showed acceptable fit indices (RCFI = 0.98, RMSEA = 0.06). Stage two explored relationships between EAT scores, mood, self-esteem, and motivational indices toward exercise in terms of self-determination, enjoyment and competence. Correlation results indicated that depressed mood scores positively correlated with bulimia and dieting scores. Further, dieting was inversely related with self-determination toward exercising.
Collectively, findings suggest that a 21-item four-factor model shows promising validity coefficients among exercise participants, and that future research is needed to investigate eating attitudes among samples of exercisers. Key Points: Validity of psychometric measures should be thoroughly investigated. Researchers should not assume that a scale validation on one sample will show the same validity coefficients in a different population. The Eating Attitude Test is a commonly used scale. The present study shows a revised 21-item scale was suitable for exercisers. Researchers using the Eating Attitude Test should use subscales of Dieting, Oral control, Food pre-occupation, and Bulimia. Future research should involve qualitative techniques and interview exercise participants to explore the nature of eating attitudes.
Validity of the Worth 4 Dot Test in Patients with Red-Green Color Vision Defect.

PubMed

Bak, Eunoo; Yang, Hee Kyung; Hwang, Jeong-Min

2017-05-01

The Worth four dot test uses red and green glasses for binocular dissociation, and although it has been believed that patients with red-green color vision defects cannot accurately perform the Worth four dot test, this has not been validated. Therefore, the purpose of this study was to demonstrate the validity of the Worth four dot test in patients with congenital red-green color vision defects who have normal or abnormal binocular vision. A retrospective review of medical records was performed on 30 consecutive congenital red-green color vision defect patients who underwent the Worth four dot test. The type of color vision anomaly was determined by the Hardy Rand and Rittler (HRR) pseudoisochromatic plate test, Ishihara color test, anomaloscope, and/or the 100 hue test. All patients underwent a complete ophthalmologic examination. Binocular sensory status was evaluated with the Worth four dot test and Randot stereotest. The results were interpreted according to the presence of strabismus or amblyopia. Among the 30 patients, 24 had normal visual acuity without strabismus or amblyopia and 6 patients had strabismus and/or amblyopia. The 24 patients without strabismus or amblyopia all showed binocular fusional responses by seeing four dots of the Worth four dot test. Meanwhile, the six patients with strabismus or amblyopia showed various results of fusion, suppression, and diplopia. Congenital red-green color vision defect patients of different types and variable degrees of binocularity could successfully perform the Worth four dot test. They showed reliable results that were in accordance with their estimated binocular sensory status.
Verification and Validation Studies for the LAVA CFD Solver

NASA Technical Reports Server (NTRS)

Moini-Yekta, Shayan; Barad, Michael F; Sozer, Emre; Brehm, Christoph; Housman, Jeffrey A.; Kiris, Cetin C.

2013-01-01

The verification and validation of the Launch Ascent and Vehicle Aerodynamics (LAVA) computational fluid dynamics (CFD) solver is presented. A modern strategy for verification and validation is described incorporating verification tests, validation benchmarks, continuous integration and version control methods for automated testing in a collaborative development environment. The purpose of the approach is to integrate the verification and validation process into the development of the solver and improve productivity. This paper uses the Method of Manufactured Solutions (MMS) for the verification of 2D Euler equations, 3D Navier-Stokes equations as well as turbulence models. A method for systematic refinement of unstructured grids is also presented. Verification using inviscid vortex propagation and flow over a flat plate is highlighted. Simulation results using laminar and turbulent flow past a NACA 0012 airfoil and ONERA M6 wing are validated against experimental and numerical data.
Validity of a novel computerized screening test system for mild cognitive impairment.

PubMed

Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk

2018-06-20

Background: The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to distinguish elderly people with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K, and test-retest reliability indicated high correlation. In terms of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.
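The abstract above names sensitivity, specificity, positive predictive value and negative predictive value without reporting the underlying counts. As a reminder of how the four quantities relate to a 2x2 screening table at a chosen cut-off, here is a minimal sketch; the function name and every count in the example are hypothetical and are not results from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening metrics from a 2x2 table at one cut-off score.

    tp/fn: people with the condition who screen positive/negative.
    fp/tn: people without the condition who screen positive/negative.
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # detected share of true cases
        "specificity": tn / (tn + fp),  # cleared share of true non-cases
        "ppv": tp / (tp + fp),          # positives that are true cases
        "npv": tn / (tn + fn),          # negatives that are true non-cases
    }


# Made-up counts for illustration only.
metrics = screening_metrics(tp=40, fp=5, fn=10, tn=45)
```

Sensitivity and specificity describe the test itself, whereas PPV and NPV also depend on how common the condition is in the sample, so predictive values from a study with a high share of cases may not transfer to a general screening population.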
Solar Tower Experiments for Radiometric Calibration and Validation of Infrared Imaging Assets and Analysis Tools for Entry Aero-Heating Measurements

NASA Technical Reports Server (NTRS)

Splinter, Scott C.; Daryabeigi, Kamran; Horvath, Thomas J.; Mercer, David C.; Ghanbari, Cheryl M.; Ross, Martin N.; Tietjen, Alan; Schwartz, Richard J.

2008-01-01

The NASA Engineering and Safety Center sponsored Hypersonic Thermodynamic Infrared Measurements assessment team has a task to perform radiometric calibration and validation of land-based and airborne infrared imaging assets and tools for remote thermographic imaging. The IR assets and tools will be used for thermographic imaging of the Space Shuttle Orbiter during entry aero-heating to provide flight boundary layer transition thermography data that could be utilized for calibration and validation of empirical and theoretical aero-heating tools. A series of tests at the Sandia National Laboratories National Solar Thermal Test Facility were designed for this task where reflected solar radiation from a field of heliostats was used to heat a 4 foot by 4 foot test panel consisting of LI 900 ceramic tiles located on top of the 200 foot tall Solar Tower. The test panel provided an Orbiter-like entry temperature for the purposes of radiometric calibration and validation. The Solar Tower provided an ideal test bed for this series of radiometric calibration and validation tests because it had the potential to rapidly heat the large test panel to spatially uniform and non-uniform elevated temperatures. Also, the unsheltered-open-air environment of the Solar Tower was conducive to obtaining unobstructed radiometric data by land-based and airborne IR imaging assets. Various thermocouples installed on the test panel and an infrared imager located in close proximity to the test panel were used to obtain surface temperature measurements for evaluation and calibration of the radiometric data from the infrared imaging assets. The overall test environment, test article, test approach, and typical test results are discussed.
Assessing the treatment effects in apraxia of speech: introduction and evaluation of the Modified Diadochokinesis Test.

PubMed

Hurkmans, Joost; Jonkers, Roel; Boonstra, Anne M; Stewart, Roy E; Reinders-Messelink, Heleen A

2012-01-01

The number of reliable and valid instruments to measure the effects of therapy in apraxia of speech (AoS) is limited. To evaluate the newly developed Modified Diadochokinesis Test (MDT), which is a task to assess the effects of rate and rhythm therapies for AoS in a multiple baseline across behaviours design. The consistency, accuracy and fluency of speech of 24 adults with AoS and 12 unaffected speakers matched for age, gender and educational level were assessed using the MDT. The reliability and validity of the instrument were considered and outcomes compared with those obtained with existing tests. The results revealed that MDT had a strong internal consistency. Scores were influenced by syllable structure complexity, while distinctive features of articulation had no measurable effect. The test-retest and intra- and inter-rater reliabilities were shown to be adequate, and the discriminant validity was good. For convergent validity different outcomes were found: apart from one correlation, the scores on tests assessing functional communication and AoS correlated significantly with the MDT outcome measures. The spontaneous speech phonology measure of the Aachen Aphasia Test (AAT) correlated significantly with the MDT outcome measures, but no correlations were found for the repetition subtest and the spontaneous speech articulation/prosody measure of the AAT. The study shows that the MDT has adequate psychometric properties, implying that it can be used to measure changes in speech motor control during treatment for apraxia of speech. The results demonstrate the validity and utility of the instrument as a supplement to speech tasks in assessing speech improvement aimed at the level of planning and programming of speech. © 2012 Royal College of Speech and Language Therapists.

Trait and state anxiety across academic evaluative contexts: development and validation of the MTEA-12 and MSEA-12 scales.

PubMed

Sotardi, Valerie A

2018-05-01

Educational measures of anxiety focus heavily on students' experiences with tests yet overlook other assessment contexts. In this research, two brief multiscale questionnaires were developed and validated to measure trait evaluation anxiety (MTEA-12) and state evaluation anxiety (MSEA-12) for use in various assessment contexts in non-clinical, educational settings. The research included a cross-sectional analysis of self-report data using authentic assessment settings in which evaluation anxiety was measured. Instruments were tested using a validation sample of 241 first-year university students in New Zealand. Scale development included component structures for state and trait scales based on existing theoretical frameworks. Analyses using confirmatory factor analysis and descriptive statistics indicate that the scales are reliable and structurally valid. Multivariate general linear modeling using subscales from the MTEA-12, MSEA-12, and student grades suggests adequate criterion-related validity. Initial predictive validity was supported: one relevant MTEA-12 factor explained between 21% and 54% of the variance in three MSEA-12 factors. Results document MTEA-12 and MSEA-12 as reliable measures of trait and state dimensions of evaluation anxiety for test and writing contexts. Initial estimates suggest that the scales have promising validity, and recommendations for further validation are outlined.
[Design and validation of scales to measure adolescent attitude toward eating and toward physical activity].

PubMed

Lima-Serrano, Marta; Lima-Rodríguez, Joaquín Salvador; Sáez-Bueno, Africa

2012-01-01

Different authors suggest that attitude is a mediator in behavior change, so it is a predictor of behavior practice. The main aim of this study was to design and validate two scales to measure adolescent attitude toward healthy eating and adolescent attitude toward healthy physical activity. The scales were designed based on a literature review. They were then validated using an on-line Delphi Panel with eighteen experts, a pretest, and a pilot test with a sample of 188 high school students. Comprehensibility, content validity, adequacy, as well as the reliability (Cronbach's alpha), and construct validity (exploratory factor analysis) of the scales were tested. Scales validated by experts were considered appropriate in the pretest. In the pilot test, the ten-item Attitude to Eating Scale obtained α=0.72. The eight-item Attitude to Physical Activity Scale obtained α=0.86. They showed evidence of one-dimensional interpretation after factor analysis: a) all items had weights r>0.30 in the first factor before rotations, b) the first factor explained a significant proportion of variance before rotations, and c) the total variance explained by the main factors extracted was greater than 50%. The scales showed reliability and validity. They could be employed to assess attitude to these priority intervention areas in Spanish adolescents, and to evaluate this intermediate result of health interventions and health programs.
Cross-cultural adaptation and psychometric evaluation of oral health impact profile among school teacher community

PubMed Central

Vyas, Shaleen; Nagarajappa, Sandesh; Dasar, Pralhad L.; Mishra, Prashant

2018-01-01

AIM: To translate OHIP-14 into Hindi and test its psychometric properties among school teacher community. METHODS: The OHIP-14 was translated to OHIP-14-H using WHO recommended translation protocol. During pre-testing, an expert panel assessed content validity of the questionnaire. Face validity was assessed on a sample of 10 individuals. The OHIP-14-H was administered to a random sample of 170 primary school teachers. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and Intra-class correlation coefficient (ICC) respectively, with a 2-week interval. Predictive validity was tested by comparing OHIP-14-H scores with clinical parameters. The concurrent validity was assessed using self-reported oral health and discriminant validity was ascertained through negative association with sociodemographic variables. RESULTS: The mean OHIP-14-H score was 9.57 (S.D = 4.58). ICC and Cronbach's alpha for OHIP-14-H were 0.96 and 0.92 respectively. Concurrent validity using binomial regression model indicated that good (OR = 0.56, 95% CI = 0.55 – 4.47) and moderate (OR = 0.25, 95% CI = 0.17 – 1.87) OHIP-14-H scores were negative but significant risk indicators of poor self reported oral health (P < 0.009). Significant predictive validity was observed between OHIP-14-H scores and clinical parameters (P < 0.000). CONCLUSION: Translated and culturally adapted OHIP-14-H indicates good reliability and validity among primary school teachers. PMID:29417064
Test-retest reliability and cross validation of the functioning everyday with a wheelchair instrument.

PubMed

Mills, Tamara L; Holm, Margo B; Schmeler, Mark

2007-01-01

The purpose of this study was to establish the test-retest reliability and content validity of an outcomes tool designed to measure the effectiveness of seating-mobility interventions on the functional performance of individuals who use wheelchairs or scooters as their primary seating-mobility device. The instrument, Functioning Everyday With a Wheelchair (FEW), is a questionnaire designed to measure perceived user function related to wheelchair/scooter use. Using consumer-generated items, FEW Beta Version 1.0 was developed and test-retest reliability was established. Cross-validation of FEW Beta Version 1.0 was then carried out with five samples of seating-mobility users to establish content validity. Based on the content validity study, FEW Version 2.0 was developed and administered to seating-mobility consumers to examine its test-retest reliability. FEW Beta Version 1.0 yielded an intraclass correlation coefficient (ICC) Model (3,k) of .92, p < .001, and the content validity results revealed that FEW Beta Version 1.0 captured 55% of seating-mobility goals reported by consumers across five samples. FEW Version 2.0 yielded ICC(3,k) = .86, p < .001, and captured 98.5% of consumers' seating-mobility goals. The cross-validation study identified new categories of seating-mobility goals for inclusion in FEW Version 2.0, and the content validity of FEW Version 2.0 was confirmed. FEW Beta Version 1.0 and FEW Version 2.0 were highly stable in their measurement of participants' seating-mobility goals over a 1-week interval.
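The abstract above reports test-retest reliability as ICC Model (3,k). For readers unfamiliar with that notation, the sketch below computes the Shrout and Fleiss ICC(3,k), (BMS - EMS) / BMS, from a two-way ANOVA decomposition of a subjects-by-sessions score table. The function name and the example scores are hypothetical; the abstract does not describe the authors' actual computation or software.

```python
def icc_3k(scores):
    """ICC(3,k): two-way mixed model, consistency, average of k measurements.

    scores: list of subjects, each a list of k scores (one per session).
    Illustrative helper only.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two sessions")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares for subjects, sessions, and the whole table.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)               # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    if bms == 0:
        raise ValueError("ICC is undefined when subjects do not differ")
    return (bms - ems) / bms


# Made-up scores: three subjects measured in two sessions.
icc = icc_3k([[1, 2], [2, 1], [3, 3]])
```

Because ICC(3,k) removes the session main effect, a uniform shift between test and retest does not lower it; it measures consistency of subject ranking rather than absolute agreement.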
Development and psychometric testing of the Protective Reasons Against Suicide Inventory for assessing older Chinese-speaking outpatients in primary care settings.

PubMed

Wang, Yi-Wen; Tsai, Yun-Fang; Lee, Shwu-Hua; Chen, Ying-Jen; Chen, Hsiu-Fang

2016-07-01

To develop and psychometrically test the Protective Reasons against Suicide Inventory among older Chinese-speaking outpatients. Tools currently exist to test reasons for living among individuals of all ages in western countries, but few are available to assess older adults' protective reasons against suicide in Asia. A cross-sectional survey to investigate protective reasons against suicide among older Chinese-speaking outpatients. The Protective Reasons against Suicide Inventory was developed based on individual interviews with 83 older outpatients in Taiwan, the literature and the authors' clinical experiences. The resulting Inventory was examined in 2013 for content validity, face validity, construct validity, criterion-related validity, internal consistency reliability and test-retest reliability. The Inventory had excellent content validity and face validity. Factor analysis yielded a seven-factor solution, accounting for 87·7% of the variance. Scores on the global Inventory and its subscales tended to be higher in outpatients diagnosed without suicidal ideation than in outpatients diagnosed with suicidal ideation, indicating good criterion validity. Inventory reliability and the intraclass correlation coefficient were satisfactory. The Protective Reasons against Suicide Inventory can be completed in 5 minutes and is perceived as easy to complete. Moreover, the Inventory yielded highly acceptable parameters for validity and reliability. The Protective Reasons against Suicide Inventory can be used to assess older Chinese-speaking outpatients for factors that protect them from attempting suicide. © 2016 John Wiley & Sons Ltd.
Assessing reading comprehension with narrative and expository texts: Dimensionality and relationship with fluency, vocabulary and memory.

PubMed

Santos, Sandra; Cadime, Irene; Viana, Fernanda L; Chaves-Sousa, Séli; Gayo, Elena; Maia, José; Ribeiro, Iolanda

2017-02-01

Reading comprehension assessment should rely on valid instruments that enable adequate conclusions to be taken regarding students' reading comprehension performance. In this article, two studies were conducted to collect validity evidence for the vertically scaled forms of two Tests of Reading Comprehension for Portuguese elementary school students in the second to fourth grades, one with narrative texts (TRC-n) and another with expository ones (TRC-e). Two samples of 950 and 990 students participated in Study 1, the study of the dimensionality of the TRC-n and TRC-e forms, respectively. Confirmatory factor analyses provided evidence of an acceptable fit for the one-factor solution for all test forms. Study 2 included 218 students to collect criterion-related validity. The scores obtained in each of the test forms were significantly correlated with the ones obtained in other reading comprehension measures and with the results obtained in oral reading fluency, vocabulary and working memory tests. Evidence suggests that the test forms are valid measures of reading comprehension. © 2016 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
Summarising and validating test accuracy results across multiple studies for use in clinical practice.

PubMed

Riley, Richard D; Ahmed, Ikhlaaq; Debray, Thomas P A; Willis, Brian H; Noordzij, J Pieter; Higgins, Julian P T; Deeks, Jonathan J

2015-06-15

Following a meta-analysis of test accuracy studies, the translation of summary results into clinical practice is potentially problematic. The sensitivity, specificity and positive (PPV) and negative (NPV) predictive values of a test may differ substantially from the average meta-analysis findings, because of heterogeneity. Clinicians thus need more guidance: given the meta-analysis, is a test likely to be useful in new populations, and if so, how should test results inform the probability of existing disease (for a diagnostic test) or future adverse outcome (for a prognostic test)? We propose ways to address this. Firstly, following a meta-analysis, we suggest deriving prediction intervals and probability statements about the potential accuracy of a test in a new population. Secondly, we suggest strategies on how clinicians should derive post-test probabilities (PPV and NPV) in a new population based on existing meta-analysis results and propose a cross-validation approach for examining and comparing their calibration performance. Application is made to two clinical examples. In the first example, the joint probability that both sensitivity and specificity will be >80% in a new population is just 0.19, because of a low sensitivity. However, the summary PPV of 0.97 is high and calibrates well in new populations, with a probability of 0.78 that the true PPV will be at least 0.95. In the second example, post-test probabilities calibrate better when tailored to the prevalence in the new population, with cross-validation revealing a probability of 0.97 that the observed NPV will be within 10% of the predicted NPV. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
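The step of tailoring post-test probabilities to the prevalence in a new population rests on Bayes' theorem. The sketch below shows only that basic conversion from sensitivity, specificity and prevalence to PPV and NPV; it does not implement the prediction intervals or the cross-validation approach the authors propose, and the function name and input values are illustrative assumptions.

```python
def post_test_probabilities(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied in a population with given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical accuracy values; the same test is applied at two prevalences.
for prev in (0.5, 0.05):
    ppv, npv = post_test_probabilities(0.9, 0.8, prev)
    print(f"prevalence={prev}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

Running the loop shows how strongly PPV depends on prevalence even when sensitivity and specificity are held fixed, which is the motivation for tailoring.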
Development, construct validity and test-retest reliability of a field-based wheelchair mobility performance test for wheelchair basketball.

PubMed

de Witte, Annemarie M H; Hoozemans, Marco J M; Berger, Monique A M; van der Slikke, Rienk M A; van der Woude, Lucas H V; Veeger, Dirkjan H E J

2018-01-01

The aim of this study was to develop and describe a wheelchair mobility performance test in wheelchair basketball and to assess its construct validity and reliability. To mimic mobility performance of wheelchair basketball matches in a standardised manner, a test was designed based on observation of wheelchair basketball matches and expert judgement. Forty-six players performed the test to determine its validity and 23 players performed the test twice for reliability. Independent-samples t-tests were used to assess whether the times needed to complete the test were different for classifications, playing standards and sex. Intraclass correlation coefficients (ICC) were calculated to quantify reliability of performance times. Males performed better than females (P < 0.001, effect size [ES] = -1.26) and international men performed better than national men (P < 0.001, ES = -1.62). Performance time of low (≤2.5) and high (≥3.0) classification players was borderline not significant with a moderate ES (P = 0.06, ES = 0.58). The reliability was excellent for overall performance time (ICC = 0.95). These results show that the test can be used as a standardised mobility performance test to validly and reliably assess the capacity in mobility performance of elite wheelchair basketball athletes. Furthermore, the described methodology of development is recommended for use in other sports to develop sport-specific tests.
Quality appraisal of generic self-reported instruments measuring health-related productivity changes: a systematic review

PubMed Central

2014-01-01

Background Health impairments can result in disability and changed work productivity imposing considerable costs for the employee, employer and society as a whole. A large number of instruments exist to measure health-related productivity changes; however their methodological quality remains unclear. This systematic review critically appraised the measurement properties in generic self-reported instruments that measure health-related productivity changes to recommend appropriate instruments for use in occupational and economic health practice. Methods PubMed, PsycINFO, Econlit and Embase were systematically searched for studies whereof: (i) instruments measured health-related productivity changes; (ii) the aim was to evaluate instrument measurement properties; (iii) instruments were generic; (iv) ratings were self-reported; (v) full-texts were available. Next, methodological quality appraisal was based on COSMIN elements: (i) internal consistency; (ii) reliability; (iii) measurement error; (iv) content validity; (v) structural validity; (vi) hypotheses testing; (vii) cross-cultural validity; (viii) criterion validity; and (ix) responsiveness. Recommendations are based on evidence syntheses. Results This review included 25 articles assessing the reliability, validity and responsiveness of 15 different generic self-reported instruments measuring health-related productivity changes. Most studies evaluated criterion validity, none evaluated cross-cultural validity and information on measurement error is lacking. The Work Limitation Questionnaire (WLQ) was most frequently evaluated with moderate respectively strong positive evidence for content and structural validity and negative evidence for reliability, hypothesis testing and responsiveness. Less frequently evaluated, the Stanford Presenteeism Scale (SPS) showed strong positive evidence for internal consistency and structural validity, and moderate positive evidence for hypotheses testing and criterion validity. 
The Productivity and Disease Questionnaire (PRODISQ) yielded strong positive evidence for content validity; evidence for other properties is lacking. The other instruments resulted in mostly fair-to-poor quality ratings with limited evidence. Conclusions Decisions based on the content of the instrument, usage purpose, target country and population, and available evidence are recommended. Until high-quality studies are in place to accurately assess the measurement properties of the currently available instruments, the WLQ and, in a Dutch context, the PRODISQ are cautiously preferred based on their strong positive evidence for content validity. Based on its strong positive evidence for internal consistency and structural validity, the SPS is cautiously recommended. PMID:24495301
What tests should you use to assess small intestinal bacterial overgrowth in systemic sclerosis?

PubMed

Braun-Moscovici, Yolanda; Braun, Marius; Khanna, Dinesh; Balbir-Gurman, Alexandra; Furst, Daniel E

2015-01-01

Small intestinal bacterial overgrowth (SIBO) plays a major role in the pathogenesis of malabsorption in SSc patients and is a source of great morbidity and even mortality, in those patients. This manuscript reviews which tests are valid and should be used in SSc when evaluating SIBO. We performed systematic literature searches in PubMed, Embase and the Cochrane library from 1966 up to November 2014 for English language, published articles examining bacterial overgrowth in SSc (e.g. malabsorption tests, breath tests, xylose test, etc). Articles obtained from these searches were reviewed for additional references. The validity of the tests was evaluated according to the OMERACT principles of truth, discrimination and feasibility. From a total of 65 titles, 22 articles were reviewed and 20 were ultimately extracted to examine the validity of tests for GI morphology, bacterial overgrowth and malabsorption in SSc. Only 1 test (hydrogen and methane breath tests) is fully validated. Four tests are partially validated, including jejunal cultures, xylose, lactulose tests, and 72 hours fecal fat test. Only 1 of a total of 5 GI tests of bacterial overgrowth (see above) is fully validated in SSc. For clinical trials, fully validated tests are preferred, although some investigators use partially validated tests (4 tests). Further validation of GI tests in SSc is needed.
The Queensland high risk foot form (QHRFF) – is it a reliable and valid clinical research tool for foot disease?

PubMed Central

2014-01-01

Background Foot disease complications, such as foot ulcers and infection, contribute to considerable morbidity and mortality. These complications are typically precipitated by “high-risk factors”, such as peripheral neuropathy and peripheral arterial disease. High-risk factors are more prevalent in specific “at risk” populations such as diabetes, kidney disease and cardiovascular disease. To the best of the authors’ knowledge a tool capturing multiple high-risk factors and foot disease complications in multiple at risk populations has yet to be tested. This study aimed to develop and test the validity and reliability of a Queensland High Risk Foot Form (QHRFF) tool. Methods The study was conducted in two phases. Phase one developed a QHRFF using an existing diabetes foot disease tool, literature searches, stakeholder groups and expert panel. Phase two tested the QHRFF for validity and reliability. Four clinicians, representing different levels of expertise, were recruited to test validity and reliability. Three cohorts of patients were recruited; one tested criterion measure reliability (n = 32), another tested criterion validity and inter-rater reliability (n = 43), and another tested intra-rater reliability (n = 19). Validity was determined using sensitivity, specificity and positive predictive values (PPV). Reliability was determined using Kappa, weighted Kappa and intra-class correlation (ICC) statistics. Results A QHRFF tool containing 46 items across seven domains was developed. Criterion measure reliability of at least moderate categories of agreement (Kappa > 0.4; ICC > 0.75) was seen in 91% (29 of 32) tested items. Criterion validity of at least moderate categories (PPV > 0.7) was seen in 83% (60 of 72) tested items. Inter- and intra-rater reliability of at least moderate categories (Kappa > 0.4; ICC > 0.75) was seen in 88% (84 of 96) and 87% (20 of 23) tested items respectively. 
Conclusions The QHRFF had acceptable validity and reliability across the majority of items; particularly items identifying relevant co-morbidities, high-risk factors and foot disease complications. Recommendations have been made to improve or remove identified weaker items for future QHRFF versions. Overall, the QHRFF possesses suitable practicality, validity and reliability to assess and capture relevant foot disease items across multiple at risk populations. PMID:24468080
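The agreement thresholds above (Kappa > 0.4) refer to chance-corrected agreement between raters. As a rough illustration, unweighted Cohen's kappa for two raters scoring the same items can be computed as follows; the function name and ratings are hypothetical, and the weighted Kappa and ICC statistics also used in the study are not covered.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical present/absent ratings of one item on eight patients.
a = [1, 1, 1, 0, 0, 0, 1, 0]
b = [1, 1, 0, 0, 0, 0, 1, 1]
print(round(cohen_kappa(a, b), 3))
```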
Reconceptualising the external validity of discrete choice experiments.

PubMed

Lancsar, Emily; Swait, Joffre

2014-10-01

External validity is a crucial but under-researched topic when considering using discrete choice experiment (DCE) results to inform decision making in clinical, commercial or policy contexts. We present the theory and tests traditionally used to explore external validity that focus on a comparison of final outcomes and review how this traditional definition has been empirically tested in health economics and other sectors (such as transport, environment and marketing) in which DCE methods are applied. While an important component, we argue that the investigation of external validity should be much broader than a comparison of final outcomes. In doing so, we introduce a new and more comprehensive conceptualisation of external validity, closely linked to process validity, that moves us from the simple characterisation of a model as being or not being externally valid on the basis of predictive performance, to the concept that external validity should be an objective pursued from the initial conceptualisation and design of any DCE. We discuss how such a broader definition of external validity can be fruitfully used and suggest innovative ways in which it can be explored in practice.
FEMFLOW3D; a finite-element program for the simulation of three-dimensional aquifers; version 1.0

USGS Publications Warehouse

Durbin, Timothy J.; Bond, Linda D.

1998-01-01

This document also includes model validation, source code, and example input and output files. Model validation was performed using four test problems. For each test problem, the results of a model simulation with FEMFLOW3D were compared with either an analytic solution or the results of an independent numerical approach. The source code, written in the ANSI x3.9-1978 FORTRAN standard, and the complete input and output of an example problem are listed in the appendixes.
Measuring attention in very old adults using the Test of Everyday Attention.

PubMed

van der Leeuw, Guusje; Leveille, Suzanne G; Jones, Richard N; Hausdorff, Jeffrey M; McLean, Robert; Kiely, Dan K; Gagnon, Margaret; Milberg, William P

2017-09-01

There is a need for validated measures of attention for use in longitudinal studies of older populations. We studied 249 participants aged 80 to 101 years using the population-based MOBILIZE Boston Study. Four subscales of the Test of Everyday Attention (TEA) were included, measuring attention switching, selective, sustained and divided attention, along with a neuropsychological battery including validated measures of multiple cognitive domains measuring attention, executive function and memory. The TEA previously has not been validated in persons aged 80 and older. Among participants who completed the TEA, scores on other attentional measures correlated strongly with TEA domains (R=.60-.70). Proportions of participants with incomplete TEA subscales ranged from 8% (selective attention) to 19% (attentional switching). Reasons for not completing TEA tests included failure to comprehend test instructions despite repetition and practice. These results demonstrate the challenges and potential value of the Test of Everyday Attention in studies of very old populations.
Center for Epidemiologic Studies Depression Scale for Children: psychometric testing of the Chinese version.

PubMed

Li, Ho Cheung William; Chung, Oi Kwan Joyce; Ho, Ka Yan

2010-11-01

This paper is a report of psychometric testing of the Chinese version of the Center for Epidemiologic Studies Depression Scale for Children. The availability of a valid and reliable instrument that accurately detects depressive symptoms in children is crucial before any psychological intervention can be appropriately planned and evaluated. There is no such instrument for Chinese children. A test-retest, within-subjects design was used. A total of 313 primary school students between the ages of 8 and 12 years were invited to participate in the study in 2009. Participants were asked to respond to the Chinese version of the Center for Epidemiologic Studies Depression Scale for Children, the short form of the State Anxiety Scale for Children and Rosenberg's Self-Esteem Scale. The internal consistency, content validity, construct validity and test-retest reliability of the Chinese version of the Center for Epidemiologic Studies Depression Scale for Children were assessed. The newly-translated scale demonstrated adequate internal consistency, good content validity and appropriate convergent and discriminant validity. Confirmatory factor analysis added further evidence of the construct validity of the scale. Results suggest that the newly-translated scale can be used as a self-report assessment tool in detecting depressive symptoms of Chinese children aged between 8 and 12 years. © 2010 Blackwell Publishing Ltd.
Herth hope index: psychometric testing of the Chinese version.

PubMed

Chan, Keung Sum; Li, Ho Cheung William; Chan, Sally Wai-Chi; Lopez, Violeta

2012-09-01

This article is a report on psychometric testing of the Chinese version of the Herth Hope Index. The availability of a valid and reliable instrument that accurately measures the level of hope in patients with heart failure is crucial before any hope-enhancing interventions can be appropriately planned and evaluated. There is no such instrument for Chinese people. A test-retest, within-subjects design was used. A purposive sample of 120 Hong Kong Chinese patients with heart failure between the ages of 60 and 80 years admitted to two medical wards was recruited during an 8-month period in 2009. Participants were asked to respond to the Chinese version of the Herth Hope Index, Hamilton depression rating scale and Rosenberg's self-esteem scale. The internal consistency, content validity, construct validity and test-retest reliability of the Chinese version of the Herth Hope Index were assessed. The newly translated scale demonstrated adequate internal consistency, good content validity and appropriate convergent and discriminant validity. Confirmatory factor analysis added further evidence of the construct validity of the scale. Results suggest that the newly translated scale can be used as a self-report assessment tool in assessing the level of hope in Hong Kong Chinese patients with heart failure. © 2011 Blackwell Publishing Ltd.
Validation of software for calculating the likelihood ratio for parentage and kinship.

PubMed

Drábek, J

2009-03-01

Although the likelihood ratio is a well-known statistical technique, commercial off-the-shelf (COTS) software products for its calculation are not sufficiently validated to suit general requirements for the competence of testing and calibration laboratories (EN/ISO/IEC 17025:2005 norm) per se. The software in question can be considered critical as it directly weighs the forensic evidence allowing judges to decide on guilt or innocence or to identify person or kin (i.e.: in mass fatalities). For these reasons, accredited laboratories shall validate likelihood ratio software in accordance with the above norm. To validate software for calculating the likelihood ratio in parentage/kinship scenarios I assessed available vendors, chose two programs (Paternity Index and familias) for testing, and finally validated them using tests derived from elaboration of the available guidelines for the field of forensics, biomedicine, and software engineering. MS Excel calculation using known likelihood ratio formulas or peer-reviewed results of difficult paternity cases were used as a reference. Using seven testing cases, it was found that both programs satisfied the requirements for basic paternity cases. However, only a combination of two software programs fulfills the criteria needed for our purpose in the whole spectrum of functions under validation with the exceptions of providing algebraic formulas in cases of mutation and/or silent allele.
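To make the kind of reference calculation described above concrete, here is a minimal sketch of the textbook likelihood ratio (paternity index) for the simplest trio case: the mother is typed, the child's paternal allele is unambiguous, and the alleged father carries it. Mutation and silent alleles, which the abstract singles out as the hard cases, are deliberately ignored. All names and allele frequencies are illustrative assumptions and do not reflect how Paternity Index or familias work internally.

```python
from math import prod


def paternity_index(allele_freq, father_homozygous):
    """Single-locus likelihood ratio for an alleged father who carries the
    child's obligate paternal allele (population frequency allele_freq).

    Homozygous father: LR = 1 / p. Heterozygous father: LR = 1 / (2 p).
    """
    if father_homozygous:
        return 1.0 / allele_freq
    return 1.0 / (2.0 * allele_freq)


def combined_paternity_index(loci):
    """Multiply single-locus ratios, assuming independent loci."""
    return prod(paternity_index(p, homo) for p, homo in loci)


def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability from the combined index and a prior (Bayes)."""
    return cpi * prior / (cpi * prior + (1.0 - prior))


# Hypothetical two-locus case: (allele frequency, father homozygous?).
loci = [(0.1, False), (0.1, True)]
cpi = combined_paternity_index(loci)
print(cpi, round(probability_of_paternity(cpi), 4))
```

A hand calculation of this kind is the sort of independent reference a validation exercise can compare a program's output against for basic cases.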
NASA Countermeasures Evaluation and Validation Project

NASA Technical Reports Server (NTRS)

Lundquist, Charlie M.; Paloski, William H. (Technical Monitor)

2000-01-01

To support its ISS and exploration class mission objectives, NASA has developed a Countermeasure Evaluation and Validation Project (CEVP). The goal of this project is to evaluate and validate the optimal complement of countermeasures required to maintain astronaut health, safety, and functional ability during and after short- and long-duration space flight missions. The CEVP is the final element of the process in which ideas and concepts emerging from basic research evolve into operational countermeasures. The CEVP is accomplishing these objectives by conducting operational/clinical research to evaluate and validate countermeasures to mitigate these maladaptive responses. Evaluation is accomplished by testing in space flight analog facilities, and validation is accomplished by space flight testing. Both will utilize a standardized complement of integrated physiological and psychological tests, termed the Integrated Testing Regimen (ITR) to examine candidate countermeasure efficacy and intersystem effects. The CEVP emphasis is currently placed on validating the initial complement of ISS countermeasures targeting bone, muscle, and aerobic fitness; followed by countermeasures for neurological, psychological, immunological, nutrition and metabolism, and radiation risks associated with space flight. This presentation will review the processes, plans, and procedures that will enable CEVP to play a vital role in transitioning promising research results into operational countermeasures necessary to maintain crew health and performance during long duration space flight.
The Validity and reliability of the Comprehensive Home Environment Survey (CHES).

PubMed

Pinard, Courtney A; Yaroch, Amy L; Hart, Michael H; Serrano, Elena L; McFerren, Mary M; Estabrooks, Paul A

2014-01-01

Few comprehensive measures exist to assess contributors to childhood obesity within the home, specifically among low-income populations. The current study describes the modification and psychometric testing of the Comprehensive Home Environment Survey (CHES), an inclusive measure of the home food, physical activity, and media environment related to childhood obesity. The items were tested for content relevance by an expert panel and piloted in the priority population. The CHES was administered to low-income parents of children 5 to 17 years (N = 150), including a subsample of parents a second time and additional caregivers to establish test-retest and interrater reliabilities. Children older than 9 years (n = 95), as well as parents (N = 150) completed concurrent assessments of diet and physical activity behaviors (predictive validity). Analyses and item trimming resulted in 18 subscales and a total score, which displayed adequate internal consistency (α = .74-.92) and high test-retest reliability (r ≥ .73, ps < .01) and interrater reliability (r ≥ .42, ps < .01). The CHES score and a validated screener for the home environment were correlated (r = .37, p < .01; concurrent validity). CHES subscales were significantly correlated with behavioral measures (r = -.20-.55, p < .05; predictive validity). The CHES shows promise as a valid/reliable assessment of the home environment related to childhood obesity, including healthy diet and physical activity.
Inter-Rater Reliability and Validity of the Australian Football League’s Kicking and Handball Tests

PubMed Central

Cripps, Ashley J.; Hopper, Luke S.; Joyce, Christopher

2015-01-01

Talent identification tests used at the Australian Football League’s National Draft Combine assess the capacities of athletes to compete at a professional level. Tests created for the National Draft Combine are also commonly used for talent identification and athlete development in development pathways. The skills tests created by the Australian Football League required players to either handball (striking the ball with the hand) or kick to a series of 6 randomly generated targets. Assessors subjectively rate each skill execution, giving a 0-5 score for each disposal. This study aimed to investigate the inter-rater reliability and validity of the skills tests at an adolescent sub-elite level. Male Australian footballers were recruited from sub-elite adolescent teams (n = 121, age = 15.7 ± 0.3 years, height = 1.77 ± 0.07 m, mass = 69.17 ± 8.08 kg). The coaches (n = 7) of each team were also recruited. Inter-rater reliability was assessed using Inter-class correlations (ICC) and Limits of Agreement statistics. Both the kicking (ICC = 0.96, p < .01) and handball tests (ICC = 0.89, p < .01) demonstrated strong reliability and acceptable levels of absolute agreement. Content validity was determined by examining the test scores’ sensitivity to laterality and distance. Concurrent validity was assessed by comparing coaches’ perceptions of skill to actual test outcomes. Multivariate analysis of variance (MANOVA) examined the main effect of laterality, with scores on the dominant hand (p = .04) and foot (p < .01) significantly higher compared to the non-dominant side. Follow-up univariate analysis reported significant differences at every distance in the kicking test. A poor correlation was found between coaches’ perceptions of skill and testing outcomes. The results of this study show that both skill tests have acceptable inter-rater reliability.
Partial content validity was confirmed for the kicking test; however, further research is required to confirm validity of the handball test. Key points: The skill tests created by the AFL demonstrated acceptable levels of relative and absolute inter-rater reliability. Both of the AFL’s skills tests are able to differentiate between athletes’ dominant and non-dominant limbs. However, only the kicking test could consistently differentiate between score outcomes over a range of Australian Football specific disposal distances. Both tests demonstrated poor concurrent validity, with no correlation found between coaches’ perceptions of technical skills and actual skill outcomes measured. PMID:26336356

Comparison of sub-scaled to full-scaled aircrafts in simulation environment for air traffic management

NASA Astrophysics Data System (ADS)

Elbakary, Mohamed I.; Iftekharuddin, Khan M.; Papelis, Yiannis; Newman, Brett

2017-05-01

Air Traffic Management (ATM) concepts are commonly tested in simulation to obtain preliminary results and validate the concepts before adoption. Recently, researchers have found that simulation alone is not enough because of the complexity associated with ATM concepts. In other words, full-scale tests must eventually take place to provide compelling performance evidence before full implementation is adopted. Testing with full-scale aircraft is a high-cost approach that yields high-confidence results, whereas simulation is a low-risk, low-cost approach with reduced confidence in the results. One possible way to increase confidence in the results while reducing risk and cost is to use unmanned sub-scale aircraft to test new ATM concepts. This paper presents simulation results of using unmanned sub-scale aircraft to implement ATM concepts, compared with full-scale aircraft. The simulation results show that the performance of the sub-scale aircraft is quite comparable to that of the full-scale aircraft, which validates the use of sub-scale aircraft in testing new ATM concepts.
Portuguese-language version of the COPD Assessment Test: validation for use in Brazil

PubMed Central

da Silva, Guilherme Pinheiro Ferreira; Morano, Maria Tereza Aguiar Pessoa; Viana, Cyntia Maria Sampaio; Magalhães, Clarissa Bentes de Araujo; Pereira, Eanes Delgado Barros

2013-01-01

OBJECTIVE: To validate a Portuguese-language version of the COPD assessment test (CAT) for use in Brazil and to assess the reproducibility of this version. METHODS: This was a multicenter study involving patients with stable COPD at two teaching hospitals in the city of Fortaleza, Brazil. Two independent observers (twice in one day) administered the Portuguese-language version of the CAT to 50 patients with COPD. One of those observers again administered the scale to the same patients one week later. At baseline, the patients underwent pulmonary function testing and the six-minute walk test (6MWT), and completed the previously validated Portuguese-language versions of the Saint George's Respiratory Questionnaire (SGRQ), modified Medical Research Council (MMRC) dyspnea scale, and hospital anxiety and depression scale (HADS). RESULTS: Inter-rater and intra-rater reliability was excellent (intraclass correlation coefficient [ICC] = 0.96; 95% CI: 0.93-0.97; p < 0.001; and ICC = 0.98; 95% CI: 0.96-0.98; p < 0.001, respectively). Bland Altman plots showed good test-retest reliability. The CAT total score correlated significantly with spirometry results, 6MWT distance, SGRQ scores, MMRC dyspnea scale scores, and HADS-depression scores. CONCLUSIONS: The Portuguese-language version of the CAT is a valid, reproducible, and reliable instrument for evaluating patients with COPD in Brazil. PMID:24068260
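The Bland Altman plots mentioned above summarise test-retest agreement through the mean difference (bias) between two administrations and its 95% limits of agreement. A minimal sketch of those numbers, assuming paired scores from the same patients, is given below; the function name and the scores are hypothetical, not the study's data.

```python
from statistics import mean, stdev


def bland_altman_limits(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * SD of the paired differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical CAT totals for five patients, one week apart.
week_0 = [18, 22, 15, 30, 25]
week_1 = [17, 23, 15, 28, 26]
bias, lower, upper = bland_altman_limits(week_0, week_1)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

On the plot itself, each pair's difference is drawn against the pair's mean, with horizontal lines at the bias and the two limits.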
Construct Validity of the Nepalese School Leaving English Reading Test

ERIC Educational Resources Information Center

Dawadi, Saraswati; Shrestha, Prithvi N.

2018-01-01

There has been a steady interest in investigating the validity of language tests in the last decades. Despite numerous studies on construct validity in language testing, there are not many studies examining the construct validity of a reading test. This paper reports on a study that explored the construct validity of the English reading test in…
[New questionnaire to assess self-efficacy toward physical activity in children].

PubMed

Aedo, Angeles; Avila, Héctor

2009-10-01

To design a questionnaire for assessment of self-efficacy toward physical activity in school children, as well as to measure its construct validity, test-retest reliability, and internal consistency. A four-stage multimethod approach was used: (1) bibliographic research followed by exploratory study and the formulation of questions and responses based on a dichotomous scale of 14 items; (2) validation of the content by a panel of experts; (3) application of the preliminary version of the questionnaire to a sample of 900 school-aged children in Mexico City; and (4) determination of the construct validity, test-retest reliability, and internal consistency (Cronbach's alpha). Three factors were identified that explain 64.15% of the variance: the search for positive alternatives to physical activity, ability to deal with possible barriers to exercising, and expectations of skill or competence. The model was validated using the goodness of fit, and the result of 65% less than 0.05 indicated that the estimated factor model fit the data. Cronbach's consistency alpha was 0.733; test-retest reliability was 0.867. The scale designed has adequate reliability and validity. These results are a good indicator of self-efficacy toward physical activity in school children, which is important when developing programs intended to promote such behavior in this age group.
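The internal-consistency statistic reported above (Cronbach's alpha) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name and the dichotomous responses are hypothetical, and the study's own data and software are not known from the abstract.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    items = list(zip(*rows))                        # transpose to per-item columns
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])   # variance of total scores
    # Population variances are used in both places, so the ratio equals
    # the one obtained with sample variances.
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical answers of four children to three yes/no items (1 = yes)
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(cronbach_alpha(responses))
```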
Validation of the Vanderbilt Holistic Face Processing Test.

PubMed

Wang, Chao-Chih; Ross, David A; Gauthier, Isabel; Richler, Jennifer J

2016-01-01

The Vanderbilt Holistic Face Processing Test (VHPT-F) is a new measure of holistic face processing with better psychometric properties relative to prior measures developed for group studies (Richler et al., 2014). In fields where psychologists study individual differences, validation studies are commonplace and the concurrent validity of a new measure is established by comparing it to an older measure with established validity. We follow this approach and test whether the VHPT-F measures the same construct as the composite task, which is a group-based measure at the center of the large literature on holistic face processing. In Experiment 1, we found a significant correlation between holistic processing measured in the VHPT-F and the composite task. Although this correlation was small, it was comparable to the correlation between holistic processing measured in the composite task with the same faces, but different target parts (top or bottom), which represents a reasonable upper limit for correlations between the composite task and another measure of holistic processing. These results confirm the validity of the VHPT-F by demonstrating shared variance with another measure of holistic processing based on the same operational definition. These results were replicated in Experiment 2, but only when the demographic profile of our sample matched that of Experiment 1.
Validation of the Vanderbilt Holistic Face Processing Test

PubMed Central

Wang, Chao-Chih; Ross, David A.; Gauthier, Isabel; Richler, Jennifer J.

2016-01-01

The Vanderbilt Holistic Face Processing Test (VHPT-F) is a new measure of holistic face processing with better psychometric properties relative to prior measures developed for group studies (Richler et al., 2014). In fields where psychologists study individual differences, validation studies are commonplace and the concurrent validity of a new measure is established by comparing it to an older measure with established validity. We follow this approach and test whether the VHPT-F measures the same construct as the composite task, which is a group-based measure at the center of the large literature on holistic face processing. In Experiment 1, we found a significant correlation between holistic processing measured in the VHPT-F and the composite task. Although this correlation was small, it was comparable to the correlation between holistic processing measured in the composite task with the same faces, but different target parts (top or bottom), which represents a reasonable upper limit for correlations between the composite task and another measure of holistic processing. These results confirm the validity of the VHPT-F by demonstrating shared variance with another measure of holistic processing based on the same operational definition. These results were replicated in Experiment 2, but only when the demographic profile of our sample matched that of Experiment 1. PMID:27933014
Measuring acuity of the approximate number system reliably and validly: the evaluation of an adaptive test procedure

PubMed Central

Lindskog, Marcus; Winman, Anders; Juslin, Peter; Poom, Leo

2013-01-01

Two studies investigated the reliability and predictive validity of commonly used measures and models of Approximate Number System (ANS) acuity. Study 1 investigated reliability by both an empirical approach and a simulation of maximum obtainable reliability under ideal conditions. Results showed that common measures of the Weber fraction (w) are reliable only when using a substantial number of trials, even under ideal conditions. Study 2 compared different purported measures of ANS acuity for convergent and predictive validity in a within-subjects design and evaluated an adaptive test using the ZEST algorithm. Results showed that the adaptive measure can reduce the number of trials needed to reach acceptable reliability. Only direct tests with non-symbolic numerosity discriminations of stimuli presented simultaneously were related to arithmetic fluency. This correlation remained when controlling for general cognitive ability and perceptual speed. Further, the purported indirect measure of ANS acuity in terms of the Numeric Distance Effect (NDE) was not reliable and showed no sign of predictive validity. The non-symbolic NDE for reaction time was significantly related to direct w estimates in a direction contrary to that expected. Easier stimuli were found to be more reliable, but only harder (7:8 ratio) stimuli contributed to predictive validity. PMID:23964256
An integrated assessment instrument: Developing and validating instrument for facilitating critical thinking abilities and science process skills on electrolyte and nonelectrolyte solution matter

NASA Astrophysics Data System (ADS)

Astuti, Sri Rejeki Dwi; Suyanta, LFX, Endang Widjajanti; Rohaeti, Eli

2017-05-01

Policy changes have altered the demands placed on assessment in the learning process. Assessment now emphasizes not only knowledge but also skills and attitudes, yet in practice there are many obstacles to measuring them. This paper describes how to develop an integrated assessment instrument and how to verify the instrument's content validity and construct validity. The instrument was developed using McIntire's test development model, and development process data were acquired at each step of that model. The initial product was reviewed by three peer reviewers and six expert judges (two subject matter experts, two evaluation experts and two chemistry teachers) to establish content validity. To establish construct validity, the research involved 376 first-grade students of two senior high schools in Bantul Regency. Content validity was analyzed using Aiken's formula. Construct validity was verified by exploratory factor analysis using SPSS ver 16.0. The results show that all constructs in the integrated assessment instrument are valid according to both content validity and construct validity. Therefore, the integrated assessment instrument is suitable for measuring the critical thinking abilities and science process skills of senior high school students on the topic of electrolyte solutions.
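For readers unfamiliar with Aiken's formula, a minimal Python sketch follows. It uses the standard definition V = Σ(r − lo) / (n(c − 1)), where r is each rater's score, lo is the lowest possible score, c is the number of rating categories, and n is the number of raters. The function name, the 1 to 5 rating scale, and the six ratings are assumptions for illustration; the abstract reports neither the scale nor the individual ratings.

```python
def aiken_v(ratings, lo, hi):
    """Aiken's V for one item rated by several judges on an integer scale lo..hi."""
    n = len(ratings)
    s = sum(r - lo for r in ratings)   # each rating measured from the scale minimum
    return s / (n * (hi - lo))         # hi - lo equals c - 1 on an integer scale

# Hypothetical ratings of one item by six expert judges on a 1-5 scale
print(aiken_v([4, 5, 5, 4, 5, 4], lo=1, hi=5))
```

V ranges from 0 to 1, with values closer to 1 indicating stronger agreement among the judges that the item is relevant.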
Development and testing of the ‘Culture of Care Barometer’ (CoCB) in healthcare organisations: a mixed methods study

PubMed Central

Rafferty, Anne Marie; Philippou, Julia; Fitzpatrick, Joanne M; Pike, Geoff; Ball, Jane

2017-01-01

Objective Concerns about care quality have prompted calls to create workplace cultures conducive to high-quality, safe and compassionate care and to provide a supportive environment in which staff can operate effectively. How healthcare organisations assess their culture of care is an important first step in creating such cultures. This article reports on the development and validation of a tool, the Culture of Care Barometer, designed to assess perceptions of a caring culture among healthcare workers preliminary to culture change. Design/setting/participants An exploratory mixed methods study designed to develop and test the validity of a tool to measure ‘culture of care’ through focus groups and questionnaires. Questionnaire development was facilitated through: a literature review, experts generating items of interest and focus group discussions with healthcare staff across specialities, roles and seniority within three types of public healthcare organisations in the UK. The tool was designed to be multiprofessional and pilot tested with a sample of 467 nurses and healthcare support workers in acute care and then validated with a sample of 1698 staff working across acute, mental health and community services in England. Exploratory factor analysis was used to identify dimensions underlying the Barometer. Results Psychometric testing resulted in the development of a 30-item questionnaire linked to four domains with retained items loading to four factors: organisational values (α=0.93, valid n=1568, M=3.7), team support (α=0.93, valid n=1557, M=3.2), relationships with colleagues (α=0.84, valid n=1617, M=4.0) and job constraints (α=0.70, valid n=1616, M=3.3). Conclusions The study developed a valid and reliable instrument with which to gauge the different attributes of care culture perceived by healthcare staff with potential for organisational benchmarking. PMID:28821526
Use of the Ames Check Standard Model for the Validation of Wall Interference Corrections

NASA Technical Reports Server (NTRS)

Ulbrich, N.; Amaya, M.; Flach, R.

2018-01-01

The new check standard model of the NASA Ames 11-ft Transonic Wind Tunnel was chosen for a future validation of the facility's wall interference correction system. The chosen validation approach takes advantage of the fact that test conditions experienced by a large model in the slotted part of the tunnel's test section will change significantly if a subset of the slots is temporarily sealed. Therefore, the model's aerodynamic coefficients have to be recorded, corrected, and compared for two different test section configurations in order to perform the validation. Test section configurations with highly accurate Mach number and dynamic pressure calibrations were selected for the validation. First, the model is tested with all test section slots in open configuration while keeping the model's center of rotation on the tunnel centerline. In the next step, slots on the test section floor are sealed and the model is moved to a new center of rotation that is 33 inches below the tunnel centerline. Then, the original angle of attack sweeps are repeated. Afterwards, wall interference corrections are applied to both test data sets and response surface models of the resulting aerodynamic coefficients in interference-free flow are generated. Finally, the response surface models are used to predict the aerodynamic coefficients for a family of angles of attack while keeping dynamic pressure, Mach number, and Reynolds number constant. The validation is considered successful if the corrected aerodynamic coefficients obtained from the related response surface model pair show good agreement. Residual differences between the corrected coefficient sets will be analyzed as well because they are an indicator of the overall accuracy of the facility's wall interference correction process.
Initial validation of a web-based self-administered neuropsychological test battery for older adults and seniors.

PubMed

Hansen, Tor Ivar; Haferstrom, Elise Christina D; Brunner, Jan F; Lehn, Hanne; Håberg, Asta Kristine

2015-01-01

Computerized neuropsychological tests are effective in assessing different cognitive domains, but are often limited by the need of proprietary hardware and technical staff. Web-based tests can be more accessible and flexible. We aimed to investigate validity, effects of computer familiarity, education, and age, and the feasibility of a new web-based self-administered neuropsychological test battery (Memoro) in older adults and seniors. A total of 62 (37 female) participants (mean age 60.7 years) completed the Memoro web-based neuropsychological test battery and a traditional battery composed of similar tests intended to measure the same cognitive constructs. Participants were assessed on computer familiarity and how they experienced the two batteries. To properly test the factor structure of Memoro, an additional factor analysis in 218 individuals from the HUNT population was performed. Comparing Memoro to traditional tests, we observed good concurrent validity (r = .49-.63). The performance on the traditional and Memoro test battery was consistent, but differences in raw scores were observed with higher scores on verbal memory and lower in spatial memory in Memoro. Factor analysis indicated two factors: verbal and spatial memory. There were no correlations between test performance and computer familiarity after adjustment for age or age and education. Subjects reported that they preferred web-based testing as it allowed them to set their own pace, and they did not feel scrutinized by an administrator. Memoro showed good concurrent validity compared to neuropsychological tests measuring similar cognitive constructs. Based on the current results, Memoro appears to be a tool that can be used to assess cognitive function in older and senior adults. Further work is necessary to ascertain its validity and reliability.
MEASURING SPORT-SPECIFIC PHYSICAL ABILITIES IN MALE GYMNASTS: THE MEN'S GYMNASTICS FUNCTIONAL MEASUREMENT TOOL.

PubMed

Sleeper, Mark D; Kenyon, Lisa K; Elliott, James M; Cheng, M Samuel

2016-12-01

Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts' USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. The relationship between total MGFMT scores and subjects' current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level 3.
Validity and reliability of Internet-based physiotherapy assessment for musculoskeletal disorders: a systematic review.

PubMed

Mani, Suresh; Sharma, Shobha; Omar, Baharudin; Paungmali, Aatit; Joseph, Leonard

2017-04-01

Purpose The purpose of this review is to systematically explore and summarise the validity and reliability of telerehabilitation (TR)-based physiotherapy assessment for musculoskeletal disorders. Method A comprehensive systematic literature review was conducted using a number of electronic databases: PubMed, EMBASE, PsycINFO, Cochrane Library and CINAHL, published between January 2000 and May 2015. Studies that examined the validity and the inter- and intra-rater reliabilities of TR-based physiotherapy assessment for musculoskeletal conditions were included. Two independent reviewers used the Quality Appraisal Tool for studies of diagnostic Reliability (QAREL) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool to assess the methodological quality of reliability and validity studies respectively. Results A total of 898 hits were retrieved, of which 11 articles meeting the inclusion criteria were reviewed. Nine studies explored the concurrent validity, inter- and intra-rater reliabilities, while two studies examined only the concurrent validity. Reviewed studies were moderate to good in methodological quality. The physiotherapy assessments such as pain, swelling, range of motion, muscle strength, balance, gait and functional assessment demonstrated good concurrent validity. However, the reported concurrent validity of lumbar spine posture, special orthopaedic tests, neurodynamic tests and scar assessments ranged from low to moderate. Conclusion TR-based physiotherapy assessment was technically feasible with overall good concurrent validity and excellent reliability, except for lumbar spine posture, orthopaedic special tests, neurodynamic tests and scar assessment.
Assessment of human epidermal model LabCyte EPI-MODEL for in vitro skin irritation testing according to European Centre for the Validation of Alternative Methods (ECVAM)-validated protocol.

PubMed

Katoh, Masakazu; Hamajima, Fumiyasu; Ogasawara, Takahiro; Hata, Ken-Ichiro

2009-06-01

A validation study of an in vitro skin irritation testing method using a reconstructed human skin model has been conducted by the European Centre for the Validation of Alternative Methods (ECVAM), and a protocol using EpiSkin (SkinEthic, France) has been approved. The structural and performance criteria of skin models for testing are defined in the ECVAM Performance Standards announced along with the approval. We have performed several evaluations of the new reconstructed human epidermal model LabCyte EPI-MODEL, and confirmed that it is applicable to skin irritation testing as defined in the ECVAM Performance Standards. We selected 19 materials (nine irritants and ten non-irritants) available in Japan as test chemicals among the 20 reference chemicals described in the ECVAM Performance Standard. A test chemical was applied to the surface of the LabCyte EPI-MODEL for 15 min, after which it was completely removed and the model then post-incubated for 42 hr. Cell viability was measured by MTT assay and skin irritancy of the test chemical evaluated. In addition, interleukin-1 alpha (IL-1alpha) concentration in the culture supernatant after post-incubation was measured to provide a complementary evaluation of skin irritation. Evaluation of the 19 test chemicals resulted in 79% accuracy, 78% sensitivity and 80% specificity, confirming that the in vitro skin irritancy of the LabCyte EPI-MODEL correlates highly with in vivo skin irritation. These results suggest that LabCyte EPI-MODEL is applicable to the skin irritation testing protocol set out in the ECVAM Performance Standards.
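The accuracy, sensitivity and specificity figures above can be reproduced from a 2x2 confusion table, as the short Python sketch below shows. The abstract gives only the group sizes (nine irritants, ten non-irritants) and the percentages; the cell counts used here (7 of 9 irritants and 8 of 10 non-irritants classified correctly) are inferred as the counts consistent with those percentages and are an assumption, not reported data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-table counts."""
    sensitivity = tp / (tp + fn)            # irritants correctly identified
    specificity = tn / (tn + fp)            # non-irritants correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Inferred counts (assumption): 7/9 irritants and 8/10 non-irritants correct
acc, sens, spec = classification_metrics(tp=7, fn=2, tn=8, fp=2)
print(f"accuracy={acc:.0%} sensitivity={sens:.0%} specificity={spec:.0%}")
```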
Validation of mechanical models for reinforced concrete structures: Presentation of the French project “Benchmark des Poutres de la Rance”

NASA Astrophysics Data System (ADS)

L'Hostis, V.; Brunet, C.; Poupard, O.; Petre-Lazar, I.

2006-11-01

Several ageing models are available for the prediction of the mechanical consequences of rebar corrosion. They are used for service life prediction of reinforced concrete structures. Concerning corrosion diagnosis of reinforced concrete, some Non Destructive Testing (NDT) tools have been developed, and have been in use for some years. However, these developments require validation on existing concrete structures. The French project “Benchmark des Poutres de la Rance” contributes to this aspect. It has two main objectives: (i) validation of mechanical models to estimate the influence of rebar corrosion on the load bearing capacity of a structure, (ii) qualification of the use of the NDT results to collect information on steel corrosion within reinforced-concrete structures. Ten French and European institutions from both academic research laboratories and industrial companies contributed during the years 2004 and 2005. This paper presents the project, which was divided into several work packages: (i) the reinforced concrete beams were characterized using non-destructive testing tools, (ii) the mechanical behaviour of the beams was experimentally tested, (iii) complementary laboratory analyses were performed and (iv) finally, numerical simulation results were compared to the experimental results obtained with the mechanical tests.
14 CFR 91.1041 - Aircraft proving and validation tests.

Code of Federal Regulations, 2014 CFR

2014-01-01

... 14 Aeronautics and Space 2 2014-01-01 2014-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...
14 CFR 91.1041 - Aircraft proving and validation tests.

Code of Federal Regulations, 2012 CFR

2012-01-01

... 14 Aeronautics and Space 2 2012-01-01 2012-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...
14 CFR 91.1041 - Aircraft proving and validation tests.

Code of Federal Regulations, 2013 CFR

2013-01-01

... 14 Aeronautics and Space 2 2013-01-01 2013-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...
14 CFR 91.1041 - Aircraft proving and validation tests.

Code of Federal Regulations, 2011 CFR

2011-01-01

... 14 Aeronautics and Space 2 2011-01-01 2011-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...
14 CFR 91.1041 - Aircraft proving and validation tests.

Code of Federal Regulations, 2010 CFR

2010-01-01

... 14 Aeronautics and Space 2 2010-01-01 2010-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...

Impact of Learning Model Based on Cognitive Conflict toward Student’s Conceptual Understanding

NASA Astrophysics Data System (ADS)

Mufit, F.; Festiyed, F.; Fauzan, A.; Lufri, L.

2018-04-01

A frequent problem in physics learning is misconception and poor understanding of concepts. Misconceptions occur not only among school students but also among college students and teachers. Existing learning models have had little impact on improving conceptual understanding or remediating student misconceptions. This study aims to examine the impact of a cognitive conflict-based learning model on improving conceptual understanding and remediating student misconceptions. The research method used is design and development research. The product developed is a cognitive conflict-based learning model along with its components. This article reports on the product design results, validity tests, and practicality test. The study resulted in the design of a cognitive conflict-based learning model with 4 learning syntaxes, namely (1) preconception activation, (2) presentation of cognitive conflict, (3) discovery of concepts and equations, and (4) reflection. The results of validity tests by several experts on the aspects of content, didactics, and appearance or language meet the "very valid" criterion. Product trial results also show that the product is very practical to use. Based on pretest and posttest results, the cognitive conflict-based learning model has a good impact on improving conceptual understanding and remediating misconceptions, especially in high-ability students.
[Validity and clinical utility of an observation grid (PACSLAC-F) for evaluating pain in seniors with dementia living in long-term care].

PubMed

Aubin, Michèle; Verreault, René; Savoie, Maryse; LeMay, Sylvie; Hadjistavropoulos, Thomas; Fillion, Lise; Beaulieu, Marie; Viens, Chantal; Bergeron, Rénald; Vézina, Lucie; Misson, Lucie; Fuchs-Lacelle, Shannon

2008-01-01

This study presents the validation of the French Canadian version (PACSLAC-F) of the Pain Assessment Checklist for Seniors with Limited Ability to Communicate (PACSLAC). Unlike the published validation of the English version of the PACSLAC, which was validated retrospectively, the French version was validated prospectively. The PACSLAC-F was completed by nurses working in long-term care facilities after observing 86 seniors, with severe cognitive impairment, in calm, painful or distressing but non-painful situations. The test-retest and inter-observer reliability, the internal consistency, and the discriminant validity were found to be satisfactory. To evaluate the convergent validity with the DOLOPLUS-2 and the clinical relevance of the PACSLAC, it was also completed by nurses during their work shift, with 26 additional patients, for three days per week during a period of four weeks. These results encourage us to test the PACSLAC in a comprehensive program of pain management targeting this population.
The Development and Validation of a Spanish Language Version of the Test Anxiety Inventory for Children and Adolescents

ERIC Educational Resources Information Center

Unruh, Susan M.; Lowe, Patricia A.

2010-01-01

This study details the development and validation of a Spanish language version of the Test Anxiety Inventory for Children and Adolescents (TAICA) for elementary and secondary students. In this study, the TAICA was adapted and administered to a sample of 197 students, 87 males and 110 females, aged 9 to 19 years, in Grades 4 to 12. Results of an…
A Cross-Validation of easyCBM Mathematics Cut Scores in Washington State: 2009-2010 Test. Technical Report #1105

ERIC Educational Resources Information Center

Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

2011-01-01

In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM[R] mathematics test in the state of Washington. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Washington state…
A Cross-Validation of easyCBM[R] Mathematics Cut Scores in Oregon: 2009-2010. Technical Report #1104

ERIC Educational Resources Information Center

Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

2011-01-01

In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM[R] mathematics test in Oregon. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Oregon state test was used as the…
easyCBM Beginning Reading Measures: Grades K-1 Alternate Form Reliability and Criterion Validity with the SAT-10. Technical Report #1403

ERIC Educational Resources Information Center

Wray, Kraig; Lai, Cheng-Fei; Sáez, Leilani; Alonzo, Julie; Tindal, Gerald

2013-01-01

We report the results of an alternate form reliability and criterion validity study of kindergarten and grade 1 (N = 84-199) reading measures from the easyCBM© assessment system and Stanford Early School Achievement Test/Stanford Achievement Test, 10th edition (SESAT/SAT-10) across 5 time points. The alternate form reliabilities ranged from…
Symptom validity testing in memory clinics: Hippocampal-memory associations and relevance for diagnosing mild cognitive impairment.

PubMed

Rienstra, Anne; Groot, Paul F C; Spaan, Pauline E J; Majoie, Charles B L M; Nederveen, Aart J; Walstra, Gerard J M; de Jonghe, Jos F M; van Gool, Willem A; Olabarriaga, Silvia D; Korkhov, Vladimir V; Schmand, Ben

2013-01-01

Patients with mild cognitive impairment (MCI) do not always convert to dementia. In such cases, abnormal neuropsychological test results may not validly reflect cognitive symptoms due to brain disease, and the usual brain-behavior relationships may be absent. This study examined symptom validity in a memory clinic sample and its effect on the associations between hippocampal volume and memory performance. Eleven of 170 consecutive patients (6.5%; 13% of patients younger than 65 years) referred to memory clinics showed noncredible performance on symptom validity tests (SVTs, viz. Word Memory Test and Test of Memory Malingering). They were compared to a demographically matched group (n = 57) selected from the remaining patients. Hippocampal volume, measured by an automated volumetric method (Freesurfer), was correlated with scores on six verbal memory tests. The median correlation was r = .49 in the matched group. However, the relation was absent (median r = -.11) in patients who failed SVTs. Memory clinic samples may include patients who show noncredible performance, which invalidates their MCI diagnosis. This underscores the importance of applying SVTs in evaluating patients with cognitive complaints that may signify a predementia stage, especially when these patients are relatively young.
Validation of the Economic and Health Outcomes Model of Type 2 Diabetes Mellitus (ECHO-T2DM).

PubMed

Willis, Michael; Johansen, Pierre; Nilsson, Andreas; Asseburg, Christian

2017-03-01

The Economic and Health Outcomes Model of Type 2 Diabetes Mellitus (ECHO-T2DM) was developed to address study questions pertaining to the cost-effectiveness of treatment alternatives in the care of patients with type 2 diabetes mellitus (T2DM). Naturally, the usefulness of a model is determined by the accuracy of its predictions. A previous version of ECHO-T2DM was validated against actual trial outcomes and the model predictions were generally accurate. However, there have been recent upgrades to the model, which modify model predictions and necessitate an update of the validation exercises. The objectives of this study were to extend the methods available for evaluating model validity, to conduct a formal model validation of ECHO-T2DM (version 2.3.0) in accordance with the principles espoused by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) and the Society for Medical Decision Making (SMDM), and secondarily to evaluate the relative accuracy of four sets of macrovascular risk equations included in ECHO-T2DM. We followed the ISPOR/SMDM guidelines on model validation, evaluating face validity, verification, cross-validation, and external validation. Model verification involved 297 'stress tests', in which specific model inputs were modified systematically to ascertain correct model implementation. Cross-validation consisted of a comparison between ECHO-T2DM predictions and those of the seminal National Institutes of Health model. In external validation, study characteristics were entered into ECHO-T2DM to replicate the clinical results of 12 studies (including 17 patient populations), and model predictions were compared to observed values using established statistical techniques as well as measures of average prediction error, separately for the four sets of macrovascular risk equations supported in ECHO-T2DM. Sub-group analyses were conducted for dependent vs. independent outcomes and for microvascular vs. macrovascular vs. mortality endpoints. All stress tests were passed. ECHO-T2DM replicated the National Institutes of Health cost-effectiveness application with numerically similar results. In external validation of ECHO-T2DM, model predictions agreed well with observed clinical outcomes. For all sets of macrovascular risk equations, the results were close to the intercept and slope coefficients corresponding to a perfect match, resulting in high R² and failure to reject concordance using an F test. The results were similar for sub-groups of dependent and independent validation, with some degree of under-prediction of macrovascular events. ECHO-T2DM continues to match health outcomes in clinical trials in T2DM, with prediction accuracy similar to other leading models of T2DM.
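The comparison of predictions with observed values described in this abstract can be illustrated with a small sketch. It assumes that observed outcomes are regressed on predicted outcomes by ordinary least squares, so that a perfect match corresponds to an intercept of 0 and a slope of 1. The function name `calibration_fit` and the numbers are hypothetical and are not taken from the study. The F test for concordance mentioned in the abstract is not reproduced here.

```python
import numpy as np

def calibration_fit(predicted, observed):
    """Fit observed = intercept + slope * predicted and return (intercept, slope, R^2)."""
    x = np.asarray(predicted, dtype=float)
    y = np.asarray(observed, dtype=float)
    # np.polyfit returns coefficients from the highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    ss_res = float(np.sum(residuals ** 2))       # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2))  # total variation in observed values
    r_squared = 1.0 - ss_res / ss_tot
    return float(intercept), float(slope), r_squared

# Hypothetical cumulative event rates, for illustration only (not study data).
predicted = [0.05, 0.12, 0.20, 0.31]
observed = [0.06, 0.11, 0.22, 0.30]
intercept, slope, r_squared = calibration_fit(predicted, observed)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.3f}")
```

An intercept near 0 and a slope near 1 would indicate close agreement. A slope below 1 with a positive intercept would suggest the model over-predicts at high rates and under-predicts at low ones.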
[Spanish version of Adonis Complex Questionnaire. A questionnaire to test the muscle dimorphism and vigorexy].

PubMed

Latorre-Román, Pedro Ángel; Garrido-Ruiz, Antonio; García-Pinillos, Felipe

2014-11-08

To validate the Spanish version of the Adonis Complex Questionnaire in bodybuilders. Participants included 99 bodybuilders who train regularly (age: 25.45±5.19 y; BMI=24.53±1.89). To test discriminant and concurrent validity, the Exercise Dependence Scale-Revised (EDS-R) and the Eating Attitudes Test (EAT-26) were used. The scale's psychometric properties were obtained through a concurrent validity process, principal component factor analysis, internal consistency, and test-retest reliability. The internal consistency of the questionnaire was high for the total scale (Cronbach's alpha = 0.880). The intraclass correlation coefficient (ICC) used to test the temporal consistency of the questionnaire was 0.707 (95% CI = 0.336-0.871). The questionnaire showed concurrent validity with the EDS-R (r=0.613, p<0.001) and the EAT-26 (r=0.422, p<0.001). The results showed a three-factor structure (Factor 1: psychosocial effect of physical appearance; Factor 2: control of physical appearance; Factor 3: concern about physical appearance) that explains 65.29% of the variance. The Adonis Complex Questionnaire shows adequate psychometric properties and is a valid and reliable measure of vigorexy and muscle dimorphism in bodybuilders. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
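Internal consistency in this and several other records is reported as Cronbach's alpha. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response matrix are hypothetical and are not data from any study listed here.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    data = np.asarray(scores, dtype=float)
    n_items = data.shape[1]
    if n_items < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    item_variances = data.var(axis=0, ddof=1)      # sample variance of each item
    total_variance = data.sum(axis=1).var(ddof=1)  # sample variance of the total score
    return n_items / (n_items - 1) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical 5-point Likert responses: rows are respondents, columns are items.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(responses), 3))
```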
[The appraisal of reliability and validity of subjective workload assessment technique and NASA-task load index].

PubMed

Xiao, Yuan-mei; Wang, Zhi-ming; Wang, Mian-zhen; Lan, Ya-jia

2005-06-01

To test the reliability and validity of two mental workload assessment scales, i.e. the subjective workload assessment technique (SWAT) and the NASA task load index (NASA-TLX). One thousand two hundred and sixty-eight mental workers were sampled by randomized cluster sampling from various occupations, such as scientific research, education, administration and medicine. Re-test reliability, split-half reliability, Cronbach's alpha coefficient and correlation coefficients between item scores and total score were used to test reliability. The test of validity included structure validity. The re-test reliability coefficients of the two scales and their items ranged from 0.516 to 0.753 (P < 0.01), indicating that the two scales had good re-test reliability. The split-half reliability of SWAT was 0.645, its Cronbach's alpha coefficient was more than 0.80, and all the correlation coefficients between its item scores and total score were more than 0.70. For NASA-TLX, both the split-half reliability and Cronbach's alpha coefficient were more than 0.80, and the correlation coefficients between its item scores and total score were all more than 0.60 (P < 0.01) except for the performance item. Both scales had good internal consistency. The Pearson correlation coefficient between the two scales was 0.492 (P < 0.01), implying that the results of the two scales had good consistency. Factor analysis showed that the two scales had good structure validity. Both SWAT and NASA-TLX have good reliability and validity and may be used as valid tools to assess mental workload in China after being revised properly.
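Split-half reliability, one of the coefficients reported above, can be sketched as follows. The abstract does not say how the halves were formed or whether a correction was applied. This example assumes an odd/even item split and the Spearman-Brown correction, 2r / (1 + r), both of which are common choices. The function name `split_half_reliability` and the scores are hypothetical.

```python
import numpy as np

def split_half_reliability(scores):
    """Odd/even split-half reliability with the Spearman-Brown correction."""
    data = np.asarray(scores, dtype=float)
    first_half = data[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    second_half = data[:, 1::2].sum(axis=1)  # items 2, 4, 6, ...
    r = np.corrcoef(first_half, second_half)[0, 1]  # Pearson r between half scores
    return 2.0 * r / (1.0 + r)               # step up to full test length

# Hypothetical item scores: rows are respondents, columns are items.
scores = [
    [3, 4, 3, 4],
    [1, 2, 2, 1],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 3, 4, 4],
]
print(round(split_half_reliability(scores), 3))
```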
The dialysis orders objective structured clinical examination (OSCE): a formative assessment for nephrology fellows

PubMed Central

Prince, Lisa K; Campbell, Ruth C; Gao, Sam W; Kendrick, Jessica; Lebrun, Christopher J; Little, Dustin J; Mahoney, David L; Maursetter, Laura A; Nee, Robert; Saddler, Mark; Watson, Maura A

2018-01-01

Abstract Background Few quantitative nephrology-specific simulations assess fellow competency. We describe the development and initial validation of a formative objective structured clinical examination (OSCE) assessing fellow competence in ordering acute dialysis. Methods The three test scenarios were acute continuous renal replacement therapy, chronic dialysis initiation in moderate uremia and acute dialysis in end-stage renal disease-associated hyperkalemia. The test committee included five academic nephrologists and four clinically practicing nephrologists outside of academia. There were 49 test items (58 points). A passing score was 46/58 points. No item had median relevance less than ‘important’. The content validity index was 0.91. Ninety-five percent of positive-point items were easy–medium difficulty. Preliminary validation was by 10 board-certified volunteers, not test committee members, a median of 3.5 years from graduation. The mean score was 49 [95% confidence interval (CI) 46–51], κ = 0.68 (95% CI 0.59–0.77), Cronbach’s α = 0.84. Results We subsequently administered the test to 25 fellows. The mean score was 44 (95% CI 43–45); 36% passed the test. Fellows scored significantly less than validators (P < 0.001). Of evidence-based questions, 72% were answered correctly by validators and 54% by fellows (P = 0.018). Fellows and validators scored least well on the acute hyperkalemia question. In self-assessing proficiency, 71% of fellows surveyed agreed or strongly agreed that the OSCE was useful. Conclusions The OSCE may be used to formatively assess fellow proficiency in three common areas of acute dialysis practice. Further validation studies are in progress. PMID:29644053
Design and validation of a self-administered test to assess bullying (bull-M) in high school Mexicans: a pilot study

PubMed Central

2013-01-01

Background Bullying (Bull) is a public health problem worldwide, and Mexico is not exempt. However, its epidemiology and early detection in our country are limited, in part, by the lack of validated tests to ensure the respondents’ anonymity. The aim of this study was to validate a self-administered test (Bull-M) for assessing Bull among high-school Mexicans. Methods Experts and school teachers from highly violent areas of Ciudad Juarez (Chihuahua, México), reported common Bull behaviors. Then, a 10-item test was developed based on twelve of these behaviors; the students’ and peers’ participation in Bull acts and in some somatic consequences in Bull victims with a 5-point Likert frequency scale. Validation criteria were: content (CV, judges); reliability [Cronbach’s alpha (CA), test-retest (Spearman correlation, rs)]; construct [principal component (PCA), confirmatory factor (CFA), goodness-of-fit (GF) analysis]; and convergent (Bull-M vs. Bull-S test) validity. Results Bull-M showed good reliability (CA = 0.75, rs = 0.91; p < 0.001). Two factors were identified (PCA) and confirmed (CFA): “bullying me (victim)” and “bullying others (aggressor)”. GF indices were: Root mean square error of approximation (0.031), GF index (0.97), and normalized fit index (0.92). Bull-M was as good as Bull-S for measuring Bull prevalence. Conclusions Bull-M has good reliability and convergent validity and a bi-modal factor structure for detecting Bull victims and aggressors; however, its external validity and sensitivity should be analyzed on a wider and different population. PMID:23577755
SEQUenCE: a service user-centred quality of care instrument for mental health services.

PubMed

Hester, Lorraine; O'Doherty, Lorna Jane; Schnittger, Rebecca; Skelly, Niamh; O'Donnell, Muireann; Butterly, Lisa; Browne, Robert; Frorath, Charlotte; Morgan, Craig; McLoughlin, Declan M; Fearon, Paul

2015-08-01

To develop a quality of care instrument that is grounded in the service user perspective and validate it in a mental health service. The instrument (SEQUenCE (SErvice user QUality of CarE)) was developed through analysis of focus group data and clinical practice guidelines, and refined through field-testing and psychometric analyses. All participants were attending an independent mental health service in Ireland. Participants had a diagnosis of bipolar affective disorder (BPAD) or a psychotic disorder. Twenty-nine service users participated in six focus group interviews. Seventy-one service users participated in field-testing: 10 judged the face validity of an initial 61-item instrument; 28 completed a revised 52-item instrument from which 12 items were removed following test-retest and convergent validity analyses; 33 completed the resulting 40-item instrument. Test-retest reliability, internal consistency and convergent validity of the instrument. The final instrument showed acceptable test-retest reliability at 5-7 days (r = 0.65; P < 0.001), good convergent validity with the Verona Service Satisfaction Scale (r = 0.84, P < 0.001) and good internal consistency (Cronbach's alpha = 0.87). SEQUenCE is a valid, reliable scale that is grounded in the service user perspective and suitable for routine use. It may serve as a useful tool in individual care planning, service evaluation and research. The instrument was developed and validated with service users with a diagnosis of either BPAD or a psychotic disorder; it does not yet have established external validity for other diagnostic groups. © The Author 2015. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
Validating emotional attention regulation as a component of emotional intelligence: A Stroop approach to individual differences in tuning in to and out of nonverbal cues.

PubMed

Elfenbein, Hillary Anger; Jang, Daisung; Sharma, Sudeep; Sanchez-Burks, Jeffrey

2017-03-01

Emotional intelligence (EI) has captivated researchers and the public alike, but it has been challenging to establish its components as objective abilities. Self-report scales lack divergent validity from personality traits, and few ability tests have objectively correct answers. We adapt the Stroop task to introduce a new facet of EI called emotional attention regulation (EAR), which involves focusing emotion-related attention for the sake of information processing rather than for the sake of regulating one's own internal state. EAR includes 2 distinct components. First, tuning in to nonverbal cues involves identifying nonverbal cues while ignoring alternate content, that is, emotion recognition under conditions of distraction by competing stimuli. Second, tuning out of nonverbal cues involves ignoring nonverbal cues while identifying alternate content, that is, the ability to interrupt emotion recognition when needed to focus attention elsewhere. An auditory test of valence included positive and negative words spoken in positive and negative vocal tones. A visual test of approach-avoidance included green- and red-colored facial expressions depicting happiness and anger. The error rates for incongruent trials met the key criteria for establishing the validity of an EI test, in that the measure demonstrated test-retest reliability, convergent validity with other EI measures, divergent validity from factors such as general processing speed and mostly personality, and predictive validity in this case for well-being. By demonstrating that facets of EI can be validly theorized and empirically assessed, results also speak to the validity of EI more generally. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Development and validation of a smartphone-based digits-in-noise hearing test in South African English.

PubMed

Potgieter, Jenni-Marí; Swanepoel, De Wet; Myburgh, Hermanus Carel; Hopper, Thomas Christopher; Smits, Cas

2015-07-01

The objective of this study was to develop and validate a smartphone-based digits-in-noise hearing test for South African English. Single digits (0-9) spoken by a female first-language English speaker were recorded. Level corrections were applied to create a set of homogeneous digits with steep speech recognition functions. A smartphone application was created to utilize 120 digit-triplets in noise as test material. An adaptive test procedure determined the speech reception threshold (SRT). Experiments were performed to determine headphone effects on the SRT and to establish normative data. Participants consisted of 40 normal-hearing subjects with thresholds ≤15 dB across the frequency spectrum (250-8000 Hz) and 186 subjects with normal hearing in both ears, or normal hearing in the better ear. The results show steep speech recognition functions with a slope of 20%/dB for digit-triplets presented in noise using the smartphone application. The results of five headphone types indicate that the smartphone-based hearing test is reliable and can be conducted using standard Android smartphone headphones or clinical headphones. A digits-in-noise hearing test was developed and validated for South Africa. The mean SRT and speech recognition functions correspond to previously developed telephone-based digits-in-noise tests.
Multi-Evaporator Miniature Loop Heat Pipe for Small Spacecraft Thermal Control. Part 2; Validation Results

NASA Technical Reports Server (NTRS)

Ku, Jentung; Ottenstein, Laura; Douglas, Donya; Hoang, Triem

2010-01-01

Under NASA's New Millennium Program Space Technology 8 (ST 8) Project, Goddard Space Flight Center has conducted a Thermal Loop experiment to advance the maturity of the Thermal Loop technology from proof of concept to prototype demonstration in a relevant environment, i.e. from a technology readiness level (TRL) of 3 to a level of 6. The Thermal Loop is an advanced thermal control system consisting of a miniature loop heat pipe (MLHP) with multiple evaporators and multiple condensers designed for future small system applications requiring low mass, low power, and compactness. The MLHP retains all features of state-of-the-art loop heat pipes (LHPs) and offers additional advantages to enhance the functionality, performance, versatility, and reliability of the system. An MLHP breadboard was built and tested in the laboratory and thermal vacuum environments for the TRL 4 and TRL 5 validations, respectively, and an MLHP proto-flight unit was built and tested in a thermal vacuum chamber for the TRL 6 validation. In addition, an analytical model was developed to simulate the steady state and transient behaviors of the MLHP during various validation tests. The MLHP demonstrated excellent performance during experimental tests and the analytical model predictions agreed very well with experimental data. All success criteria at various TRLs were met. Hence, the Thermal Loop technology has reached a TRL of 6. This paper presents the validation results, both experimental and analytical, of such a technology development effort.
Dutch translation and cross-cultural validation of the Adult Social Care Outcomes Toolkit (ASCOT).

PubMed

van Leeuwen, Karen M; Bosmans, Judith E; Jansen, Aaltje Pd; Rand, Stacey E; Towers, Ann-Marie; Smith, Nick; Razik, Kamilla; Trukeschitz, Birgit; van Tulder, Maurits W; van der Horst, Henriette E; Ostelo, Raymond W

2015-05-13

The Adult Social Care Outcomes Toolkit was developed to measure outcomes of social care in England. In this study, we translated the four level self-completion version (SCT-4) of the ASCOT for use in the Netherlands and performed a cross-cultural validation. The ASCOT SCT-4 was translated into Dutch following international guidelines, including two forward and back translations. The resulting version was pilot tested among frail older adults using think-aloud interviews. Furthermore, using a subsample of the Dutch ACT-study, we investigated test-retest reliability and construct validity and compared response distributions with data from a comparable English study. The pilot tests showed that translated items were in general understood as intended, that most items were reliable, and that the response distributions of the Dutch translation and associations with other measures were comparable to the original English version. Based on the results of the pilot tests, some small modifications and a revision of the Dignity items were proposed for the final translation, which were approved by the ASCOT development team. The complete original English version and the final Dutch translation can be obtained after registration on the ASCOT website ( http://www.pssru.ac.uk/ascot ). This study provides preliminary evidence that the Dutch translation of the ASCOT is valid, reliable and comparable to the original English version. We recommend further research to confirm the validity of the modified Dutch ASCOT translation.
Evaluation of two selection tests for recruitment into radiology specialty training.

PubMed

Patterson, Fiona; Knight, Alec; McKnight, Liam; Booth, Thomas C

2016-07-11

This study evaluated whether two selection tests previously validated for primary care General Practice (GP) trainee selection could provide a valid shortlisting selection method for entry into specialty training for the secondary care specialty of radiology. We conducted a retrospective analysis of data from radiology applicants who also applied to UK GP specialty training or Core Medical Training. The psychometric properties of the two selection tests, a clinical problem solving (CPS) test and situational judgement test (SJT), were analysed to evaluate their reliability. Predictive validity of the tests was analysed by comparing them with the current radiology selection assessments, and the licensure examination results taken after the first stage of training (Fellowship of the Royal College of Radiologists (FRCR) Part 1). The internal reliability of the two selection tests in the radiology applicant sample was good (α ≥ 0.80). The average correlation with radiology shortlisting selection scores was r = 0.26 for the CPS (with p < 0.05 in 5 of 11 shortlisting centres), r = 0.15 for the SJT (with p < 0.05 in 2 of 11 shortlisting centres) and r = 0.25 (with p < 0.05 in 5 of 11 shortlisting centres) for the two tests combined. The CPS test scores significantly correlated with performance in both components of the FRCR Part 1 examinations (r = 0.5 anatomy; r = 0.4 physics; p < 0.05 for both). The SJT did not correlate with either component of the examination. The current CPS test may be an appropriate selection method for shortlisting in radiology but would benefit from further refinement for use in radiology to ensure that the test specification is relevant. The evidence on whether the SJT may be appropriate for shortlisting in radiology is limited. However, these results may be expected to some extent since the SJT is designed to measure non-academic attributes. Further validation work (e.g. with non-academic outcome variables) is required to evaluate whether an SJT will add value in recruitment for radiology specialty training and will further inform construct validity of SJTs as a selection methodology.
Analytic Validation of Immunohistochemical Assays: A Comparison of Laboratory Practices Before and After Introduction of an Evidence-Based Guideline.

PubMed

Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Souers, Rhona J; Fatheree, Lisa A; Volmar, Keith E; Stuart, Lauren N; Nowak, Jan A; Astles, J Rex; Nakhleh, Raouf E

2017-09-01

Laboratories must demonstrate analytic validity before any test can be used clinically, but studies have shown inconsistent practices in immunohistochemical assay validation. To assess changes in immunohistochemistry analytic validation practices after publication of an evidence-based laboratory practice guideline. A survey on current immunohistochemistry assay validation practices and on the awareness and adoption of a recently published guideline was sent to subscribers enrolled in one of 3 relevant College of American Pathologists proficiency testing programs and to additional nonsubscribing laboratories that perform immunohistochemical testing. The results were compared with an earlier survey of validation practices. Analysis was based on responses from 1085 laboratories that perform immunohistochemical staining. Of 1057 responses, 65.4% (691) were aware of the guideline recommendations before this survey was sent and 79.9% (550 of 688) of those have already adopted some or all of the recommendations. Compared with the 2010 survey, a significant number of laboratories now have written validation procedures for both predictive and nonpredictive marker assays and specifications for the minimum numbers of cases needed for validation. There was also significant improvement in compliance with validation requirements, with 99% (100 of 102) having validated their most recently introduced predictive marker assay, compared with 74.9% (326 of 435) in 2010. The difficulty in finding validation cases for rare antigens and resource limitations were cited as the biggest challenges in implementing the guideline. Dissemination of the 2014 evidence-based guideline validation practices had a positive impact on laboratory performance; some or all of the recommendations have been adopted by nearly 80% of respondents.
On the Impact of Illustrated Assessment Tool on Paragraph Writing of High School Graduates of Qom, Iran

ERIC Educational Resources Information Center

Bagheridoust, Esmaeil; Husseini, Zahra

2011-01-01

Writing as one important skill in language proficiency demands validity, hence high schools are real places in which valid results are needed for high-stake decisions. Unrealistic and non-viable tests result in improper and invalid interpretation and use. Illustrations without any written research have proved their effectiveness in whatsoever…

Diagnostic Validity of High-Density Barium Sulfate in Gastric Cancer Screening: Follow-up of Screenees by Record Linkage with the Osaka Cancer Registry

PubMed Central

Yamamoto, Kenyu; Yamazaki, Hideo; Kuroda, Chikazumi; Kubo, Tsugio; Oshima, Akira; Katsuda, Toshizo; Kuwano, Tadao; Takeda, Yoshihiro

2010-01-01

Background The use of high-density barium sulfate was recommended by the Japan Society of Gastroenterological Cancer Screening (JSGCS) in 2004. We evaluated the diagnostic validity of gastric cancer screening that used high-density barium sulfate. Methods The study subjects were 171 833 residents of Osaka, Japan who underwent gastric cancer screening tests at the Osaka Cancer Prevention and Detection Center during the period from 1 January 2000 through 31 December 2001. Screening was conducted using either high-density barium sulfate (n = 48 336) or moderate-density barium sulfate (n = 123 497). The subjects were followed up and their medical records were linked to those of the Osaka Cancer Registry through 31 December 2002. The results of follow-up during 1 year were defined as the gold standard, and test performance values were calculated. Results The sensitivity and specificity of the screening test using moderate-density barium sulfate were 92.3% and 91.0%, respectively, while the sensitivity and specificity of the high-density barium test were 91.8% and 91.4%, respectively. The results of area under receiver-operating-characteristic (ROC) curve analysis revealed no significant difference between the 2 screening tests. Conclusions Screening tests using high- and moderate-density barium sulfate had similar validity, as determined by sensitivity, specificity, and ROC curve analysis. PMID:20551581
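The test performance values reported in this abstract follow from a 2×2 table of screening result against the gold standard. The sketch below shows the standard definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The function name `sensitivity_specificity` and the counts are hypothetical. The abstract does not report the underlying cell counts, so none are reconstructed here.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one diseased and one non-diseased subject")
    sensitivity = true_pos / (true_pos + false_neg)  # diseased subjects who screen positive
    specificity = true_neg / (true_neg + false_pos)  # non-diseased subjects who screen negative
    return sensitivity, specificity

# Hypothetical counts, for illustration only (not data from the study).
sens, spec = sensitivity_specificity(true_pos=90, false_neg=10, true_neg=900, false_pos=100)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```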
Reliability and validity of three pain provocation tests used for the diagnosis of chronic proximal hamstring tendinopathy.

PubMed

Cacchio, Angelo; Borra, Fabrizio; Severini, Gabriele; Foglia, Andrea; Musarra, Frank; Taddio, Nicola; De Paulis, Fosco

2012-09-01

The clinical assessment of chronic proximal hamstring tendinopathy (PHT) in athletes is a challenge to sports medicine. To be able to compare the results of research and treatments, the methods used to diagnose and evaluate PHT must be clearly defined and reproducible. To assess the reliability and validity of three pain provocation tests used for the diagnosis of PHT. Ninety-two athletes with (N=46) and without (N=46) PHT were examined by one physician and two physiotherapists, who were trained in the examination techniques before the study. The examiners were blinded to the symptoms and identity of the athletes. The three pain provocation tests examined were the Puranen-Orava, bent-knee stretch and modified bent-knee stretch tests. Intraclass correlation coefficients (ICCs) based on the repeated measures analysis of variance were used to analyse the intraexaminer and interexaminer reliability, while sensitivity, specificity, predictive values and likelihood ratios were used to determine the validity of the three tests. The ICC values in all three tests revealed a high correlation (range 0.82 to 0.88) for the interexaminer reliability and a high-to-very high correlation (range 0.87 to 0.93) for the intraexaminer reliability. All three tests displayed a moderate-to-high validity, with the highest degree of validity being yielded by the modified bent-knee stretch test. All three pain provocation tests proved to be of potential value in assessing chronic PHT in athletes. However, we recommend that they be used in conjunction with other objective measures, such as MRI.
Reliability and validity of the McDonald Play Inventory.

PubMed

McDonald, Ann E; Vigen, Cheryl

2012-01-01

This study examined the ability of a two-part self-report instrument, the McDonald Play Inventory, to reliably and validly measure the play activities and play styles of 7- to 11-yr-old children and to discriminate between the play of neurotypical children and children with known learning and developmental disabilities. A total of 124 children ages 7-11 recruited from a sample of convenience and a subsample of 17 parents participated in this study. Reliability estimates yielded moderate correlations for internal consistency, total test intercorrelations, and test-retest reliability. Validity estimates were established for content and construct validity. The results suggest that a self-report instrument yields reliable and valid measures of a child's perceived play performance and discriminates between the play of children with and without disabilities. Copyright © 2012 by the American Occupational Therapy Association, Inc.
The reliability and validity of fatigue measures during multiple-sprint work: an issue revisited.

PubMed

Glaister, Mark; Howatson, Glyn; Pattison, John R; McInnes, Gill

2008-09-01

The ability to repeatedly produce a high-power output or sprint speed is a key fitness component of most field and court sports. The aim of this study was to evaluate the validity and reliability of eight different approaches to quantify this parameter in tests of multiple-sprint performance. Ten physically active men completed two trials of each of two multiple-sprint running protocols with contrasting recovery periods. Protocol 1 consisted of 12 x 30-m sprints repeated every 35 seconds; protocol 2 consisted of 12 x 30-m sprints repeated every 65 seconds. All testing was performed in an indoor sports facility, and sprint times were recorded using twin-beam photocells. All but one of the formulae showed good construct validity, as evidenced by similar within-protocol fatigue scores. However, the assumptions on which many of the formulae were based, combined with poor or inconsistent test-retest reliability (coefficient of variation range: 0.8-145.7%; intraclass correlation coefficient range: 0.09-0.75), suggested many problems regarding logical validity. In line with previous research, the results support the percentage decrement calculation as the most valid and reliable method of quantifying fatigue in tests of multiple-sprint performance.
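The abstract names the percentage decrement calculation without stating it. A minimal sketch follows, assuming the commonly cited form: fatigue (%) = 100 × (total sprint time / ideal sprint time) − 100, where the ideal time is the fastest single sprint multiplied by the number of sprints. The function name `percentage_decrement` and the sprint times are hypothetical and are not taken from the study.

```python
def percentage_decrement(sprint_times):
    """Percentage decrement score for a series of sprint times in seconds."""
    times = [float(t) for t in sprint_times]
    if not times:
        raise ValueError("at least one sprint time is required")
    # Ideal performance: every sprint run at the fastest recorded time.
    ideal_total = min(times) * len(times)
    return 100.0 * (sum(times) / ideal_total) - 100.0

# Hypothetical 30-m sprint times (s), for illustration only.
times = [4.0, 4.2, 4.4, 4.6]
print(round(percentage_decrement(times), 2))  # roughly 7.5
```

A score of 0 means every sprint matched the fastest one. Larger values indicate a greater drop-off across the series.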
Cross-Validation of easyCBM Reading Cut Scores in Washington: 2009-2010. Technical Report #1109

ERIC Educational Resources Information Center

Irvin, P. Shawn; Park, Bitnara Jasmine; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

2011-01-01

This technical report presents results from a cross-validation study designed to identify optimal cut scores when using easyCBM® reading tests in Washington state. The cross-validation study analyzes data from the 2009-2010 academic year for easyCBM® reading measures. A sample of approximately 900 students per grade, randomly split into two…
Prefabricated Roof Beams for Hardened Shelters

DTIC Science & Technology

1993-08-01

beam with a composite concrete slab. Based on the results of the concept evaluation, a test program was designed and conducted to validate the steel...ultimate strength. The results of these tests showed that the design procedure accurately predicts the response of the steel-confined concrete composite...BENDING OF EXTERNALLY REINFORCED CONCRETE BEAMS; TABLE 9. SINGLE POINT LOAD BEAM TEST RESULTS
Validation to Portuguese of the Scale of Student Satisfaction and Self-Confidence in Learning.

PubMed

Almeida, Rodrigo Guimarães dos Santos; Mazzo, Alessandra; Martins, José Carlos Amado; Baptista, Rui Carlos Negrão; Girão, Fernanda Berchelli; Mendes, Isabel Amélia Costa

2015-01-01

Translate and validate to Portuguese the Scale of Student Satisfaction and Self-Confidence in Learning. Methodological translation and validation study of a research tool. After all steps of the translation process were completed, the event III Workshop Brazil - Portugal: Care Delivery to Critical Patients, promoted by one Brazilian and one Portuguese teaching institution, was created for the validation process. A total of 103 nurses participated. As to the validity and reliability of the scale, the correlation pattern between the variables, the sampling adequacy test (Kaiser-Meyer-Olkin) and the sphericity test (Bartlett) showed good results. In the exploratory factor analysis (Varimax), item 9 behaved better in factor 1 (Satisfaction) than in factor 2 (Self-confidence in learning). The internal consistency (Cronbach's alpha) showed coefficients of 0.86 for factor 1 with six items and 0.77 for factor 2 with seven items. In Portuguese this tool was called: Escala de Satisfação de Estudantes e Autoconfiança na Aprendizagem. The results showed good psychometric properties and good potential for use. The sample size and specificity are limitations of this study, but future studies will contribute to consolidating the validity of the scale and strengthening its potential use.
Evaluating the validity and reliability of the V-scale instrument (Turkish version) used to determine nurses' attitudes towards vital sign monitoring.

PubMed

Ertuğ, Nurcan

2018-06-01

The aim of this study was to determine the validity and reliability of the Turkish version of the V-scale, which measures nurses' attitudes towards vital signs monitoring in the detection of clinical deterioration. This validity and reliability study was conducted at a tertiary hospital in Ankara, Turkey, in 2016. A total of 169 ward nurses participated in the study. Exploratory factor analysis, Cronbach's alpha coefficient, and the intraclass correlation coefficient were used to determine the validity and reliability of the scale. A 5-factor, 16-item scale explained 60.823% of the total variance according to the validity analysis. Our version matched the original scale in terms of the number of items and factor structure. Cronbach's alpha coefficient of the Turkish version of the V-scale was 0.764. The test-retest reliability results were 0.855 for the overall intraclass correlation coefficient, and the t-test result was P > 0.05. The V-scale is a reliable and valid instrument to measure Turkish nurses' attitudes towards vital signs monitoring in the detection of clinical deterioration. © 2018 John Wiley & Sons Australia, Ltd.
Validation of the FASH (Functional Assessment Scale for Acute Hamstring Injuries) questionnaire for German-speaking football players.

PubMed

Lohrer, Heinz; Nauck, Tanja; Korakakis, Vasileios; Malliaropoulos, Nikos

2016-10-24

The FASH (Functional Assessment Scale for Acute Hamstring Injuries) questionnaire has been recently developed as a disease-specific self-administered questionnaire for use in the Greek, English, and German languages. Its psychometric qualities (validity and reliability) were tested only in Greek-speaking patients, mainly track and field athletes. As hamstring injuries represent the most common football injury, we tested the validity and reliability of the FASH-G (G = German version) questionnaire in German-speaking footballers suffering from acute hamstring injuries. The FASH-G questionnaire was tested for reliability and validity in 16 footballers with hamstring injuries (patient group), 77 asymptomatic footballers (healthy group), and 19 field hockey players (at-risk group). Known-group validity was tested by comparing the total FASH-G scores of the injured and non-injured groups. Reliability of the FASH-G questionnaire was analysed in 18 asymptomatic footballers using the intra-class correlation coefficient. Known-group validity was demonstrated by significant differences between injured and non-injured participants (p < 0.001). The FASH-G exhibited very good test-retest reliability (intra-class correlation coefficient = 0.982, p < 0.001). Internal consistency was excellent (α = 0.938). Compared with the results presented in the original publication, no statistical differences were found for healthy athletes (p = 0.257), but the patient and at-risk groups showed scoring differences (p = 0.040 and <0.001, respectively). The FASH-G is a valid and reliable instrument to assess and determine the severity of hamstring injuries in German footballers.
A Psychometric Study of the Bayley Scales of Infant and Toddler Development in Persian Language Children

PubMed Central

AZARI, Nadia; SOLEIMANI, Farin; VAMEGHI, Roshanak; SAJEDI, Firoozeh; SHAHSHAHANI, Soheila; KARIMI, Hossein; KRASKIAN, Adis; SHAHROKHI, Amin; TEYMOURI, Robab; GHARIB, Masoud

2017-01-01

Objective: The Bayley Scales of Infant and Toddler Development is a well-known diagnostic developmental assessment tool for children aged 1–42 months. Our aim was to investigate the validity and reliability of this scale in Persian-speaking children. Materials & Methods: The method was descriptive-analytic. Translation, back-translation and cultural adaptation were performed. Content and face validity of the translated scale were determined by experts’ opinions. Overall, 403 children aged 1 to 42 months were recruited from health centers of Tehran during 2013-2014 for developmental assessment in cognitive, communicative (receptive and expressive) and motor (fine and gross) domains. Reliability of the scale was calculated through three methods: internal consistency using Cronbach’s alpha coefficient, test-retest, and inter-rater methods. Construct validity was assessed using factor analysis and comparison of mean scores. Results: Cultural and linguistic changes were made in items of all domains, especially on the communication subscale. Content and face validity of the test were approved by experts’ opinions. Cronbach’s alpha coefficient was above 0.74 in all domains. Pearson correlation coefficients in the various domains were ≥0.982 in the test-retest method and ≥0.993 in the inter-rater method. Construct validity of the test was approved by factor analysis. Moreover, statistically significant differences were observed between the mean scores of different age groups, which confirms the validity of the test. Conclusion: The Bayley Scales of Infant and Toddler Development is a valid and reliable tool for child developmental assessment in Persian language children. PMID:28277556
Measurement uncertainty analysis techniques applied to PV performance measurements

NASA Astrophysics Data System (ADS)

Wells, C.

1992-10-01

The purpose of this presentation is to provide a brief introduction to measurement uncertainty analysis, outline how it is done, and illustrate uncertainty analysis with examples drawn from the PV field, with particular emphasis on its use in PV performance measurements. The uncertainty information we know and state concerning a PV performance measurement or a module test result determines, to a significant extent, the value and quality of that result. What is measurement uncertainty analysis? It is an outgrowth of what has commonly been called error analysis. But uncertainty analysis, a more recent development, gives greater insight into measurement processes and tests, experiments, or calibration results. Uncertainty analysis gives us an estimate of the interval about a measured value or an experiment's final result within which we believe the true value of that quantity will lie. Why should we take the time to perform an uncertainty analysis? A rigorous measurement uncertainty analysis increases the credibility and value of research results; allows comparisons of results from different labs; helps improve experiment design and identifies where changes are needed to achieve stated objectives (through use of the pre-test analysis); plays a significant role in validating measurements and experimental results, and in demonstrating (through the post-test analysis) that valid data have been acquired; reduces the risk of making erroneous decisions; and demonstrates that quality assurance and quality control measures have been accomplished. Valid data are defined as data having known and documented paths of origin (including theory), measurements, traceability to measurement standards, computations, and uncertainty analysis of results.
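The idea of an interval about a measured value can be sketched numerically. The example below uses one common convention, which is an assumption here and not necessarily the method of the presentation: independent standard uncertainties are combined by root-sum-square, and the result is multiplied by a coverage factor k = 2 to give an expanded interval. The function names and the module-power figures are hypothetical.

```python
import math


def combined_standard_uncertainty(components):
    """Root-sum-square of independent standard uncertainties (same units)."""
    return math.sqrt(sum(u * u for u in components))


def expanded_interval(value, components, k=2.0):
    """Interval value +/- k * u_c; k = 2 is an assumed coverage factor."""
    u_c = combined_standard_uncertainty(components)
    return value - k * u_c, value + k * u_c


# Hypothetical PV module test: measured power 100.0 W with two independent
# uncertainty contributions of 0.3 W and 0.4 W (illustrative numbers only).
low, high = expanded_interval(100.0, [0.3, 0.4])
print(f"true value believed to lie in [{low:.1f}, {high:.1f}] W")
```

Correlated contributions would need covariance terms, which this sketch deliberately leaves out.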
Measurement Techniques and Instruments Suitable for Life-prediction Testing of Photovoltaic Arrays

NASA Technical Reports Server (NTRS)

Noel, G. T.; Wood, V. E.; Mcginniss, V. D.; Hassell, J. A.; Richard, N. A.; Gaines, G. B.; Carmichael, D. C.

1979-01-01

The validation of a 20-year service life for low-cost photovoltaic arrays is a critical requirement in the Low-Cost Solar Array (LSA) Project. The validation is accomplished through accelerated life-prediction tests. A two-phase study was conducted to address the needs before such tests are carried out. The results and recommended techniques from the Phase 1 investigation are summarized in the appendix. Phase 2 of the study is covered in this report and consisted of experimental evaluations of three techniques selected from those recommended as a result of the Phase 1 findings. The three techniques evaluated were specular and nonspecular optical reflectometry, chemiluminescence measurements, and electric current noise measurements.
A simple test of choice stepping reaction time for assessing fall risk in people with multiple sclerosis.

PubMed

Tijsma, Mylou; Vister, Eva; Hoang, Phu; Lord, Stephen R

2017-03-01

Purpose: To determine (a) the discriminant validity for established fall risk factors and (b) the predictive validity for falls of a simple test of choice stepping reaction time (CSRT) in people with multiple sclerosis (MS). Method: People with MS (n = 210, aged 21-74 years) performed the CSRT, sensorimotor, balance and neuropsychological tests in a single session. They were then followed up for falls using monthly fall diaries for 6 months. Results: The CSRT test had excellent discriminant validity with respect to established fall risk factors. Frequent fallers (≥3 falls) performed significantly worse in the CSRT test than non-frequent fallers (0-2 falls), with the odds of suffering frequent falls increasing 69% with each SD increase in CSRT (OR = 1.69, 95% CI: 1.27-2.26, p < 0.001). In regression analysis, CSRT was best explained by sway, time to complete the 9-Hole Peg test, knee extension strength of the weaker leg, proprioception and the time to complete the Trails B test (multiple R² = 0.449, p < 0.001). Conclusions: A simple low-tech CSRT test has excellent discriminative and predictive validity in relation to falls in people with MS. This test may prove useful in documenting longitudinal changes in fall risk in relation to MS disease progression and effects of interventions. Implications for rehabilitation: Good choice stepping reaction time (CSRT) is required for maintaining balance. A simple low-tech CSRT test has excellent discriminative and predictive validity in relation to falls in people with MS. This test may prove useful in documenting longitudinal changes in fall risk in relation to MS disease progression and effects of interventions.
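The statement that the odds increase 69% per SD is simply a restatement of OR = 1.69. The short sketch below shows that arithmetic and, assuming the usual logistic-regression form in which odds multiply for each additional SD (an assumption about the model, not something stated in the abstract), what a two-SD difference would imply. The function names are hypothetical.

```python
def percent_increase_in_odds(odds_ratio):
    """Percentage change in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_for_change(odds_ratio_per_sd, n_sd):
    """Odds ratio for a change of n_sd standard deviations, assuming a
    log-linear (logistic) relationship so that odds ratios multiply."""
    return odds_ratio_per_sd ** n_sd


or_per_sd = 1.69  # reported in the abstract
print(f"{percent_increase_in_odds(or_per_sd):.0f}% higher odds per SD")
print(f"OR for a 2-SD slower CSRT: {odds_ratio_for_change(or_per_sd, 2):.2f}")
```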
Soil moisture mapping using Sentinel 1 images: the proposed approach and its preliminary validation carried out in view of an operational product

NASA Astrophysics Data System (ADS)

Paloscia, S.; Pettinato, S.; Santi, E.; Pierdicca, N.; Pulvirenti, L.; Notarnicola, C.; Pace, G.; Reppucci, A.

2011-11-01

The main objective of this research is to develop, test and validate a soil moisture (SMC) algorithm for the GMES Sentinel-1 characteristics, within the framework of an ESA project. The SMC product, to be generated from Sentinel-1 data, requires an algorithm able to process operationally in near-real-time and deliver the product to the GMES services within 3 hours from observations. Two complementary approaches have been proposed: an Artificial Neural Network (ANN), which represents the best compromise between retrieval accuracy and processing time and thus allows compliance with the timeliness requirements; and a Bayesian multi-temporal approach, which takes advantage of the frequent revisit time achieved by Sentinel-1 to increase the retrieval accuracy, especially where little ancillary data are available, at the cost of computational efficiency. The algorithm was validated in several test areas in Italy, the US and Australia, and finally in Spain with a 'blind' validation. The multi-temporal Bayesian algorithm was validated in Central Italy. The validation results are in all cases very much in line with the requirements. However, the blind validation results were penalized by the availability of only VV polarization SAR images and MODIS low-resolution NDVI, although the RMS is only slightly above 4%.
49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

Code of Federal Regulations, 2013 CFR

2013-10-01

... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2013-10-01 2013-10-01 false What is validity testing, and are laboratories...
49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

Code of Federal Regulations, 2011 CFR

2011-10-01

... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2011-10-01 2011-10-01 false What is validity testing, and are laboratories...
49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

Code of Federal Regulations, 2010 CFR

2010-10-01

... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2010-10-01 2010-10-01 false What is validity testing, and are laboratories...
49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

Code of Federal Regulations, 2012 CFR

2012-10-01

... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2012-10-01 2012-10-01 false What is validity testing, and are laboratories...
49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

Code of Federal Regulations, 2014 CFR

2014-10-01

... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2014-10-01 2014-10-01 false What is validity testing, and are laboratories...
Phase 1 Validation Testing and Simulation for the WEC-Sim Open Source Code

NASA Astrophysics Data System (ADS)

Ruehl, K.; Michelen, C.; Gunawan, B.; Bosma, B.; Simmons, A.; Lomonaco, P.

2015-12-01

WEC-Sim is an open source code for modeling wave energy converter performance in operational waves, developed by Sandia and NREL and funded by the US DOE. The code is a time-domain modeling tool developed in MATLAB/SIMULINK using the multibody dynamics solver SimMechanics, and solves the WEC's governing equations of motion using the Cummins time-domain impulse response formulation in 6 degrees of freedom. The WEC-Sim code has undergone verification through code-to-code comparisons; however, validation of the code has been limited to publicly available experimental data sets. While these data sets provide preliminary code validation, the experimental tests were not explicitly designed for code validation, and as a result are limited in their ability to validate the full functionality of the WEC-Sim code. Therefore, dedicated physical model tests for WEC-Sim validation have been performed. This presentation provides an overview of the WEC-Sim validation experimental wave tank tests performed at Oregon State University's Directional Wave Basin at the Hinsdale Wave Research Laboratory. Phase 1 of experimental testing was focused on device characterization and completed in Fall 2015. Phase 2 is focused on WEC performance and scheduled for Winter 2015/2016. These experimental tests were designed explicitly to validate the performance of the WEC-Sim code and its new feature additions. Upon completion, the WEC-Sim validation data set will be made publicly available to the wave energy community. For the physical model test, a controllable model of a floating wave energy converter has been designed and constructed. The instrumentation includes state-of-the-art devices to measure pressure fields and motions in 6 DOF, as well as multi-axial load cells, torque transducers, position transducers, and encoders. The model also incorporates a fully programmable Power-Take-Off system which can be used to generate or absorb wave energy.
Numerical simulations of the experiments using WEC-Sim will be presented. These simulations highlight the code features included in the latest release of WEC-Sim (v1.2), including: wave directionality, nonlinear hydrostatics and hydrodynamics, user-defined wave elevation time-series, state space radiation, and WEC-Sim compatibility with BEMIO (open source AQWA/WAMIT/NEMOH coefficient parser).
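For readers unfamiliar with the Cummins time-domain impulse response formulation mentioned in this record, its commonly cited textbook form is given below. The symbols are assumptions introduced here for illustration (M: mass matrix, A_inf: infinite-frequency added mass, K: radiation impulse response kernel, C: hydrostatic stiffness, F_exc and F_PTO: wave excitation and power-take-off forces); the actual WEC-Sim implementation may include further terms such as viscous drag and mooring loads.

```latex
\left(M + A_{\infty}\right)\ddot{x}(t)
  + \int_{0}^{t} K(t-\tau)\,\dot{x}(\tau)\,d\tau
  + C\,x(t)
  = F_{\mathrm{exc}}(t) + F_{\mathrm{PTO}}(t)
```

The convolution integral carries the fluid-memory (radiation) effect. The "state space radiation" feature named in the abstract refers to replacing that convolution with a state-space approximation.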

The analytical validation of the Oncotype DX Recurrence Score assay

PubMed Central

Baehner, Frederick L

2016-01-01

In vitro diagnostic multivariate index assays are highly complex molecular assays that can provide clinically actionable information regarding the underlying tumour biology and facilitate personalised treatment. These assays are only useful in clinical practice if all of the following are established: analytical validation (i.e., how accurately/reliably the assay measures the molecular characteristics), clinical validation (i.e., how consistently/accurately the test detects/predicts the outcomes of interest), and clinical utility (i.e., how likely the test is to significantly improve patient outcomes). In considering the use of these assays, clinicians often focus primarily on the clinical validity/utility; however, the analytical validity of an assay (e.g., its accuracy, reproducibility, and standardisation) should also be evaluated and carefully considered. This review focuses on the rigorous analytical validation and performance of the Oncotype DX® Breast Cancer Assay, which is performed at the Central Clinical Reference Laboratory of Genomic Health, Inc. The assay process includes tumour tissue enrichment (if needed), RNA extraction, gene expression quantitation (using a gene panel consisting of 16 cancer genes plus 5 reference genes and quantitative real-time RT-PCR), and an automated computer algorithm to produce a Recurrence Score® result (scale: 0–100). This review presents evidence showing that the Recurrence Score result reported for each patient falls within a tight clinically relevant confidence interval. Specifically, the review discusses how the development of the assay was designed to optimise assay performance, presents data supporting its analytical validity, and describes the quality control and assurance programmes that ensure optimal test performance over time. PMID:27729940
The analytical validation of the Oncotype DX Recurrence Score assay.

PubMed

Baehner, Frederick L

2016-01-01

In vitro diagnostic multivariate index assays are highly complex molecular assays that can provide clinically actionable information regarding the underlying tumour biology and facilitate personalised treatment. These assays are only useful in clinical practice if all of the following are established: analytical validation (i.e., how accurately/reliably the assay measures the molecular characteristics), clinical validation (i.e., how consistently/accurately the test detects/predicts the outcomes of interest), and clinical utility (i.e., how likely the test is to significantly improve patient outcomes). In considering the use of these assays, clinicians often focus primarily on the clinical validity/utility; however, the analytical validity of an assay (e.g., its accuracy, reproducibility, and standardisation) should also be evaluated and carefully considered. This review focuses on the rigorous analytical validation and performance of the Oncotype DX® Breast Cancer Assay, which is performed at the Central Clinical Reference Laboratory of Genomic Health, Inc. The assay process includes tumour tissue enrichment (if needed), RNA extraction, gene expression quantitation (using a gene panel consisting of 16 cancer genes plus 5 reference genes and quantitative real-time RT-PCR), and an automated computer algorithm to produce a Recurrence Score® result (scale: 0-100). This review presents evidence showing that the Recurrence Score result reported for each patient falls within a tight clinically relevant confidence interval. Specifically, the review discusses how the development of the assay was designed to optimise assay performance, presents data supporting its analytical validity, and describes the quality control and assurance programmes that ensure optimal test performance over time.
Examining the Predictive Validity of NIH Peer Review Scores

PubMed Central

Lindner, Mark D.; Nakamura, Richard K.

2015-01-01

The predictive validity of peer review at the National Institutes of Health (NIH) has not yet been demonstrated empirically. It might be assumed that the most efficient and expedient test of the predictive validity of NIH peer review would be an examination of the correlation between percentile scores from peer review and bibliometric indices of the publications produced from funded projects. The present study used a large dataset to examine the rationale for such a study, to determine if it would satisfy the requirements for a test of predictive validity. The results show significant restriction of range in the applications selected for funding. Furthermore, those few applications that are funded with slightly worse peer review scores are not selected at random or representative of other applications in the same range. The funding institutes also negotiate with applicants to address issues identified during peer review. Therefore, the peer review scores assigned to the submitted applications, especially for those few funded applications with slightly worse peer review scores, do not reflect the changed and improved projects that are eventually funded. In addition, citation metrics by themselves are not valid or appropriate measures of scientific impact. The use of bibliometric indices on their own to measure scientific impact would likely increase the inefficiencies and problems with replicability already largely attributed to the current over-emphasis on bibliometric indices. Therefore, retrospective analyses of the correlation between percentile scores from peer review and bibliometric indices of the publications resulting from funded grant applications are not valid tests of the predictive validity of peer review at the NIH. PMID:26039440
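The restriction-of-range problem described in this record can be illustrated with a small simulation. The sketch below is hypothetical in every number: it assumes a true correlation of 0.5 between a review score and a later outcome, "funds" only the top 20% of scores, and compares the correlation in the full pool with the correlation among the funded subset. It illustrates the statistical effect only and is not a reconstruction of the study's analysis.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(rho=0.5, n=20000, funded_fraction=0.2, seed=0):
    """Return (r in all applications, r in the funded top fraction)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        score = rng.gauss(0, 1)                       # hypothetical review score
        outcome = rho * score + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((score, outcome))
    pairs.sort(key=lambda p: p[0], reverse=True)      # best scores first
    funded = pairs[: int(n * funded_fraction)]
    return pearson_r(*zip(*pairs)), pearson_r(*zip(*funded))


r_all, r_funded = simulate()
print(f"all applications: r = {r_all:.2f}; funded only: r = {r_funded:.2f}")
```

Under these assumptions the correlation observed among funded applications is noticeably weaker than in the full pool, even though the underlying relationship is unchanged.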
JaCVAM-organized international validation study of the in vivo rodent alkaline comet assay for the detection of genotoxic carcinogens: I. Summary of pre-validation study results.

PubMed

Uno, Yoshifumi; Kojima, Hajime; Omori, Takashi; Corvi, Raffaella; Honma, Masamistu; Schechtman, Leonard M; Tice, Raymond R; Burlinson, Brian; Escobar, Patricia A; Kraynak, Andrew R; Nakagawa, Yuzuki; Nakajima, Madoka; Pant, Kamala; Asano, Norihide; Lovell, David; Morita, Takeshi; Ohno, Yasuo; Hayashi, Makoto

2015-07-01

The in vivo rodent alkaline comet assay (comet assay) is used internationally to investigate the in vivo genotoxic potential of test chemicals. This assay, however, has not previously been formally validated. The Japanese Center for the Validation of Alternative Methods (JaCVAM), with the cooperation of the U.S. NTP Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM)/the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM), the European Centre for the Validation of Alternative Methods (ECVAM), and the Japanese Environmental Mutagen Society/Mammalian Mutagenesis Study Group (JEMS/MMS), organized an international validation study to evaluate the reliability and relevance of the assay for identifying genotoxic carcinogens, using liver and stomach as target organs. The ultimate goal of this validation effort was to establish an Organisation for Economic Co-operation and Development (OECD) test guideline. The purpose of the pre-validation studies (i.e., Phase 1 through 3), conducted in four or five laboratories with extensive comet assay experience, was to optimize the protocol to be used during the definitive validation study. Copyright © 2015 Elsevier B.V. All rights reserved.
Are awareness questionnaires valid? Investigating the use of posttest questionnaires for assessing awareness in implicit memory tests.

PubMed

Barnhardt, Terrence M; Geraci, Lisa

2008-01-01

Two experiments--one employing a perceptual implicit memory test and the other a conceptual implicit memory test--investigated the validity of posttest questionnaires for determining the incidence of awareness in implicit memory tests. In both experiments, a condition in which none of the studied words could be used as test responses (i.e., the none-studied condition) was compared with a standard implicit test condition. Results showed that reports of awareness on the posttest questionnaire were much less frequent in the none-studied condition than in the standard condition. This was especially true after deep processing at study. In both experiments, 83% of the participants in the none-studied condition stated they were unaware even though there were strong demands for claiming awareness. Although there was a small bias in the questionnaire (i.e., 17% of the participants in the none-studied condition stated they were aware), overall, there was strong support for the validity of awareness questionnaires.
Pitfalls in efficacy testing--how important is the validation of neutralization of chlorhexidine digluconate?

PubMed

Reichel, Mirja; Heisig, Peter; Kampf, Günter

2008-12-02

Effective neutralization of active agents is essential to obtain valid efficacy results, especially when non-volatile active agents like chlorhexidine digluconate (CHG) are tested. The aim of this study was to determine an effective and non-toxic neutralizing mixture for a propan-1-ol solution containing 2% CHG. Experiments were carried out according to ASTM E 1054-02. The neutralization capacity was tested separately with five challenge microorganisms in suspension, and with a rayon swab carrier. Either 0.5 mL of the antiseptic solution (suspension test) or a swab saturated with the antiseptic solution (carrier test) was added to tryptic soy broth containing neutralizing agents. After the samples were mixed, aliquots were spread immediately and after 3 h of storage at 2-8 °C onto tryptic soy agar containing a neutralizing mixture. The neutralizer was, however, not consistently effective in the suspension test. Immediate spread yielded a valid neutralization with Staphylococcus aureus, Staphylococcus epidermidis and Corynebacterium jeikeium but not with Micrococcus luteus (p < 0.001) and Candida albicans (p < 0.001). A 3-h storage period of the neutralized active agents in suspension resulted in significant carry-over activity of CHG additionally against Staphylococcus epidermidis (p < 0.001) and Corynebacterium jeikeium (p = 0.044). In the carrier test, the neutralizing mixture was found to be effective and non-toxic to all challenge microorganisms when spread immediately. However, after 3 h storage of the neutralized active agents, significant carry-over activity of CHG against Micrococcus luteus (p = 0.004; Tukey HSD) was observed. Without effective neutralization in the sampling fluid, non-volatile active ingredients will continue to reduce the number of surviving microorganisms after antiseptic treatment even if the sampling fluid is kept cold straight after testing. This can result in false-positive antiseptic efficacy data. Attention should be paid during the neutralization validation process to the amount of antiseptic solution, the storage time and the choice of appropriate and sensitive microorganisms.
Simulation of Propagation of Compartment Fire on Building Facades

NASA Astrophysics Data System (ADS)

Simion, A.; Dragne, H.; Stoica, D.; Anghel, I.

2018-06-01

The façade fire simulation of buildings was carried out with the Pyrosim numerical fire modeling program, following the implementation of a fire scenario in the program. The scenario, implemented by researchers from the INCERC Fire Safety Research and Testing Laboratory, complied with the requirements of BS 8414. The results of the computational run led to the visual validation of effluents at different time points from the start of the thermal load burning, as well as validation in terms of recorded temperatures. The results obtained are considered reasonable, and the test is regarded as fully validated with respect to the implementation of the fire scenario, the correct development of the effluents, and the temperature values [1].
Ada Compiler Validation Summary Report. Certificate Number: 900228W1.11003, Verdix Corporation VADS Sun3 SunOS, VAda-110-1313, Version 6.0 Sun3/280 = Sun3/280

DTIC Science & Technology

1991-01-17

Number: 90-03-08-VRX. See Section 3.1 for any additional information about the testing environment. As a result of this validation effort, Validation... AVF Control Number: AVF-VSR-365.0191, 17 January 1991, 90-03-08-VRX. Ada COMPILER VALIDATION SUMMARY REPORT: Certificate Number: 900228W1.11003
32 CFR 634.35 - Chemical testing policies and procedures.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 32 National Defense 4 2010-07-01 2010-07-01 true Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...
32 CFR 634.35 - Chemical testing policies and procedures.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 32 National Defense 4 2011-07-01 2011-07-01 false Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...
32 CFR 634.35 - Chemical testing policies and procedures.

Code of Federal Regulations, 2012 CFR

2012-07-01

... 32 National Defense 4 2012-07-01 2011-07-01 true Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...
32 CFR 634.35 - Chemical testing policies and procedures.

Code of Federal Regulations, 2014 CFR

2014-07-01

... 32 National Defense 4 2014-07-01 2013-07-01 true Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...
32 CFR 634.35 - Chemical testing policies and procedures.

Code of Federal Regulations, 2013 CFR

2013-07-01

... 32 National Defense 4 2013-07-01 2013-07-01 false Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...
Continuous coaxial cable sensors for monitoring of RC structures with electrical time domain reflectometry

NASA Astrophysics Data System (ADS)

Chen, Genda; Mu, Huimin; Pommerenke, David; Drewniak, James L.

2003-08-01

This study was aimed at developing and validating a new type of coaxial cable sensor that can be used to detect cracks or measure strains in reinforced concrete (RC) structures. The new sensors were designed based on the change in outer conductor configuration under strain, in contrast to the geometry-based design of conventional coaxial cable sensors. Both numerical simulations and calibration tests with strain gauges were conducted on a specific design of the proposed cables to study the cables' sensitivity. Four designs of the proposed type of sensor were then mounted near the surface of six 3-foot-long RC beams, which were tested in bending to further validate the cables' sensitivity in concrete members. The calibration test results generally agree with the numerical simulations. They showed that the proposed sensors are 10 to 50 times more sensitive than conventional cable sensors. The test results of the beams not only validate the sensitivity of the new sensors but also indicate a good correlation with the measured crack width.
Exploring rationality in schizophrenia

PubMed Central

Mortensen, Erik Lykke; Owen, Gareth; Nordgaard, Julie; Jansson, Lennart; Sæbye, Ditte; Flensborg-Madsen, Trine; Parnas, Josef

2015-01-01

Background: Empirical studies of rationality (syllogisms) in patients with schizophrenia have obtained different results. One study found that patients reason more logically if the syllogism is presented through an unusual content. Aims: To explore syllogism-based rationality in schizophrenia. Method: Thirty-eight first-admitted patients with schizophrenia and 38 healthy controls solved 29 syllogisms that varied in presentation content (ordinary v. unusual) and validity (valid v. invalid). Statistical tests were made of unadjusted and adjusted group differences in models adjusting for intelligence and neuropsychological test performance. Results: Controls outperformed patients on all syllogism types, but the difference between the two groups was only significant for valid syllogisms presented with unusual content. However, when adjusting for intelligence and neuropsychological test performance, all group differences became non-significant. Conclusions: When taking intelligence and neuropsychological performance into account, patients with schizophrenia and controls perform similarly on syllogism tests of rationality. Declaration of interest: None. Copyright and usage: © The Royal College of Psychiatrists 2015. This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) licence. PMID:27703730
Validity and reliability of the Short Physical Performance Battery (SPPB)

PubMed Central

Curcio, Carmen-Lucía; Alvarado, Beatriz; Zunzunegui, María Victoria; Guralnik, Jack

2013-01-01

Objectives: To assess the validity (convergent and construct) and reliability of the Short Physical Performance Battery (SPPB) among non-disabled adults between 65 and 74 years of age residing in the Andes Mountains of Colombia. Methods: Design: Validation study. Participants: 150 subjects aged 65 to 74 years recruited from elderly associations (day-centers) in Manizales, Colombia. Measurements: The SPPB tests of balance, including time to walk 4 meters and time required to stand from a chair 5 times, were administered to all participants. Reliability was analyzed with a 7-day interval between assessments and use of repeated ANOVA testing. Construct validity was assessed using factor analysis and by testing the relationship between SPPB and depressive symptoms, cognitive function, and self-rated health (SRH), while the concurrent validity was measured through relationships with mobility limitations and disability in Activities of Daily Living (ADL). ANOVA tests were used to establish these associations. Results: Test-retest reliability of the SPPB was high: 0.87 (CI95%: 0.77-0.96). A one-factor solution was found with three SPPB tests. SPPB was related to self-rated health, limitations in walking and climbing steps and to indicators of disability, as well as to cognitive function and depression. There was a graded decrease in the mean SPPB score with increasing disability and poor health. Conclusion: The Spanish version of SPPB is reliable and valid to assess physical performance among older adults from our region. Future studies should establish its clinical applications and explore usage in population studies. PMID:24892614
Large Engine Technology (LET) Short Haul Civil Tiltrotor Contingency Power Materials Knowledge and Lifing Methodologies

NASA Technical Reports Server (NTRS)

Spring, Samuel D.

2006-01-01

This report documents the results of an experimental program conducted on two advanced metallic alloy systems (Rene' 142 directionally solidified alloy (DS) and Rene' N6 single crystal alloy) and the characterization of two distinct internal state variable inelastic constitutive models. The long term objective of the study was to develop a computational life prediction methodology that can integrate the obtained material data. A specialized test matrix for characterizing advanced unified viscoplastic models was specified and conducted. This matrix included strain controlled tensile tests with intermittent relaxation tests with 2 hr hold times, constant stress creep tests, stepped creep tests, mixed creep and plasticity tests, cyclic temperature creep tests and tests in which temperature overloads were present to simulate actual operation conditions for validation of the models. The selected internal state variable models were shown to be capable of representing the material behavior exhibited by the experimental results; however, the program ended prior to final validation of the models.
Generalizability and Validity of a Mathematics Performance Assessment.

ERIC Educational Resources Information Center

Lane, Suzanne; And Others

1996-01-01

Evidence from test results of 3,604 sixth and seventh graders is provided for the generalizability and validity of the Quantitative Understanding: Amplifying Student Achievement and Reasoning (QUASAR) Cognitive Assessment Instrument, which is designed to measure program outcomes and growth in mathematics. (SLD)
Results of the Intelligence Test for Visually Impaired Children (ITVIC).

ERIC Educational Resources Information Center

Dekker, R.; And Others

1991-01-01

Statistical analyses of scores on subtests of the Intelligence Test for Visually Impaired Children were done for two groups of children, either with or without usable vision. Results suggest that the battery has differential factorial and predictive validity. (Author/DB)
The UCSF screening exam effectively screens cognitive and behavioral impairment in patients with ALS.

PubMed

Murphy, Jennifer; Ahmed, Fizaa; Lomen-Hoerth, Catherine

2015-03-01

The University of California San Francisco (UCSF) Screening Battery provides clinicians with a uniquely tailored tool to measure ALS patients' cognitive and behavioral changes, adjusting for dysarthria and hand weakness. The battery consists of the ALS-CBS ( 1 ), Written Fluency Test ( 2 ), and a new revision of the Frontal Behavior Inventory (FBI-ALS) ( 3 ). The validity of each component was tested by comparing results with a gold standard neuropsychological exam (GNE). Consensus criteria-based GNE diagnoses ( 4 ) were assigned (n = 24) and concurrent validity was tested for each screening exam component. Results showed that each of the four cognitive and behavioral screening test components were significantly associated with diagnoses confirmed by GNE. GNE diagnoses were significantly associated with FBI-ALS negative score, written S-words score, and ALS-CBS cognitive score. The total FBI-ALS score and C-words tests were less predictive of GNE-diagnosed impairment. In conclusion, the UCSF Cognitive Screening Battery demonstrates good external validity compared with GNE in this modest sample, encouraging its use in larger investigations. These data suggest that this battery may provide an effective screen to identify ALS patients who will then benefit from a full examination to confirm their diagnosis.

Comparison of human skin irritation patch test data with in vitro skin irritation assays and animal data.

PubMed

Jírová, Dagmar; Basketter, David; Liebsch, Manfred; Bendová, Hana; Kejlová, Kristina; Marriott, Marie; Kandárová, Helena

2010-02-01

Efforts to replace the rabbit skin irritation test have been underway for many years, encouraged by the EU Cosmetics Directive and REACH. Recently various in vitro tests have been developed, evaluated and validated. A key difficulty in confirming the validity of in vitro methods is that animal data are scarce and of limited utility for prediction of human effects, which adversely impacts their acceptance. This study examines whether in vivo or in vitro data most accurately predicted human effects. Using the 4-hr human patch test (HPT) we examined a number of chemicals whose EU classification of skin irritancy is known to be borderline, or where in vitro methods provided conflicting results. Of the 16 chemicals classified as irritants in the rabbit, only five substances were found to be significantly irritating to human skin. Concordance of the rabbit test with the 4-hr HPT was only 56%, whereas concordance of human epidermis models with human data was 76% (EpiDerm) and 70% (EPISKIN). The results confirm observations that rabbits overpredict skin effects in humans. Therefore, when validating in vitro methods, all available information, including human data, should be taken into account before making conclusions about their predictive capacity.
Validity and Reliability of the Italian Version of the Functioning Assessment Short Test (FAST) in Bipolar Disorder

PubMed Central

Moro, Maria Francesca; Colom, Francesc; Floris, Francesca; Pintus, Elisa; Pintus, Mirra; Contini, Francesca; Carta, Mauro Giovanni

2012-01-01

Background: Functioning Assessment Short Test (FAST) is a brief instrument designed to assess the main functioning problems experienced by psychiatric patients, specifically bipolar patients. It includes 24 items assessing impairment or disability in six domains of functioning: autonomy, occupational functioning, cognitive functioning, financial issues, interpersonal relationships and leisure time. The aim of this study is to measure the validity and reliability of the Italian version of this instrument. Methods: Twenty-four patients with DSM-IV TR bipolar disorder and 20 healthy controls were recruited and evaluated in three private clinics in Cagliari (Sardinia, Italy). The psychometric properties of FAST (feasibility, internal consistency, concurrent validity, discriminant validity (patients vs controls and euthymic patients vs manic and depressed), and test-retest reliability) were analyzed. Results: The internal consistency obtained was very high, with a Cronbach's alpha of 0.955. A highly significant negative correlation with GAF was obtained (r = -0.9; p < 0.001), pointing to a reasonable degree of concurrent validity. FAST showed good test-retest reliability between two independent evaluations one week apart (mean K = 0.73). The total FAST scores were lower in controls than in bipolar patients, and lower in euthymic patients than in depressed or manic patients. Conclusion: The Italian version of the FAST showed psychometric properties similar to those of the original version with regard to internal consistency and discriminant validity, and showed good test-retest reliability as measured by K statistics. PMID:22905035
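For readers unfamiliar with the internal-consistency figure quoted in the record above, the following is a minimal sketch of how Cronbach's alpha is commonly computed from item-level scores. The function name `cronbach_alpha` and the toy data are assumptions made for illustration; they are not taken from the study, whose item-level data are not given in the abstract.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses the standard formula: alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # regroup scores by item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical toy data (4 respondents, 3 items); NOT data from the FAST study.
toy_scores = [[1, 2, 2], [2, 3, 3], [3, 3, 4], [4, 5, 5]]
print(round(cronbach_alpha(toy_scores), 3))
```

Values close to 1 indicate that the items vary together, which is the sense in which the abstract describes an alpha of 0.955 as very high.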
Cost-Benefit Analysis for Alternatives to Aliphatic Isocyanate Polyurethanes

NASA Technical Reports Server (NTRS)

Lewis, Pattie

2007-01-01

NASA and Air Force Space Command (AFSPC) have similar missions and therefore similar facilities and structures in similar environments. The standard practice for protecting metallic substrates in atmospheric environments is the application of an applied coating system. The most common topcoats used in coating systems are polyurethanes that contain isocyanates. Isocyanates are classified as potential human carcinogens and are known to cause cancer in animals. The primary objective of this effort was to demonstrate and validate alternatives to aliphatic isocyanate polyurethanes resulting in one or more isocyanate-free coatings qualified for use at AFSPC and NASA installations participating in this project. This Cost-Benefit Analysis (CBA) quantifies the estimated capital and process costs of coating alternatives and cost savings relative to the current coatings. The estimates in this CBA are to be used for assessing the relative merits of the selected alternatives. The actual economic effects at any specific facility will depend on the alternative material or technology implemented, the number of actual applications converted, future workloads, and other factors. The participants initially considered eighteen (18) alternative coatings as described in the Potential Alternatives Report entitled Potential Alternatives Report for Validation of Alternatives to Aliphatic Isocyanate Polyurethanes, prepared by ITB. Of those, 8 alternatives were selected for testing in accordance with the Joint Test Protocol entitled Joint Test Protocol for Validation of Alternatives to Aliphatic Isocyanate Polyurethanes, and the Field Test Plan entitled Field Evaluations Test Plan for Validation of Alternatives to Aliphatic Isocyanate Polyurethanes, both of which were prepared by ITB.
A joint Test Report entitled Joint Test Report for Validation of Alternatives to Aliphatic Isocyanate Polyurethanes, prepared by ITB, documents the results of the laboratory and field testing, as well as any test modifications made during the execution of the testing. The coatings selected for evaluation in this CBA are shown in the table below. Only one control coating system is considered in this analysis. These coatings were either downselected for Phase II or performed well enough to be included in the Qualified Products List in the NASA technical standard NASA-STD-5008, Protective Coating of Carbon Steel, Stainless Steel, and Aluminum on Launch Structures, Facilities, and Ground Support Equipment.
Assuring the Quality of Test Results in the Field of Nuclear Techniques and Ionizing Radiation. The Practical Implementation of Section 5.9 of the EN ISO/IEC 17025 Standard

NASA Astrophysics Data System (ADS)

Cucu, Daniela; Woods, Mike

2008-08-01

The paper aims to present a practical approach for testing laboratories to ensure the quality of their test results. It is based on the experience gained in assessing a large number of testing laboratories, discussing with management and staff, reviewing results obtained in national and international PTs and ILCs and exchanging information in the EA laboratory committee. According to EN ISO/IEC 17025, an accredited laboratory has to implement a programme to ensure the quality of its test results for each measurand. Pre-analytical, analytical and post-analytical measures shall be applied in a systematic manner. They shall include both quality control and quality assurance measures. When designing the quality assurance programme a laboratory should consider pre-analytical activities (like personnel training, selection and validation of test methods, qualifying equipment), analytical activities ranging from sampling, sample preparation, instrumental analysis and post-analytical activities (like decoding, calculation, use of statistical tests or packages, management of results). Designed on different levels (analyst, quality manager and technical manager), including a variety of measures, the programme shall ensure the validity and accuracy of test results, the adequacy of the management system, prove the laboratory's competence in performing tests under accreditation and last but not least show the comparability of test results. Laboratory management should establish performance targets and review periodically QC/QA results against them, implementing appropriate measures in case of non-compliance.
Evaluating the accuracy of the Wechsler Memory Scale-Fourth Edition (WMS-IV) logical memory embedded validity index for detecting invalid test performance.

PubMed

Soble, Jason R; Bain, Kathleen M; Bailey, K Chase; Kirton, Joshua W; Marceaux, Janice C; Critchfield, Edan A; McCoy, Karin J M; O'Rourke, Justin J F

2018-01-08

Embedded performance validity tests (PVTs) allow for continuous assessment of invalid performance throughout neuropsychological test batteries. This study evaluated the utility of the Wechsler Memory Scale-Fourth Edition (WMS-IV) Logical Memory (LM) Recognition score as an embedded PVT using the Advanced Clinical Solutions (ACS) for WAIS-IV/WMS-IV Effort System. This mixed clinical sample comprised 97 total participants, 71 of whom were classified as valid and 26 as invalid based on three well-validated, freestanding criterion PVTs. Overall, the LM embedded PVT demonstrated poor concordance with the criterion PVTs and unacceptable psychometric properties using ACS validity base rates (42% sensitivity/79% specificity). Moreover, 15-39% of participants obtained an invalid ACS base rate despite having a normatively-intact age-corrected LM Recognition total score. Receiver operating characteristic curve analysis revealed that a Recognition total score cutoff of < 61% correct improved specificity (92%) while sensitivity remained weak (31%). Thus, results indicated the LM Recognition embedded PVT is not appropriate for use from an evidence-based perspective, and that clinicians may be faced with reconciling how a normatively intact cognitive performance on the Recognition subtest could simultaneously reflect invalid performance validity.
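The sensitivity and specificity figures in the record above follow the usual definitions, which can be sketched as below. The group sizes (26 invalid, 71 valid) come from the abstract; the individual cell counts are assumptions chosen only so that the rounded rates match the reported 42% and 79%, since the abstract does not report the underlying counts.

```python
def sensitivity(true_pos, false_neg):
    """Share of criterion-invalid cases that the embedded test flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of criterion-valid cases that the embedded test passes."""
    return true_neg / (true_neg + false_pos)

# ASSUMED cell counts, back-calculated for illustration only (not reported in the abstract).
flagged_invalid, missed_invalid = 11, 15   # sums to the 26 invalid participants
passed_valid, flagged_valid = 56, 15       # sums to the 71 valid participants

print(f"sensitivity = {sensitivity(flagged_invalid, missed_invalid):.0%}")
print(f"specificity = {specificity(passed_valid, flagged_valid):.0%}")
```

Under these assumed counts, most invalid performers would go undetected, which is consistent with the abstract's conclusion that sensitivity is too weak for clinical use.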
Validity and Reliability of the Turkish Version of Needs Based Biopsychosocial Distress Instrument for Cancer Patients (CANDI)

PubMed Central

Beyhun, Nazim Ercument; Can, Gamze; Tiryaki, Ahmet; Karakullukcu, Serdar; Bulut, Bekir; Yesilbas, Sehbal; Kavgaci, Halil; Topbas, Murat

2016-01-01

Background Needs based biopsychosocial distress instrument for cancer patients (CANDI) is a scale based on needs arising due to the effects of cancer. Objectives The aim of this research was to determine the reliability and validity of the CANDI scale in the Turkish language. Patients and Methods The study was performed with the participation of 172 cancer patients aged 18 and over. Factor analysis (principal components analysis) was used to assess construct validity. Criterion validities were tested by computing Spearman correlation between CANDI and hospital anxiety depression scale (HADS), and brief symptom inventory (BSI) (convergent validity) and quality of life scales (FACT-G) (divergent validity). Test-retest reliabilities and internal consistencies were measured with intraclass correlation (ICC) and Cronbach-α. Results A three-factor solution (emotional, physical and social) was found with factor analysis. Internal reliability (α = 0.94) and test-retest reliability (ICC = 0.87) were significantly high. Correlations between CANDI and HADS (rs = 0.67), and BSI (rs = 0.69) and FACT-G (rs = -0.76) were moderate and significant in the expected direction. Conclusions CANDI is a valid and reliable scale in cancer patients with a three-factor structure (emotional, physical and social) in the Turkish language. PMID:27621931
Improved Healing of Large, Osseous, Segmental Defects by Reverse Dynamization: Evaluation in a Sheep Model

DTIC Science & Technology

2017-12-01

reverse dynamization. This was supplemented by finite element analysis and the use of a strain gauge. This aim was successfully completed, with the...testing deformation results for model validation. Development of a Finite Element (FE) model was conducted through ANSYS 16 to help characterize...Fixators were characterized through mechanical testing by sawbone and ovine cadaver tibiae samples, and data was used to validate a finite element
Circulating Tumor DNA Analysis in Patients With Cancer: American Society of Clinical Oncology and College of American Pathologists Joint Review.

PubMed

Merker, Jason D; Oxnard, Geoffrey R; Compton, Carolyn; Diehn, Maximilian; Hurley, Patricia; Lazar, Alexander J; Lindeman, Neal; Lockwood, Christina M; Rai, Alex J; Schilsky, Richard L; Tsimberidou, Apostolia M; Vasalos, Patricia; Billman, Brooke L; Oliver, Thomas K; Bruinooge, Suanna S; Hayes, Daniel F; Turner, Nicholas C

2018-06-01

Purpose Clinical use of analytical tests to assess genomic variants in circulating tumor DNA (ctDNA) is increasing. This joint review from ASCO and the College of American Pathologists summarizes current information about clinical ctDNA assays and provides a framework for future research. Methods An Expert Panel conducted a literature review on the use of ctDNA assays for solid tumors, including pre-analytical variables, analytical validity, interpretation and reporting, and clinical validity and utility. Results The literature search identified 1,338 references. Of those, 390, plus 31 references supplied by the Expert Panel, were selected for full-text review. There were 77 articles selected for inclusion. Conclusion The evidence indicates that testing for ctDNA is optimally performed on plasma collected in cell stabilization or EDTA tubes, with EDTA tubes processed within 6 hours of collection. Some ctDNA assays have demonstrated clinical validity and utility with certain types of advanced cancer; however, there is insufficient evidence of clinical validity and utility for the majority of ctDNA assays in advanced cancer. Evidence shows discordance between the results of ctDNA assays and genotyping tumor specimens and supports tumor tissue genotyping to confirm undetected results from ctDNA tests. There is no evidence of clinical utility and little evidence of clinical validity of ctDNA assays in early-stage cancer, treatment monitoring, or residual disease detection. There is no evidence of clinical validity and clinical utility to suggest that ctDNA assays are useful for cancer screening, outside of a clinical trial. Given the rapid pace of research, re-evaluation of the literature will shortly be required, along with the development of tools and guidance for clinical practice.
Analytical validation of a new point-of-care assay for serum amyloid A in horses.

PubMed

Schwartz, D; Pusterla, N; Jacobsen, S; Christopher, M M

2018-01-17

Serum amyloid A (SAA) is a major acute phase protein in horses. A new point-of-care (POC) test for SAA (Stablelab) is available, but studies evaluating its analytical accuracy are lacking. To evaluate the analytical performance of the SAA POC test by 1) determining linearity and precision, 2) comparing results in whole blood with those in serum or plasma, and 3) comparing POC results with those obtained using a previously validated turbidimetric immunoassay (TIA). Assay validation. Analytical validation of the POC test was done in accordance with American Society of Veterinary Clinical Pathology guidelines using residual equine serum/plasma and whole blood samples from the Clinical Pathology Laboratory at the University of California-Davis. A TIA was used as the reference method. We also evaluated the effect of haematocrit (HCT). The POC test was linear for SAA concentrations of up to at least 1000 μg/mL (r = 0.991). Intra-assay CVs were 13, 18 and 15% at high (782 μg/mL), intermediate (116 μg/mL) and low (64 μg/mL) concentrations. Inter-assay (inter-batch) CVs were 45, 14 and 15% at high (1372 μg/mL), intermediate (140 μg/mL) and low (56 μg/mL) concentrations. SAA results in whole blood were significantly lower than those in serum/plasma (P = 0.0002), but were positively correlated (r = 0.908) and not affected by HCT (P = 0.261); proportional negative bias was observed in samples with SAA>500 μg/mL. The difference between methods exceeded the 95% confidence interval of the combined imprecision of both methods (15%). Analytical validation could not be performed in whole blood, the sample most likely to be used stall side. The POC test has acceptable accuracy and precision in equine serum/plasma with SAA concentrations of up to at least 1000 μg/mL. Low inter-batch precision at high concentrations may affect serial measurements, and the use of the same test batch and sample type (serum/plasma or whole blood) is recommended. 
Comparison of results between the POC test and the TIA is not recommended. © 2018 EVJ Ltd.
Use of the FDA nozzle model to illustrate validation techniques in computational fluid dynamics (CFD) simulations

PubMed Central

Hariharan, Prasanna; D’Souza, Gavin A.; Horner, Marc; Morrison, Tina M.; Malinauskas, Richard A.; Myers, Matthew R.

2017-01-01

A “credible” computational fluid dynamics (CFD) model has the potential to provide a meaningful evaluation of safety in medical devices. One major challenge in establishing “model credibility” is to determine the required degree of similarity between the model and experimental results for the model to be considered sufficiently validated. This study proposes a “threshold-based” validation approach that provides a well-defined acceptance criteria, which is a function of how close the simulation and experimental results are to the safety threshold, for establishing the model validity. The validation criteria developed following the threshold approach is not only a function of Comparison Error, E (which is the difference between experiments and simulations) but also takes in to account the risk to patient safety because of E. The method is applicable for scenarios in which a safety threshold can be clearly defined (e.g., the viscous shear-stress threshold for hemolysis in blood contacting devices). The applicability of the new validation approach was tested on the FDA nozzle geometry. The context of use (COU) was to evaluate if the instantaneous viscous shear stress in the nozzle geometry at Reynolds numbers (Re) of 3500 and 6500 was below the commonly accepted threshold for hemolysis. The CFD results (“S”) of velocity and viscous shear stress were compared with inter-laboratory experimental measurements (“D”). The uncertainties in the CFD and experimental results due to input parameter uncertainties were quantified following the ASME V&V 20 standard. The CFD models for both Re = 3500 and 6500 could not be sufficiently validated by performing a direct comparison between CFD and experimental results using the Student’s t-test. 
However, following the threshold-based approach, a Student’s t-test comparing |S-D| and |Threshold-S| showed that relative to the threshold, the CFD and experimental datasets for Re = 3500 were statistically similar and the model could be considered sufficiently validated for the COU. However, for Re = 6500, at certain locations where the shear stress is close to the hemolysis threshold, the CFD model could not be considered sufficiently validated for the COU. Our analysis showed that the model could be sufficiently validated either by reducing the uncertainties in experiments, simulations, and the threshold or by increasing the sample size for the experiments and simulations. The threshold approach can be applied to all types of computational models and provides an objective way of determining model credibility and for evaluating medical devices. PMID:28594889
Use of the FDA nozzle model to illustrate validation techniques in computational fluid dynamics (CFD) simulations.

PubMed

Hariharan, Prasanna; D'Souza, Gavin A; Horner, Marc; Morrison, Tina M; Malinauskas, Richard A; Myers, Matthew R

2017-01-01

A "credible" computational fluid dynamics (CFD) model has the potential to provide a meaningful evaluation of safety in medical devices. One major challenge in establishing "model credibility" is to determine the required degree of similarity between the model and experimental results for the model to be considered sufficiently validated. This study proposes a "threshold-based" validation approach that provides a well-defined acceptance criteria, which is a function of how close the simulation and experimental results are to the safety threshold, for establishing the model validity. The validation criteria developed following the threshold approach is not only a function of Comparison Error, E (which is the difference between experiments and simulations) but also takes in to account the risk to patient safety because of E. The method is applicable for scenarios in which a safety threshold can be clearly defined (e.g., the viscous shear-stress threshold for hemolysis in blood contacting devices). The applicability of the new validation approach was tested on the FDA nozzle geometry. The context of use (COU) was to evaluate if the instantaneous viscous shear stress in the nozzle geometry at Reynolds numbers (Re) of 3500 and 6500 was below the commonly accepted threshold for hemolysis. The CFD results ("S") of velocity and viscous shear stress were compared with inter-laboratory experimental measurements ("D"). The uncertainties in the CFD and experimental results due to input parameter uncertainties were quantified following the ASME V&V 20 standard. The CFD models for both Re = 3500 and 6500 could not be sufficiently validated by performing a direct comparison between CFD and experimental results using the Student's t-test. 
However, following the threshold-based approach, a Student's t-test comparing |S-D| and |Threshold-S| showed that relative to the threshold, the CFD and experimental datasets for Re = 3500 were statistically similar and the model could be considered sufficiently validated for the COU. However, for Re = 6500, at certain locations where the shear stress is close to the hemolysis threshold, the CFD model could not be considered sufficiently validated for the COU. Our analysis showed that the model could be sufficiently validated either by reducing the uncertainties in experiments, simulations, and the threshold or by increasing the sample size for the experiments and simulations. The threshold approach can be applied to all types of computational models and provides an objective way of determining model credibility and for evaluating medical devices.
Validation of sterilizing grade filtration.

PubMed

Jornitz, M W; Meltzer, T H

2003-01-01

Validation considerations for sterilizing grade filters, namely 0.2 micron, changed when FDA voiced concerns about the validity of Bacterial Challenge tests performed in the past. Such validation exercises are nowadays considered to be filter qualification. Filter validation requires more thorough analysis, especially Bacterial Challenge testing with the actual drug product under process conditions. To do so, viability testing is a necessity to determine the Bacterial Challenge test methodology. In addition to these two compulsory tests, other evaluations like extractable, adsorption and chemical compatibility tests should be considered. PDA Technical Report # 26, Sterilizing Filtration of Liquids, describes all parameters and aspects required for the comprehensive validation of filters. The report is a most helpful tool for validation of liquid filters used in the biopharmaceutical industry. It sets the cornerstones of validation requirements and other filtration considerations.
Aeroacoustic Validation of Installed Low Noise Propulsion for NASA's N+2 Supersonic Airliner

NASA Technical Reports Server (NTRS)

Bridges, James

2018-01-01

An aeroacoustic test was conducted at NASA Glenn Research Center on an integrated propulsion system designed to meet noise regulations of ICAO Chapter 4 with 10EPNdB cumulative margin. The test had two objectives: to demonstrate that the aircraft design did meet the noise goal, and to validate the acoustic design tools used in the design. Variations in the propulsion system design and its installation were tested and the results compared against predictions. Far-field arrays of microphones measured the acoustic spectral directivity, which was transformed to full scale as noise certification levels. Phased array measurements confirmed that the shielding of the installation model adequately simulated the full aircraft and provided data for validating RANS-based noise prediction tools. Particle image velocimetry confirmed that the flow field around the nozzle on the jet rig mimicked that of the full aircraft and produced flow data to validate the RANS solutions used in the noise predictions. The far-field acoustic measurements confirmed the empirical predictions for the noise. Results provided here detail the steps taken to ensure accuracy of the measurements and give insights into the physics of exhaust noise from installed propulsion systems in future supersonic vehicles.
The Reliability and Validity of the Computerized Double Inclinometer in Measuring Lumbar Mobility

PubMed Central

MacDermid, Joy Christine; Arumugam, Vanitha; Vincent, Joshua Israel; Carroll, Krista L

2014-01-01

Study Design: Repeated measures reliability/validity study. Objectives: To determine the concurrent validity, test-retest, inter-rater and intra-rater reliability of lumbar flexion and extension measurements using the Tracker M.E. computerized dual inclinometer (CDI) in comparison to the modified-modified Schober (MMS). Summary of Background: Numerous studies have evaluated the reliability and validity of the various methods of measuring spinal motion, but the results are inconsistent. Differences in equipment and techniques make it difficult to correlate results. Methods: Twenty subjects with back pain and twenty without back pain were selected through convenience sampling. Two examiners measured sagittal plane lumbar range of motion for each subject. Two separate tests with the CDI and one test with the MMS were conducted. Each test consisted of three trials. Instrument and examiner order was randomly assigned. Intra-class correlations (ICCs 2, 2 and 2, 2) and Pearson correlation coefficients (r) were used to calculate reliability and concurrent validity respectively. Results: Intra-trial reliability was high to very high for both the CDI (ICCs 0.85 - 0.96) and MMS (ICCs 0.84 - 0.98). However, the reliability was poor to moderate when the CDI unit had to be repositioned either by the same rater (ICCs 0.16 - 0.59) or a different rater (ICCs 0.45 - 0.52). Inter-rater reliability for the MMS was moderate to high (ICCs 0.75 - 0.82), which bettered the moderate correlation obtained for the CDI (ICCs 0.45 - 0.52). Correlations between the CDI and MMS were poor for flexion (0.32; p<0.05) and poor to moderate (-0.42 - -0.51; p<0.05) for extension measurements. Conclusion: When using the CDI, an average of subsequent tests is required to obtain moderate reliability. The MMS was more reliable than the CDI. The MMS and the CDI measure lumbar movement on different metrics that are not highly related to each other. PMID:25352928
A Computational Methodology for Simulating Thermal Loss Testing of the Advanced Stirling Convertor

NASA Technical Reports Server (NTRS)

Reid, Terry V.; Wilson, Scott D.; Schifer, Nicholas A.; Briggs, Maxwell H.

2012-01-01

The U.S. Department of Energy (DOE) and Lockheed Martin Space Systems Company (LMSSC) have been developing the Advanced Stirling Radioisotope Generator (ASRG) for use as a power system for space science missions. This generator would use two high-efficiency Advanced Stirling Convertors (ASCs), developed by Sunpower Inc. and NASA Glenn Research Center (GRC). The ASCs convert thermal energy from a radioisotope heat source into electricity. As part of ground testing of these ASCs, different operating conditions are used to simulate expected mission conditions. These conditions require achieving a particular operating frequency, hot end and cold end temperatures, and specified electrical power output for a given net heat input. In an effort to improve net heat input predictions, numerous tasks have been performed which provided a more accurate value for net heat input into the ASCs, including the use of multidimensional numerical models. Validation test hardware has also been used to provide a direct comparison of numerical results and validate the multi-dimensional numerical models used to predict convertor net heat input and efficiency. These validation tests were designed to simulate the temperature profile of an operating Stirling convertor and resulted in a measured net heat input of 244.4 W. The methodology was applied to the multi-dimensional numerical model which resulted in a net heat input of 240.3 W. The computational methodology resulted in a value of net heat input that was 1.7 percent less than that measured during laboratory testing. The resulting computational methodology and results are discussed.
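The 1.7 percent figure quoted in this abstract can be checked from the two reported heat inputs. The snippet below is only an arithmetic sketch; the function name `percent_below` is an illustrative choice and does not come from the report.

```python
# Values reported in the abstract above (watts).
measured_w = 244.4   # net heat input measured in the validation test
predicted_w = 240.3  # net heat input from the multi-dimensional numerical model


def percent_below(measured: float, predicted: float) -> float:
    """Return how far `predicted` falls below `measured`, as a percentage of `measured`."""
    return (measured - predicted) / measured * 100.0


shortfall = percent_below(measured_w, predicted_w)
print(f"{shortfall:.1f} percent")  # prints "1.7 percent"
```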
Portuguese version of a stress and well-being evaluation tool (ASSET) at the workplace: validation of the psychometric properties.

PubMed

Heitor Dos Santos, Maria João; Moreira, Sérgio; Carreiras, Joana; Cooper, Cary; Smeed, Matthew; Reis, Maria de Fátima; Pereira Miguel, José

2018-02-12

The main objective of this work was to translate the English version of ASSET (A Shortened Stress Evaluation Tool) into Portuguese and to validate its psychometric properties. Additionally, this work tested the convergent validity of the instrument. The translation and back-translation were conducted by experts and submitted to the authors for approval. Within an observational, cross-sectional study regarding mental health at the workplace, ASSET together with other scales was applied to a sample of 405 participants. The psychometric validity of the subscales was studied using confirmatory factor analysis. The factorial structure of ASSET is globally supported by the results, with the Perceptions of Your Job and Attitudes Towards your Organisation subscales requiring slight adjustments in the item structure and the Your Health subscales replicating the original structure. The convergent validity also supports the ASSET, showing that all subscales are significantly correlated with variables used to test convergence. Globally, the results constitute an important contribution to ASSET and open the possibility of its usage among Portuguese-speaking countries. The results provide evidence of the validity of the instrument and, in particular, of the mental and physical health subscales. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
The evaluation of lumbar multifidus muscle function via palpation: reliability and validity of a new clinical test.

PubMed

Hebert, Jeffrey J; Koppenhaver, Shane L; Teyhen, Deydre S; Walker, Bruce F; Fritz, Julie M

2015-06-01

The lumbar multifidus muscle provides an important contribution to lumbar spine stability, and the restoration of lumbar multifidus function is a frequent goal of rehabilitation. Currently, there are no reliable and valid physical examination procedures available to assess lumbar multifidus function among patients with low back pain. To examine the inter-rater reliability and concurrent validity of the multifidus lift test (MLT) to identify lumbar multifidus dysfunction among patients with low back pain. A cross-sectional analysis of reliability and concurrent validity performed in a university outpatient research facility. Thirty-two persons aged 18 to 60 years with current low back pain and a minimum modified Oswestry disability score of 20%. Study participants were excluded if they reported a history of lumbar spine surgery, lumbar radiculopathy, medical red flags, osteoporosis, or had recently been treated with spinal manipulation or trunk stabilization exercises. Concurrent measures of lumbar multifidus muscle function at the L4-L5 and L5-S1 levels were obtained with the MLT (index test) and real-time ultrasound imaging (reference standard). The inter-rater reliability of the MLT was examined by measuring the level of agreement between two blinded examiners. Concurrent validity of the MLT was investigated by comparing clinicians' judgments with real-time ultrasound imaging measures of lumbar multifidus function. Inter-rater reliability of the MLT was substantial to excellent (κ=0.75 to 0.81, p≤.01) and free from errors of bias and prevalence. When performed at L4-L5 or L5-S1, the MLT demonstrated evidence of concurrent validity through its relationship with the reference standard results at L4-L5 (rbis=0.59-0.73, p≤.01). The MLT generally failed to demonstrate a relationship with the reference standard results from the L5-S1 level. 
Our results provide preliminary evidence supporting the reliability and validity of the MLT to assess lumbar multifidus function at the L4-L5 spinal level. Additional research examining the measurement properties and utility of this test should be undertaken before confident implementation with patients. Copyright © 2015 Elsevier Inc. All rights reserved.
Development and validation of a risk-prediction nomogram for in-hospital mortality in adults poisoned with drugs and nonpharmaceutical agents

PubMed Central

Lionte, Catalina; Sorodoc, Victorita; Jaba, Elisabeta; Botezat, Alina

2017-01-01

Abstract Acute poisoning with drugs and nonpharmaceutical agents represents an important challenge in the emergency department (ED). The objective is to create and validate a risk-prediction nomogram for use in the ED to predict the risk of in-hospital mortality in adults from acute poisoning with drugs and nonpharmaceutical agents. This was a prospective cohort study involving adults with acute poisoning from drugs and nonpharmaceutical agents admitted to a tertiary referral center for toxicology between January and December 2015 (derivation cohort) and between January and June 2016 (validation cohort). We used a program to generate nomograms based on binary logistic regression predictive models. We included variables that had significant associations with death. Using regression coefficients, we calculated scores for each variable, and estimated the event probability. Model validation was performed using bootstrap to quantify our modeling strategy and using receiver operator characteristic (ROC) analysis. The nomogram was tested on a separate validation cohort using ROC analysis and goodness-of-fit tests. Data from 315 patients aged 18 to 91 years were analyzed (n = 180 in the derivation cohort; n = 135 in the validation cohort). In the final model, the following variables were significantly associated with mortality: age, laboratory test results (lactate, potassium, MB isoenzyme of creatine kinase), electrocardiogram parameters (QTc interval), and echocardiography findings (E wave velocity deceleration time). Sex was also included to use the same model for men and women. The resulting nomogram showed excellent survival/mortality discrimination (area under the curve [AUC] 0.976, 95% confidence interval [CI] 0.954–0.998, P < 0.0001 for the derivation cohort; AUC 0.957, 95% CI 0.892–1, P < 0.0001 for the validation cohort). 
This nomogram provides more precise, rapid, and simple risk-analysis information for individual patients acutely exposed to drugs and nonpharmaceutical agents, and accurately estimates the probability of in-hospital death, exclusively using the results of objective tests available in the ED. PMID:28328838
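As a reading aid for the AUC values reported in this record, the sketch below computes an area under the ROC curve as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. It is not the authors' software, and the scores are invented toy numbers, not study data.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted risks for patients who died (positives)
# and patients who survived (negatives).
died = [0.9, 0.8, 0.4]
survived = [0.7, 0.3, 0.2]
auc = roc_auc(died, survived)  # 8 of the 9 pairs are ranked correctly
```

An AUC of 1.0 would mean every patient who died was assigned a higher risk than every survivor; 0.5 corresponds to chance-level ranking.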
Translation of the Neck Disability Index and validation of the Greek version in a sample of neck pain patients

PubMed Central

Trouli, Marianna N; Vernon, Howard T; Kakavelakis, Kyriakos N; Antonopoulou, Maria D; Paganas, Aristofanis N; Lionis, Christos D

2008-01-01

Background Neck pain is a highly prevalent condition resulting in major disability. Standard scales for measuring disability in patients with neck pain have a pivotal role in research and clinical settings. The Neck Disability Index (NDI) is a valid and reliable tool, designed to measure disability in activities of daily living due to neck pain. The purpose of our study was the translation and validation of the NDI in a Greek primary care population with neck complaints. Methods The original version of the questionnaire was used. Based on international standards, the translation strategy comprised forward translations, reconciliation, backward translation and pre-testing steps. The validation procedure concerned the exploration of internal consistency (Cronbach alpha), test-retest reliability (Intraclass Correlation Coefficient, Bland and Altman method), construct validity (exploratory factor analysis) and responsiveness (Spearman correlation coefficient, Standard Error of Measurement and Minimal Detectable Change) of the questionnaire. Data quality was also assessed through completeness of data and floor/ceiling effects. Results The translation procedure resulted in the Greek modified version of the NDI. The latter was culturally adapted through the pre-testing phase. The validation procedure raised a large amount of missing data due to low applicability, which were assessed with two methods. Floor or ceiling effects were not observed. Cronbach alpha was calculated as 0.85, which was interpreted as good internal consistency. Intraclass correlation coefficient was found to be 0.93 (95% CI 0.84–0.97), which was considered as very good test-retest reliability. Factor analysis yielded one factor with Eigenvalue 4.48 explaining 44.77% of variance. The Spearman correlation coefficient (0.3; P = 0.02) revealed some relation between the change score in the NDI and Global Rating of Change (GROC). The SEM and MDC were calculated as 0.64 and 1.78 respectively. 
Conclusion The Greek version of the NDI measures disability in patients with neck pain in a reliable, valid and responsive manner. It is considered a useful tool for research and clinical settings in Greek Primary Health Care. PMID:18647393
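The abstract does not state which formulas were used for the Standard Error of Measurement (SEM) and Minimal Detectable Change (MDC). A commonly used convention is SEM = SD × √(1 − ICC) and MDC at 95% confidence = 1.96 × √2 × SEM; the sketch below assumes that convention. The standard deviation passed to `sem_from_icc` is a made-up value for illustration.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """Minimal detectable change at the 95% confidence level (assumed convention)."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical SD of 10 points combined with the reported ICC of 0.93.
example_sem = sem_from_icc(10.0, 0.93)

# Applying the assumed MDC formula to the reported SEM of 0.64.
mdc = mdc95(0.64)
```

Under this assumption the reported SEM of 0.64 gives an MDC of about 1.77, close to the reported 1.78; the small gap would be consistent with the SEM having been rounded before publication.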
Blood collection tubes as medical devices: The potential to affect assays and proposed verification and validation processes for the clinical laboratory.

PubMed

Bowen, Raffick A R; Adcock, Dorothy M

2016-12-01

Blood collection tubes (BCTs) are an often under-recognized variable in the preanalytical phase of clinical laboratory testing. Unfortunately, even the best-designed and manufactured BCTs may not work well in all clinical settings. Clinical laboratories, in collaboration with healthcare providers, should carefully evaluate BCTs prior to putting them into clinical use to determine their limitations and ensure that patients are not placed at risk because of inaccuracies due to poor tube performance. Selection of the best BCTs can be achieved through comparing advertising materials, reviewing the literature, observing the device at a scientific meeting, receiving a demonstration, evaluating the device under simulated conditions, or testing the device with patient samples. Although many publications have discussed method validations, few detail how to perform experiments for tube verification and validation. This article highlights the most common and impactful variables related to BCTs and discusses the validation studies that a typical clinical laboratory should perform when selecting BCTs. We also present a brief review of how in vitro diagnostic devices, particularly BCTs, are regulated in the United States, the European Union, and Canada. The verification and validation of BCTs will help to avoid the economic and human costs associated with incorrect test results, including poor patient care, unnecessary testing, and delays in test results. We urge laboratorians, tube manufacturers, diagnostic companies, and other researchers to take all the necessary steps to protect against the adverse effects of BCT components and their additives on clinical assays. Copyright © 2016 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

Cross-cultural adaptation of the Oral Health Impact Profile (OHIP) for the Malaysian adult population.

PubMed

Saub, R; Locker, D; Allison, P; Disman, M

2007-09-01

The aim of this project was to develop an oral health-related quality of life measure for the Malaysian adult population aged 18 and above by the cross-cultural adaptation of the Oral Health Impact Profile (OHIP). The adaptation of the OHIP was based on the framework proposed by Herdman et al (1998). The OHIP was translated into the Malay language using a forward-backward translation technique. Thirty-six patients were interviewed to assess the conceptual equivalence and relevancy of each item. Based on the translation process and interview results, a Malaysian version of the OHIP questionnaire was produced that contained 45 items. It was designated as the OHIP(M). This questionnaire was pre-tested on 20 patients to assess its face validity. A short 14-item version of the questionnaire was completed by 171 patients to assess the suitability of the Likert-type response format. Field-testing was conducted in order to assess the suitability of two modes of administration (mail and interview) and to establish the psychometric properties of the adapted measure. The pre-testing revealed that the OHIP(M) has good face validity. It was found that the five-point frequency Likert scale could be used for the Malaysian population. The OHIP(M) was reliable: the scale Cronbach's alpha was 0.95 and the ICC value for test-retest reliability was 0.79. Three out of four construct validity hypotheses tested were confirmed. The OHIP(M) works as well as the English version. The OHIP(M) was found to be reliable and valid regardless of the mode of administration. However, this study only provides initial evidence for the reliability and validity of the measure. Further study is recommended to collect more evidence to support these results.
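For readers unfamiliar with the Cronbach's alpha statistic quoted in this record, the following is a minimal sketch of the standard formula, alpha = k/(k−1) × (1 − Σ item variances / variance of total scores). The response matrix is invented for illustration and has nothing to do with the OHIP(M) data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical five-point Likert responses: 4 respondents x 3 items.
toy_responses = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 4, 5],
    [5, 4, 4],
]
alpha = cronbach_alpha(toy_responses)
```

When every item gives identical scores the statistic reaches its maximum of 1.0; values fall as items become less correlated with one another.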
The Leuven Embedded Figures Test (L-EFT): measuring perception, intelligence or executive function?

PubMed Central

Van der Hallen, Ruth; Wagemans, Johan; de-Wit, Lee; Chamberlain, Rebecca

2018-01-01

Performance on the Embedded Figures Test (EFT) has been interpreted as a reflection of local/global perceptual style, weak central coherence and/or field independence, as well as a measure of intelligence and executive function. The variable ways in which EFT findings have been interpreted demonstrate that the construct validity of this measure is unclear. In order to address this lack of clarity, we investigated to what extent performance on a new Embedded Figures Test (L-EFT) correlated with measures of intelligence, executive functions and estimates of local/global perceptual styles. In addition, we compared L-EFT performance to the original group EFT to directly contrast both tasks. Taken together, our results indicate that performance on the L-EFT does not correlate strongly with estimates of local/global perceptual style, intelligence or executive functions. Additionally, the results show that performance on the L-EFT is similarly associated with memory span and fluid intelligence as the group EFT. These results suggest that the L-EFT does not reflect a general perceptual or cognitive style/ability. These results further emphasize that empirical data on the construct validity of a task do not always align with the face validity of a task. PMID:29607257
Substantiation Data for Advanced Beaded and Tubular Structural Panels. Volume 3: Testing

NASA Technical Reports Server (NTRS)

Hedges, P. C.; Greene, B. E.

1974-01-01

The test program is described, which was conducted to provide the necessary experimental data to verify the design and analysis methods developed for beaded and tubular panels. Test results are summarized and presented for all local buckling and full size panel tests. Selected representative test data from each of these tests is presented in detail. The results of this program established a valid analysis and design procedure for circular tube panels. Test results from three other configurations show deformational modes which are not adequately accounted for in the present analyses.
The Construct Validity of Attitudes toward Career Counseling Scale for Korean College Students

ERIC Educational Resources Information Center

Nam, Suk Kyung; In Park, Hyung

2015-01-01

This study aimed to examine the construct validity of the Attitudes Toward Career Counseling Scale (ATCCS) in Korea. In Study 1, confirmatory factor analysis (CFA) was used for testing the factor structure of the scale. The results supported a two-factor (value and stigma) model, which was theoretically driven from the original study. Results of…
Additional Evidence for the Reliability and Validity of the Student Risk Screening Scale at the High School Level: A Replication and Extension

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy P.; Ennis, Robin Parks; Cox, Meredith Lucille; Schatschneider, Christopher; Lambert, Warren

2013-01-01

This study reports findings from a validation study of the Student Risk Screening Scale for use with 9th- through 12th-grade students (N = 1854) attending a rural fringe school. Results indicated high internal consistency, test-retest stability, and inter-rater reliability. Predictive validity was established across two academic years, with Spring…
Cross-Validation of easyCBM Reading Cut Scores in Oregon: 2009-2010. Technical Report #1108

ERIC Educational Resources Information Center

Park, Bitnara Jasmine; Irvin, P. Shawn; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

2011-01-01

This technical report presents results from a cross-validation study designed to identify optimal cut scores when using easyCBM[R] reading tests in Oregon. The cross-validation study analyzes data from the 2009-2010 academic year for easyCBM[R] reading measures. A sample of approximately 2,000 students per grade, randomly split into two groups of…
The Development of Student’s Activity Sheets (SAS) Based on Multiple Intelligences and Problem-Solving Skills Using Simple Science Tools

NASA Astrophysics Data System (ADS)

Wardani, D. S.; Kirana, T.; Ibrahim, M.

2018-01-01

The aim of this research is to produce SAS based on MI and problem-solving skills using simple science tools that are suitable to be used by elementary school students. The feasibility of SAS is evaluated based on its validity, practicality, and effectiveness. The completion of Lesson Plan (LP) implementation and student’s activities are the indicators of SAS practicality. The effectiveness of SAS is measured by indicators of increased learning outcomes and problem-solving skills. The development of SAS follows the 4-D (define, design, develop, and disseminate) phases. However, this study was carried out only up to the third stage (develop). The written SAS was then validated through expert evaluation by two science experts before it was tested on the target students. The try-out of SAS used a one-group pre-test and post-test design. The result of this research shows that SAS is valid with “good” category. In addition, SAS is considered practical as seen from the increase of student activity at each meeting and LP implementation. Moreover, it was considered effective due to the significant difference between pre-test and post-test results of the learning outcomes and problem-solving skill test. Therefore, SAS is feasible to be used in learning.
Development and psychometric testing of the Cancer Knowledge Scale for Elders.

PubMed

Su, Ching-Ching; Chen, Yuh-Min; Kuo, Bo-Jein

2009-03-01

To develop the Cancer Knowledge Scale for Elders and test its validity and reliability. The number of elders suffering from cancer is increasing. To facilitate cancer prevention behaviours among elders, they should be educated about cancer-related knowledge. Prior to designing a programme that would respond to the special needs of elders, understanding the cancer-related knowledge within this population was necessary. However, extensive review of the literature revealed a lack of appropriate instruments for measuring cancer-related knowledge. A valid and reliable cancer knowledge scale for elders is necessary. A non-experimental methodological design was used to test the psychometric properties of the Cancer Knowledge Scale for Elders. Item analysis was first performed to screen out items that had low corrected item-total correlation coefficients. Construct validity was examined with a principal component method of exploratory factor analysis. Cancer-related health behaviour was used as the criterion variable to evaluate criterion-related validity. Internal consistency reliability was assessed by the KR-20. Stability was determined by two-week test-retest reliability. The factor analysis yielded a four-factor solution accounting for 49.5% of the variance. For criterion-related validity, cancer knowledge was positively correlated with cancer-related health behaviour (r = 0.78, p < 0.001). The KR-20 coefficients of each factor were 0.85, 0.76, 0.79 and 0.67 and 0.87 for the total scale. Test-retest reliability over a two-week period was 0.83 (p < 0.001). This study provides evidence for content validity, construct validity, criterion-related validity, internal consistency and stability of the Cancer Knowledge Scale for Elders. The results show that this scale is an easy-to-use instrument for elders and has adequate validity and reliability. The scale can be used as an assessment instrument when implementing cancer education programmes for elders. 
It can also be used to evaluate the effects of education programmes.
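The KR-20 coefficient mentioned in this record applies to items scored right/wrong (1/0). A minimal sketch of the usual formula, KR-20 = k/(k−1) × (1 − Σ p·q / variance of total scores), is given below. Using the population variance for the total scores is an assumption made here for consistency with the p·q terms, and the answer matrix is a toy example, not study data.

```python
from statistics import pvariance


def kr20(answers):
    """Kuder-Richardson 20 for a list of respondents, each a list of 0/1 item scores."""
    n = len(answers)
    k = len(answers[0])  # number of items
    # p = proportion answering each item correctly; q = 1 - p.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in answers) / n
        pq_sum += p * (1.0 - p)
    total_var = pvariance([sum(row) for row in answers])
    return k / (k - 1) * (1.0 - pq_sum / total_var)


# Hypothetical knowledge test: 5 respondents x 4 true/false items.
toy_answers = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
reliability = kr20(toy_answers)
```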
Pitfalls in efficacy testing – how important is the validation of neutralization of chlorhexidine digluconate?

PubMed Central

Reichel, Mirja; Heisig, Peter; Kampf, Günter

2008-01-01

Background Effective neutralization of active agents is essential to obtain valid efficacy results, especially when non-volatile active agents like chlorhexidine digluconate (CHG) are tested. The aim of this study was to determine an effective and non-toxic neutralizing mixture for a propan-1-ol solution containing 2% CHG. Methods Experiments were carried out according to ASTM E 1054-02. The neutralization capacity was tested separately with five challenge microorganisms in suspension, and with a rayon swab carrier. Either 0.5 mL of the antiseptic solution (suspension test) or a saturated swab with the antiseptic solution (carrier test) was added to tryptic soy broth containing neutralizing agents. After the samples were mixed, aliquots were spread immediately and after 3 h of storage at 2 – 8°C onto tryptic soy agar containing a neutralizing mixture. Results The neutralizer was, however, not consistently effective in the suspension test. Immediate spread yielded a valid neutralization with Staphylococcus aureus, Staphylococcus epidermidis and Corynebacterium jeikeium but not with Micrococcus luteus (p < 0.001) and Candida albicans (p < 0.001). A 3-h storage period of the neutralized active agents in suspension resulted in significant carry-over activity of CHG in addition against Staphylococcus epidermidis (p < 0.001) and Corynebacterium jeikeium (p = 0.044). In the carrier test, the neutralizing mixture was found to be effective and non toxic to all challenge microorganisms when spread immediately. However, after 3 h storage of the neutralized active agents significant carry-over activity of CHG against Micrococcus luteus (p = 0.004; Tukey HSD) was observed. Conclusion Without effective neutralization in the sampling fluid, non-volatile active ingredients will continue to reduce the number of surviving microorganisms after antiseptic treatment even if the sampling fluid is kept cold straight after testing. 
This can result in false-positive antiseptic efficacy data. Attention should be paid during the neutralization validation process to the amount of antiseptic solution, the storage time and to the choice of appropriate and sensitive microorganisms. PMID:19046465
Validation of behaviour measurement instrument of patients with diabetes mellitus and hypertension

NASA Astrophysics Data System (ADS)

Saputri, G. Z.; Akrom; Dini, S. M.

2017-11-01

Non-adherence to the treatment of chronic diseases such as hypertension and Diabetes Mellitus (DM) is a major obstacle in achieving patient therapy targets and quality of life of patients. A comprehensive approach involving pharmacist counselling has been shown to influence changes in health behaviour and patient compliance. Behaviour change in patients is one of the parameters used to assess the effectiveness of counselling and education by pharmacists. Therefore, it is necessary to develop questionnaires for measuring behaviour change in DM-hypertension patients. This study aims to develop a measurement instrument in the form of a questionnaire for assessing the behaviour change of DM-hypertension patients. Preparation of question items for the questionnaire research instrument refers to some guidelines and previous research references. The validity of the questionnaire instrument was tested with expert validation, followed by pilot testing on 10 healthy respondents and 10 DM-hypertension patients who met the inclusion criteria. Furthermore, a field validation test was conducted on 37 patients who had undergone outpatient care at the PKU Muhammadiyah Yogyakarta City Hospital and The Gading Clinic in Yogyakarta. The inclusion criteria were male and female patients, aged 18-65, diagnosed with type 2 diabetes with hypertension who received oral antidiabetic drugs and antihypertensives, and who were not illiterate and co-operative. The data were collected by questionnaire interviews by a standardized pharmacist. The result of the validation test using Pearson correlation shows the value of 0.33. The results of the questionnaire validation test on 37 patients showed 5 invalid question items with a value of r < 0.33, i.e. questions 2, 3, 6, 10 and 11, while the other 10 questions show a Pearson correlation value > 0.33. The reliability value is shown from the Cronbach's alpha value of 0.722 (> 0.6), implying that the questionnaire is reliable for DM-hypertension patients. 
This Behavioural change questionnaire can be used on DM-hypertension patients, and an FGD approach is required for the development of factors affecting this questionnaire.
Development, test-retest reliability and validity of the Pharmacy Value-Added Services Questionnaire (PVASQ)

PubMed Central

Tan, Christine L.; Hassali, Mohamed A.; Saleem, Fahad; Shafie, Asrul A.; Aljadhey, Hisham; Gan, Vincent B.

2015-01-01

Objective: (i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish reliability and validity of the questionnaire instrument. Methods: Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administering the questionnaire instrument twice at an interval of one week. Internal consistency was measured by Cronbach’s alpha and construct validity between two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410) was conducted to assess construct validity of the PVASQ. Results: The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach’s alpha (N=25) is 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly into corresponding factors. Only one item was eliminated. Conclusions: This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of PVASQ is reliable and valid to predict Malaysian patients’ intention to adopt pharmacy value-added services to collect partial medicine supply. 
PMID:26445622
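The kappa statistic used in this record for test-retest agreement on dichotomous items can be sketched as Cohen's kappa, (observed agreement − chance agreement) / (1 − chance agreement). Whether the authors used exactly this unweighted form is not stated in the abstract, and the answers below are invented for illustration.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of categorical answers.

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both administrations contain only a single, identical category.
    """
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical yes/no answers from four respondents at test and at retest.
test_answers = ["yes", "yes", "no", "no"]
retest_answers = ["yes", "no", "no", "no"]
kappa = cohen_kappa(test_answers, retest_answers)  # observed 0.75, chance 0.5
```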
[Tests and scales: restrains to use them by general practitioners. Descriptive transversal study].

PubMed

Cario, Camille; Levesque, Jean-Louis; Bouche, Gauthier

2010-12-20

Tests, even though recommended, are little used by general practitioners (GPs). The aim of this study was to understand the reasons for this underuse. Descriptive cross-sectional study to explore knowledge, use, and barriers to use of ten tests related to the first 50 results of consultation in general practice. We questioned 121 GPs from Charente, selected at random. The oldest tests (MMS, MNA, Fagerström, mini-GDS, IPSS, depression) are known by more than half of the GPs. Only one third are familiar with more recent tests devoted to ambulatory care (TSTS, FACE, venous thromboembolic risk), which are also used less (20% at most). Systematic use, all tests combined, never exceeds 30% of GPs. The principal barrier to using these tests is lack of training (53%), which indeed seems to be ineffective in this domain; 20 to 60% of GPs who know the tests do not use them, mainly because of doubts regarding their usefulness (38%). What really is the utility of these tests in ambulatory care? Their validity in general practice shows some gaps: their validation seldom rests on studies conducted in primary care, impact studies to evaluate the benefits for patients are lacking, and tests designed for specific use by GPs are rare and lacking in validity. Development of research in primary care in this field would be desirable in order to develop relevant, feasible and acceptable tools to help decision making in general practice.
Joint Test Protocol for Validation of Alternatives to Aliphatic Isocyanate Polyurethanes

NASA Technical Reports Server (NTRS)

Lewis, Pattie

2005-01-01

The primary objective of this effort is to demonstrate and validate alternatives to aliphatic isocyanate polyurethanes. Successful completion of this project will result in one or more isocyanate-free coatings qualified for use at AFSPC and NASA installations participating in this project.
Generalization of Selection Test Validity.

ERIC Educational Resources Information Center

Colbert, G. A.; Taylor, L. R.

1978-01-01

This is part three of a three-part series concerned with the empirical development of homogeneous families of insurance company jobs based on data from the Position Analysis Questionnaire (PAQ). This part involves validity generalizations within the job families which resulted from the previous research. (Editor/RK)
Guidance for Classifying Studies Conducted Using the OECD Test Guideline 223 (TG223) (Acute Avian Oral Sequential Dose Study)

EPA Pesticide Factsheets

Guidance based on comparison of results from the TG223 validation studies to results from avian acute oral studies previously submitted to EPA for two test chemicals following EPA's 850.2100 (public draft) guidelines.
Ada compiler validation summary report. Certificate number: 891116W1. 10191. Intel Corporation, IPSC/2 Ada, Release 1. 1, IPSC/2 parallel supercomputer, system resource manager host and IPSC/2 parallel supercomputer, CX-1 nodes target

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1989-11-16

This VSR documents the results of the validation testing performed on an Ada compiler. Testing was carried out for the following purposes: to attempt to identify any language constructs supported by the compiler that do not conform to the Ada Standard; to attempt to identify any language constructs not supported by the compiler but required by the Ada Standard; and to determine that the implementation-dependent behavior is allowed by the Ada Standard. Testing of this compiler was conducted by SofTech, Inc. under the direction of the AVF according to procedures established by the Ada Joint Program Office and administered by the Ada Validation Organization (AVO). On-site testing was completed 16 November 1989 at Aloha, OR.
The cross-cultural adaptation, reliability, and validity of the Copenhagen Neck Functional Disability Scale in patients with chronic neck pain: Turkish version study.

PubMed

Yapali, Gökmen; Günel, Mintaze Kerem; Karahan, Sevilay

2012-05-15

The study design was cross-cultural adaptation and investigation of the reliability and validity of the Copenhagen Neck Functional Disability Scale (CNFDS). The aim of this study was to translate the CNFDS into Turkish and assess its reliability and validity among patients with neck pain in the Turkish population. The CNFDS is a reliable and valid evaluation instrument for disability, but there is no published Turkish version of the CNFDS. One hundred one subjects who had chronic neck pain were included in this study. The CNFDS, Neck Pain and Disability Scale, and visual analogue scale were administered to all subjects. To investigate test-retest reliability, the CNFDS was administered twice at a 1-week interval; the intraclass correlation coefficient for test-retest reliability was 0.86 (95% confidence interval = 0.679-0.935). There was no difference between test-retest scores (P < 0.001). For investigating concurrent validity, correlation between total score of the CNFDS and the mean visual analogue scale was r = 0.73 (P < 0.001). Concurrent validity of the CNFDS was very good. For investigating construct validity, correlation between total score of the CNFDS and the Neck Pain and Disability Scale was r = 0.78 (P < 0.001). Construct validity of the CNFDS was also very good. Our results suggest that the Turkish version of the CNFDS is a reliable and valid instrument for Turkish people.
Test-retest reliability and validity of a web-based food-frequency questionnaire for adolescents aged 13-14 to be used in the Norwegian Mother and Child Cohort Study (MoBa).

PubMed

Overby, Nina Cecilie; Johannesen, Elisabeth; Jensen, Grete; Skjaevesland, Anne-Kirsti; Haugen, Margaretha

2014-01-01

The assessment of food intake is challenging and prone to errors; it is therefore important to consider the reliability and validity of the assessment methods. The aim of this study was to analyze the reproducibility and validity of a developed food-frequency questionnaire (FFQ) for use among adolescents. In total, 58 students (aged 13-14) from four different schools in the southern part of Norway participated in the reproducibility study of filling out the FFQ 4 weeks apart. In addition, 93 students participated in the relative validity study where the FFQ was compared to 2×24-hour dietary recalls, while 92 students participated in the absolute validity study where the intakes of fatty acids and vitamin D from the FFQ were compared to fatty acids and 25-hydroxy-vitamin D3 in whole blood. The median Spearman correlation coefficient for all nutrients in the test-retest reliability study was 0.57. The median Spearman correlation for all nutrients in the relative validity study was 0.26, while the correlation coefficients were low in the absolute validity study, with n-3 fatty acid coefficients ranging from 0.05 to 0.25, and absent for vitamin D (r=0.000). The test-retest reproducibility was considered good, the relative validity was considered poor to good, and the absolute validity was considered poor. However, the results are comparable to other studies among adolescents.
Validity of Sensory Systems as Distinct Constructs

PubMed Central

Su, Chia-Ting

2014-01-01

This study investigated the validity of sensory systems as distinct measurable constructs as part of a larger project examining Ayres’s theory of sensory integration. Confirmatory factor analysis (CFA) was conducted to test whether sensory questionnaire items represent distinct sensory system constructs. Data were obtained from clinical records of two age groups, 2- to 5-yr-olds (n = 231) and 6- to 10-yr-olds (n = 223). With each group, we tested several CFA models for goodness of fit with the data. The accepted model was identical for each group and indicated that tactile, vestibular–proprioceptive, visual, and auditory systems form distinct, valid factors that are not age dependent. In contrast, alternative models that grouped items according to sensory processing problems (e.g., over- or underresponsiveness within or across sensory systems) did not yield valid factors. Results indicate that distinct sensory system constructs can be measured validly using questionnaire data. PMID:25184467
The development and validation of testing materials for literacy, numeracy and digital skills in a Dutch context

NASA Astrophysics Data System (ADS)

de Greef, Maurice; Segers, Mien; Nijhuis, Jan; Lam, Jo Fond; van Groenestijn, Mieke; van Hoek, Frans; van Deursen, Alexander J. A. M.; Bohnenn, Ella; Tubbing, Marga

2015-10-01

Besides work-oriented training, most Dutch adult learning courses of formal and non-formal education focus on three basic skills: literacy, numeracy and problem solving in technology-rich environments. In the Netherlands, the Ministry of Education, Culture and Science recently initiated the development of a new adult education framework concerning literacy, numeracy and digital skills. In order to monitor the progress of literacy, numeracy and digital competencies, it is necessary to develop and validate testing materials for specific competencies. This study validates the testing materials which were developed to assess learners' proficiency in literacy (reading and writing), numeracy and digital skills based on the new Dutch framework. The outcome is that the materials proved valid and can be used in different courses referring to basic skills and adult learning, though there are still some limitations. Besides adult education professionals (such as teachers and trainers), policy makers can also use the results of these tests in order to describe and monitor the impact of adult education on the lives of adult learners.

Creation and Validation of the Self-esteem/Self-image Female Sexuality (SESIFS) Questionnaire

PubMed Central

Lordello, Maria CO; Ambrogini, Carolina C; Fanganiello, Ana L; Embiruçu, Teresa R; Zaneti, Marina M; Veloso, Laise; Piccirillo, Livia B; Crude, Bianca L; Haidar, Mauro; Silva, Ivaldo

2014-01-01

INTRODUCTION Self-esteem and self-image are psychological aspects that affect sexual function. AIMS To validate a new measurement tool that correlates the concepts of self-esteem, self-image, and sexuality. METHODS A 20-question test (the self-esteem/self-image female sexuality [SESIFS] questionnaire) was created and tested on 208 women. Participants answered: Rosenberg’s self-esteem scale, the female sexual quotient (FSQ), and the SESIFS questionnaire. Pearson’s correlation coefficient was used to test concurrent validity of the SESIFS against Rosenberg’s self-esteem scale and the FSQ. Reliability was tested using the Cronbach’s alpha coefficient. RESULT The new questionnaire had a good overall reliability (Cronbach’s alpha r = 0.862, p < 0.001), but the sexual domain scored lower than expected (r = 0.65). The validity was good: overall score r = 0.38, p < 0.001, self-esteem domain r = 0.32, p < 0.001, self-image domain r = 0.31, p < 0.001, sexual domain r = 0.29, p < 0.001. CONCLUSIONS The SESIFS questionnaire has limitations in measuring the correlation among self-esteem, self-image, and sexuality domains. A new, revised version is being tested and will be presented in an upcoming publication. PMID:25574149
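The reliability statistic named in the abstract above, Cronbach's alpha, can be illustrated with a minimal sketch. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from the SESIFS study; the sketch only applies the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    k = len(responses[0])
    # Transpose so each tuple holds every respondent's score on one item.
    items = list(zip(*responses))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical answers from five respondents to a three-item scale.
hypothetical_responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
    [3, 3, 2],
]
alpha = cronbach_alpha(hypothetical_responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together; the threshold for "good" reliability is a convention chosen by the analyst, not something this sketch decides.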
POLYGON - A New Fundamental Movement Skills Test for 8 Year Old Children: Construction and Validation.

PubMed

Zuvela, Frane; Bozanic, Ana; Miletic, Durdica

2011-01-01

Inadequately adopted fundamental movement skills (FMS) in early childhood may have a negative impact on motor performance in later life (Gallahue and Ozmun, 2005). The need for efficient FMS testing in Physical Education was recognized. The aim of this paper was to construct and validate a new FMS test for 8 year old children. Ninety-five 8 year old children were used for the testing. A total of 24 new FMS tasks were constructed and only the best representatives of movement areas entered into the final test product - FMS-POLYGON. The ICC showed high values for all 24 tasks (0.83-0.97) and the factorial analysis revealed the best representatives of each movement area that entered the FMS-POLYGON: tossing and catching the volleyball against a wall, running across obstacles, carrying the medicine balls, and straight running. The ICC for the FMS-POLYGON showed a very high result (0.98) and, therefore, confirmed the test's intra-rater reliability. Concurrent validity was tested with the use of the "Test of Gross Motor Development" (TGMD-2). Correlation analysis between the newly constructed FMS-POLYGON and the TGMD-2 revealed the coefficient of -0.82 which indicates a high correlation. In conclusion, the new test for FMS assessment proved to be a reliable and valid instrument for 8 year old children. Application of this test in schools is justified and could be an important factor in physical education and sport practice.
Key points: All 21 newly constructed tasks demonstrated high intra-rater reliability (0.83-0.97) in FMS assessment. High reliability was also noted in the FMS-POLYGON test (0.98). A high correlation was found between the FMS-POLYGON and TGMD-2, which is a confirmation of the new test's concurrent validity. The research resolved the problem of long and detailed FMS assessment by adding a new dimension using a quick and effective norm-referenced approach while also covering all the most important movement areas. The new and validated test can be of great use primarily in school practice for physical education teachers and FMS experts.
Care Cascade for targeted tuberculosis testing and linkage to Care in Homeless Populations in the United States: a meta-analysis.

PubMed

Parriott, Andrea; Malekinejad, Mohsen; Miller, Amanda P; Marks, Suzanne M; Horvath, Hacsi; Kahn, James G

2018-04-12

Homelessness increases the risk of tuberculosis (TB) disease and latent TB infection (LTBI), but persons experiencing homelessness often lack access to testing and treatment. We assessed the yield of TB testing and linkage to care for programs targeting homeless populations in the United States. We conducted a comprehensive search of peer-reviewed and grey literature, adapting Cochrane systematic review methods. Two reviewers independently assessed study eligibility and abstracted key data on the testing to care cascade: number of persons reached, recruited for testing, tested for LTBI, with valid test results, referred to follow-up care, and initiating care. We used random effects to calculate pooled proportions and 95% confidence intervals (CI) of persons retained in each step via inverse-variance weighted meta-analysis, and cumulative proportions as products of adjacent step proportions. We identified 23 studies published between 1986 and 2014, conducted in 12 states and 15 cities. Among studies using tuberculin skin tests (TST) we found that 93.7% (CI 72.4-100%) of persons reached were recruited, 97.9% (89.3-100%) of those recruited had tests placed, 85.5% (78.6-91.3%) of those with tests placed returned for reading, 99.9% (99.6-100%) of those with tests read had valid results, and 24.7% (21.0-28.5%) with valid results tested positive. All persons testing positive were referred to follow-up care, and 99.8% attended at least one session of follow-up care. Heterogeneity was high for most pooled proportions. For a hypothetical cohort of 1000 persons experiencing homelessness reached by a targeted testing program using TST, an estimated 917 were tested, 194 were positive, and all of these initiated follow-up care. Targeted TB testing of persons experiencing homelessness appears effective in detecting LTBI and connecting persons to care and potential treatment. 
Future evaluations should assess diagnostic use of interferon gamma release assays and completion of treatment, and costs of testing and treatment.
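The hypothetical-cohort arithmetic in the abstract above can be checked with a short sketch that multiplies the reported step proportions in sequence. The function name `cascade` and the step labels are illustrative assumptions; the proportions are the pooled point estimates quoted in the abstract, and the random-effects pooling that produced them is not reproduced here.

```python
# Step-to-step retention proportions quoted in the abstract (TST-based studies).
STEPS = [
    ("recruited", 0.937),
    ("tests placed", 0.979),
    ("returned for reading", 0.855),
    ("valid result", 0.999),
    ("tested positive", 0.247),
]


def cascade(cohort_size, steps):
    """Return the expected number of persons remaining after each step.

    Each cumulative count is the product of the cohort size and all
    step proportions up to and including that step.
    """
    remaining = float(cohort_size)
    counts = {}
    for name, proportion in steps:
        remaining *= proportion
        counts[name] = remaining
    return counts


counts = cascade(1000, STEPS)
for name, expected in counts.items():
    print(f"{name}: {expected:.0f}")
```

Rounded to whole persons, the "tests placed" and "tested positive" steps come out at 917 and 194, matching the figures given for the 1000-person cohort.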
Full-Scaled Advanced Systems Testbed: Ensuring Success of Adaptive Control Research Through Project Lifecycle Risk Mitigation

NASA Technical Reports Server (NTRS)

Pavlock, Kate M.

2011-01-01

The National Aeronautics and Space Administration's Dryden Flight Research Center completed flight testing of adaptive controls research on the Full-Scale Advanced Systems Testbed (FAST) in January of 2011. The research addressed technical challenges involved with reducing risk in an increasingly complex and dynamic national airspace. Specific challenges lie with the development of validated, multidisciplinary, integrated aircraft control design tools and techniques to enable safe flight in the presence of adverse conditions such as structural damage, control surface failures, or aerodynamic upsets. The testbed is an F-18 aircraft serving as a full-scale vehicle to test and validate adaptive flight control research and lends significant confidence to the development, maturation, and acceptance process of incorporating adaptive control laws into follow-on research and the operational environment. The experimental systems integrated into FAST were designed to allow for flexible yet safe flight test evaluation and validation of modern adaptive control technologies and revolve around two major hardware upgrades: the modification of Production Support Flight Control Computers (PSFCC) and integration of two, fourth-generation Airborne Research Test Systems (ARTS). Post-hardware integration verification and validation provided the foundation for safe flight test of Nonlinear Dynamic Inversion and Model Reference Aircraft Control adaptive control law experiments. To ensure success of flight in terms of cost, schedule, and test results, emphasis on risk management was incorporated into early stages of design and flight test planning and continued through the execution of each flight test mission. Specific consideration was made to incorporate safety features within the hardware and software to alleviate user demands as well as into test processes and training to reduce human factor impacts to safe and successful flight test. 
This paper describes the research configuration, experiment functionality, overall risk mitigation, flight test approach and results, and lessons learned of adaptive controls research of the Full-Scale Advanced Systems Testbed.
Selective testing strategies for diagnosing group A streptococcal infection in children with pharyngitis: a systematic review and prospective multicentre external validation study

PubMed Central

Cohen, Jérémie F.; Cohen, Robert; Levy, Corinne; Thollot, Franck; Benani, Mohamed; Bidet, Philippe; Chalumeau, Martin

2015-01-01

Background: Several clinical prediction rules for diagnosing group A streptococcal infection in children with pharyngitis are available. We aimed to compare the diagnostic accuracy of rules-based selective testing strategies in a prospective cohort of children with pharyngitis. Methods: We identified clinical prediction rules through a systematic search of MEDLINE and Embase (1975–2014), which we then validated in a prospective cohort involving French children who presented with pharyngitis during a 1-year period (2010–2011). We diagnosed infection with group A streptococcus using two throat swabs: one obtained for a rapid antigen detection test (StreptAtest, Dectrapharm) and one obtained for culture (reference standard). We validated rules-based selective testing strategies as follows: low risk of group A streptococcal infection, no further testing or antibiotic therapy needed; intermediate risk of infection, rapid antigen detection for all patients and antibiotic therapy for those with a positive test result; and high risk of infection, empiric antibiotic treatment. Results: We identified 8 clinical prediction rules, 6 of which could be prospectively validated. Sensitivity and specificity of rules-based selective testing strategies ranged from 66% (95% confidence interval [CI] 61–72) to 94% (95% CI 92–97) and from 40% (95% CI 35–45) to 88% (95% CI 85–91), respectively. Use of rapid antigen detection testing following the clinical prediction rule ranged from 24% (95% CI 21–27) to 86% (95% CI 84–89). None of the rules-based selective testing strategies achieved our diagnostic accuracy target (sensitivity and specificity > 85%). Interpretation: Rules-based selective testing strategies did not show sufficient diagnostic accuracy in this study population. The relevance of clinical prediction rules for determining which children with pharyngitis should undergo a rapid antigen detection test remains questionable. PMID:25487666
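The accuracy measures and the target used in the abstract above (sensitivity and specificity both above 85%) can be written as a small sketch. The counts below are hypothetical and are not taken from the study; the function names `diagnostic_accuracy` and `meets_target` are likewise illustrative.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: reference-standard positives classified positive/negative.
    tn/fp: reference-standard negatives classified negative/positive.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def meets_target(sensitivity, specificity, target=0.85):
    """True only if both measures exceed the target."""
    return sensitivity > target and specificity > target


# Hypothetical counts for one selective testing strategy (not study data).
sens, spec = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=70)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, "
      f"meets target: {meets_target(sens, spec)}")
```

A strategy can be highly sensitive and still miss the target when its specificity is low, which is the pattern the reported ranges suggest for several of the rules.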
[Validity evidence of the Health-Related Quality of Life for Drug Abusers Test based on the Biaxial Model of Addiction].

PubMed

Lozano, Oscar M; Rojas, Antonio J; Pérez, Cristino; González-Sáiz, Francisco; Ballesta, Rosario; Izaskun, Bilbao

2008-05-01

The aim of this work is to show evidence of the validity of the Health-Related Quality of Life for Drug Abusers Test (HRQoLDA Test). This test was developed to measure specific HRQoL for drug abusers, within the theoretical addiction framework of the biaxial model. The sample comprised 138 patients diagnosed with opiate drug dependence. In this study, the following constructs and variables of the biaxial model were measured: severity of dependence, physical health status, psychological adjustment and substance consumption. Results indicate that the HRQoLDA Test scores are related to dependency and consumption-related problems. Multiple regression analysis reveals that HRQoL can be predicted from drug dependence, physical health status and psychological adjustment. These results contribute empirical evidence of the theoretical relationships established between HRQoL and the biaxial model, and they support the interpretation of the HRQoLDA Test to measure HRQoL in drug abusers, thus providing a test to measure this specific construct in this population.
Validation of alternative methods for toxicity testing.

PubMed Central

Bruner, L H; Carr, G J; Curren, R D; Chamberlain, M

1998-01-01

Before nonanimal toxicity tests may be officially accepted by regulatory agencies, it is generally agreed that the validity of the new methods must be demonstrated in an independent, scientifically sound validation program. Validation has been defined as the demonstration of the reliability and relevance of a test method for a particular purpose. This paper provides a brief review of the development of the theoretical aspects of the validation process and updates current thinking about objectively testing the performance of an alternative method in a validation study. Validation of alternative methods for eye irritation testing is a specific example illustrating important concepts. Although discussion focuses on the validation of alternative methods intended to replace current in vivo toxicity tests, the procedures can be used to assess the performance of alternative methods intended for other uses. PMID:9599695
Optical Closed-Loop Propulsion Control System Development

NASA Technical Reports Server (NTRS)

Poppel, Gary L.

1998-01-01

The overall objective of this program was to design and fabricate the components required for optical closed-loop control of an F404-400 turbofan engine, by building on the experience of the NASA Fiber Optic Control System Integration (FOCSI) program. Evaluating the performance of fiber optic technology at the component and system levels will help to validate its use on aircraft engines. This report includes descriptions of three test plans. The EOI Acceptance Test is designed to demonstrate satisfactory functionality of the EOI, primarily fail-safe throughput of the F404 sensor signals in the normal mode, and validation, switching, and output of the five analog sensor signals as generated from validated optical sensor inputs, in the optical mode. The EOI System Test is designed to demonstrate acceptable F404 ECU functionality as interfaced with the EOI, making use of a production ECU test stand. The Optical Control Engine Test Request describes planned hardware installation, optical signal calibrations, data system coordination, test procedures, and data signal comparisons for an engine test demonstration of the optical closed-loop control.
A practical method to test the validity of the standard Gumbel distribution in logit-based multinomial choice models of travel behavior

DOE PAGES

Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...

2017-11-08

Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
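Two ingredients mentioned in the abstract above can be sketched briefly: the closed-form choice probabilities that follow from independent standard Gumbel error terms, and the likelihood ratio statistic used to compare a restricted model with a model that nests it. The function names, utilities, and log-likelihood values are hypothetical; the semi-nonparametric nesting likelihood derived in the paper is not reproduced here.

```python
import math


def mnl_probabilities(utilities):
    """Closed-form MNL choice probabilities: P_i = exp(V_i) / sum_j exp(V_j).

    This form holds when the error terms are i.i.d. standard Gumbel.
    The maximum utility is subtracted first for numerical stability.
    """
    largest = max(utilities)
    exps = [math.exp(v - largest) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def likelihood_ratio_statistic(loglik_restricted, loglik_unrestricted):
    """LR = 2 * (LL_unrestricted - LL_restricted).

    Under the null, LR is asymptotically chi-square distributed with
    degrees of freedom equal to the number of extra parameters in the
    unrestricted (nesting) model.
    """
    return 2.0 * (loglik_unrestricted - loglik_restricted)


# Hypothetical systematic utilities for three travel alternatives.
probs = mnl_probabilities([1.0, 0.5, 0.0])
print([round(p, 3) for p in probs])

# Hypothetical fitted log-likelihoods for the MNL and the nesting model.
lr = likelihood_ratio_statistic(-1050.0, -1042.5)
print(f"LR statistic: {lr}")
```

A large LR statistic relative to the chi-square critical value would lead to rejecting the restricted model, which in the paper's setting corresponds to rejecting the standard Gumbel assumption.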
A practical method to test the validity of the standard Gumbel distribution in logit-based multinomial choice models of travel behavior

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ye, Xin; Garikapati, Venu M.; You, Daehyun

Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
A Systematic Review of the Reliability and Validity of Behavioural Tests Used to Assess Behavioural Characteristics Important in Working Dogs.

PubMed

Brady, Karen; Cracknell, Nina; Zulch, Helen; Mills, Daniel Simon

2018-01-01

Working dogs are selected based on predictions from tests that they will be able to perform specific tasks in often challenging environments. However, withdrawal from service in working dogs is still a big problem, bringing into question the reliability of the selection tests used to make these predictions. A systematic review was undertaken aimed at bringing together available information on the reliability and predictive validity of the assessment of behavioural characteristics used with working dogs to establish the quality of selection tests currently available for use to predict success in working dogs. The search procedures resulted in 16 papers meeting the criteria for inclusion. A large range of behaviour tests and parameters were used in the identified papers, and so behaviour tests and their underpinning constructs were grouped on the basis of their relationship with positive core affect (willingness to work, human-directed social behaviour, object-directed play tendencies) and negative core affect (human-directed aggression, approach withdrawal tendencies, sensitivity to aversives). We then examined the papers for reports of inter-rater reliability, within-session intra-rater reliability, test-retest validity and predictive validity. The review revealed a widespread lack of information relating to the reliability and validity of measures to assess behaviour and inconsistencies in terminologies, study parameters and indices of success. There is a need to standardise the reporting of these aspects of behavioural tests in order to improve the knowledge base of what characteristics are predictive of optimal performance in working dog roles, improving selection processes and reducing working dog redundancy. We suggest the use of a framework based on explaining the direct or indirect relationship of the test with core affect.
Development and validation of parenting measures for body image and eating patterns in childhood.

PubMed

Damiano, Stephanie R; Hart, Laura M; Paxton, Susan J

2015-01-01

Evidence-based parenting interventions are important in assisting parents to help their children develop healthy body image and eating patterns. To adequately assess the impact of parenting interventions, valid parent measures are required. The aim of this study was to develop and assess the validity and reliability of two new parent measures, the Parenting Intentions for Body image and Eating patterns in Childhood (Parenting Intentions BEC) and the Knowledge Test for Body image and Eating patterns in Childhood (Knowledge Test BEC). Participants were 27 professionals working in research or clinical treatment of body dissatisfaction or eating disorders, and 75 parents of children aged 2-6 years, who completed the measures via an online questionnaire. Seven scenarios were developed for the Parenting Intentions BEC to describe common experiences about the body and food that parents might need to respond to in front of their child. Parents ranked four behavioural intentions, derived from the current literature on parenting risk factors for body dissatisfaction and unhealthy eating patterns in children. Two subscales were created, one representing positive behavioural intentions, the other negative behavioural intentions. After piloting a larger pool of items, 13 statements were used to construct the Knowledge Test BEC. These were designed to be factual statements about the influence of parent language, media, family meals, healthy eating, and self-esteem on child eating and body image. The validity of both measures was tested by comparing parent and professional scores, and reliability was assessed by comparing parent scores over two testing occasions. Compared with parents, professionals reported significantly higher scores on the Positive Intentions subscale and significantly lower on the Negative Intentions subscale of the Parenting Intentions BEC; confirming the discriminant validity of six out of the seven scenarios. 
Test-retest reliability was also confirmed as parent scores on the two Parenting Intentions subscales did not differ over time. Eleven out of the 13 Knowledge Test items demonstrated sufficient discriminant validity and test-retest reliability. Overall, results indicated that the six-scenario Parenting Intentions BEC and the 11-item Knowledge Test BEC are valid and reliable measures for parents of young children.
Official Position of the American Academy of Clinical Neuropsychology Social Security Administration Policy on Validity Testing: Guidance and Recommendations for Change.

PubMed

Chafetz, M D; Williams, M A; Ben-Porath, Y S; Bianchini, K J; Boone, K B; Kirkwood, M W; Larrabee, G J; Ord, J S

2015-01-01

The milestone publication by Slick, Sherman, and Iverson (1999) of criteria for determining malingered neurocognitive dysfunction led to extensive research on validity testing. Position statements by the National Academy of Neuropsychology and the American Academy of Clinical Neuropsychology (AACN) recommended routine validity testing in neuropsychological evaluations. Despite this widespread scientific and professional support, the Social Security Administration (SSA) continued to discourage validity testing, a stance that led to a congressional initiative for SSA to reevaluate their position. In response, SSA commissioned the Institute of Medicine (IOM) to evaluate the science concerning the validation of psychological testing. The IOM concluded that validity assessment was necessary in psychological and neuropsychological examinations (IOM, 2015). The AACN sought to provide independent expert guidance and recommendations concerning the use of validity testing in disability determinations. A panel of contributors to the science of validity testing and its application to the disability process was charged with describing why the disability process for SSA needs improvement, and indicating the necessity for validity testing in disability exams. This work showed how the determination of malingering is a probability proposition, described how different types of validity tests are appropriate, provided evidence concerning non-credible findings in children and low-functioning individuals, and discussed the appropriate evaluation of pain disorders typically seen outside of mental consultations. A scientific plan for validity assessment that additionally protects test security is needed in disability determinations and in research on classification accuracy of disability decisions.
Results of Mechanical Testing for Pyroceram(tm) Glass-Ceramic

NASA Technical Reports Server (NTRS)

Choi, Sung R.; Gyekenyesi, John P.

2003-01-01

Mechanical testing of Pyroceram (trademark) 9606 glass-ceramic fabricated by Corning was conducted to determine the mechanical properties of the material, including slow crack growth. Valid testing was not achieved in tension, compression, and shear testing because of the inappropriate test specimen configurations provided and, primarily, the existence of a fortified layer (in tension).
Model-Based Verification and Validation of Spacecraft Avionics

NASA Technical Reports Server (NTRS)

Khan, M. Omair; Sievers, Michael; Standley, Shaun

2012-01-01

Verification and Validation (V&V) at JPL is traditionally performed on flight or flight-like hardware running flight software. For some time, the complexity of avionics has increased exponentially while the time allocated for system integration and associated V&V testing has remained fixed. There is an increasing need to perform comprehensive system-level V&V using modeling and simulation, and to use scarce hardware testing time to validate models, which has been the norm for thermal and structural V&V for some time. Our approach extends model-based V&V to electronics and software through functional and structural models implemented in SysML. We develop component models of electronics and software that are validated by comparison with test results from actual equipment. The models are then simulated, enabling a more complete set of test cases than is possible on flight hardware. SysML simulations provide access to and control of internal nodes that may not be available in physical systems. This is particularly helpful in testing fault protection behaviors when injecting faults is either not possible or potentially damaging to the hardware. We can also model both hardware and software behaviors in SysML, which allows us to simulate hardware and software interactions. With an integrated model and simulation capability we can evaluate the hardware and software interactions and identify problems sooner. The primary missing piece is validating SysML model correctness against hardware; this experiment demonstrated that such an approach is possible.
Identifying and classifying hyperostosis frontalis interna via computerized tomography.

PubMed

May, Hila; Peled, Nathan; Dar, Gali; Hay, Ori; Abbas, Janan; Masharawi, Youssef; Hershkovitz, Israel

2010-12-01

The aim of this study was to recognize the radiological characteristics of hyperostosis frontalis interna (HFI) and to establish a valid and reliable method for its identification and classification. A reliability test was carried out on 27 individuals who had undergone a head computerized tomography (CT) scan. Intra-observer reliability was obtained by examining the images three times, by the same researcher, with a 2-week interval between each sample ranking. The inter-observer test was performed by three independent researchers. A validity test was carried out using two methods for identifying and classifying HFI: 46 cadaver skullcaps were ranked twice via computerized tomography scans and then by direct observation. Reliability and validity were calculated using Kappa test (SPSS 15.0). Reliability tests of ranking HFI via CT scans demonstrated good results (K > 0.7). As for validity, a very good consensus was obtained between the CT and direct observation, when moderate and advanced types of HFI were present (K = 0.82). The suggested classification method for HFI, using CT, demonstrated a sensitivity of 84%, specificity of 90.5%, and positive predictive value of 91.3%. In conclusion, volume rendering is a reliable and valid tool for identifying HFI. The suggested three-scale classification is most suitable for radiological diagnosis of the phenomena. Considering the increasing awareness of HFI as an early indicator of a developing malady, this study may assist radiologists in identifying and classifying the phenomena.
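The agreement and accuracy statistics named in this abstract can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa for two raters (the abstract does not say which kappa variant SPSS was asked to compute) and a standard 2x2 table for sensitivity, specificity, and positive predictive value. The function names and all sample rankings and counts are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per label.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, and positive predictive value from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Hypothetical rankings (0 = absent, 1 = moderate, 2 = advanced) of the same
# skullcaps by CT scan and by direct observation.
ct_ranks = [0, 0, 1, 1, 2, 2, 0, 1, 2, 2]
direct_ranks = [0, 0, 1, 2, 2, 2, 0, 1, 2, 1]
print(round(cohen_kappa(ct_ranks, direct_ranks), 3))

# Hypothetical 2x2 counts with direct observation as the reference standard.
print(diagnostic_accuracy(tp=8, fp=2, fn=2, tn=8))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over simple percent agreement for ranking tasks like this one.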
Italian validation of the Purpose In Life (PIL) test and the Seeking Of Noetic Goals (SONG) test in a population of cancer patients.

PubMed

Brunelli, C; Bianchi, E; Murru, L; Monformoso, P; Bosisio, M; Gangeri, L; Miccinesi, G; Scrignaro, M; Ripamonti, C; Borreani, C

2012-11-01

The first instruments developed to evaluate specific logotherapeutic dimensions were the Purpose In Life (PIL) and the Seeking Of Noetic Goals (SONG) tests, designed to reflect Frankl's concepts of, respectively, meaning in life attainment and will to meaning. This study aims to perform the Italian cultural adaptation and the psychometric validation of the PIL and SONG questionnaires. We administered the PIL and SONG, culturally adapted into the Italian language, to 266 cancer patients. The psychometric validation appraised construct validity, internal consistency, test-retest reliability, known-group validity, and convergent validity of the two questionnaires with respect to one another. The factorial analysis indicates that the original single-factor solution can be maintained for both instruments (proportion of variance explained by the first factor 77% and 71% for the PIL and SONG, respectively). The results show excellent internal consistency (Cronbach's alpha of 0.91 for the PIL and 0.90 for the SONG) and test-retest reliability (intraclass correlation coefficient of 0.92 for the PIL and 0.81 for the SONG). As expected, males, believers, patients nearer to the diagnosis, and patients not undergoing psychological therapy have higher PIL and lower SONG scores, while expectations for age were not confirmed. The average level for the PIL was 107.3, while for the SONG, it was 66.1, and a negative correlation (-0.47) between PIL and SONG scores indicates good convergent validity of the two instruments. Italian versions of the PIL and SONG are adequate and reliable self-report instruments for evaluating purpose in life and the motivation to find purpose for cancer patient populations.
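Internal consistency of the kind reported here is usually summarized with Cronbach's alpha. The sketch below applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to a small respondents-by-items matrix. The function name and the sample responses are hypothetical and are not taken from the PIL or SONG data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five respondents to a four-item scale (1-7 ratings).
sample = [
    [6, 5, 6, 7],
    [3, 3, 4, 3],
    [5, 5, 5, 6],
    [2, 3, 2, 2],
    [7, 6, 6, 7],
]
print(round(cronbach_alpha(sample), 3))
```

Values near 1 indicate that the items vary together, which is consistent with the single-factor structure the abstract describes.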
Development of an Itemwise Efficiency Scoring Method: Concurrent, Convergent, Discriminant, and Neuroimaging-Based Predictive Validity Assessed in a Large Community Sample

PubMed Central

Moore, Tyler M.; Reise, Steven P.; Roalf, David R.; Satterthwaite, Theodore D.; Davatzikos, Christos; Bilker, Warren B.; Port, Allison M.; Jackson, Chad T.; Ruparel, Kosha; Savitt, Adam P.; Baron, Robert B.; Gur, Raquel E.; Gur, Ruben C.

2016-01-01

Traditional “paper-and-pencil” testing is imprecise in measuring speed and hence limited in assessing performance efficiency, but computerized testing permits precision in measuring itemwise response time. We present a method of scoring performance efficiency (combining information from accuracy and speed) at the item level. Using a community sample of 9,498 youths age 8-21, we calculated item-level efficiency scores on four neurocognitive tests, and compared the concurrent, convergent, discriminant, and predictive validity of these scores to simple averaging of standardized speed and accuracy-summed scores. Concurrent validity was measured by the scores' abilities to distinguish men from women and their correlations with age; convergent and discriminant validity were measured by correlations with other scores inside and outside of their neurocognitive domains; predictive validity was measured by correlations with brain volume in regions associated with the specific neurocognitive abilities. Results provide support for the ability of itemwise efficiency scoring to detect signals as strong as those detected by standard efficiency scoring methods. We find no evidence of superior validity of the itemwise scores over traditional scores, but point out several advantages of the former. The itemwise efficiency scoring method shows promise as an alternative to standard efficiency scoring methods, with overall moderate support from tests of four different types of validity. This method allows the use of existing item analysis methods and provides the convenient ability to adjust the overall emphasis of accuracy versus speed in the efficiency score, thus adjusting the scoring to the real-world demands the test is aiming to fulfill. PMID:26866796
Measurement Properties of the Modified Spinal Function Sort (M-SFS): Is It Reliable and Valid in Workers with Chronic Musculoskeletal Pain?

PubMed

Trippolini, Maurizio Alen; Janssen, Svenja; Hilfiker, Roger; Oesch, Peter

2018-06-01

Purpose: To analyze the reliability and validity of a picture-based questionnaire, the Modified Spinal Function Sort (M-SFS). Methods: Sixty-two injured workers with chronic musculoskeletal disorders (MSD) were recruited from two work rehabilitation centers. Internal consistency was assessed by Cronbach's alpha. Construct validity was tested based on four a priori hypotheses. Structural validity was measured with principal component analysis (PCA). Test-retest reliability and agreement were evaluated using the intraclass correlation coefficient (ICC), and measurement error was evaluated with the limits of agreement (LoA). Results: Total score of the M-SFS was 54.4 (SD 16.4) and 56.1 (16.4) for test and retest, respectively. Item distribution showed no ceiling effects. Cronbach's alpha was 0.94 and 0.95 for test and retest, respectively. PCA showed the presence of four components explaining a total of 74% of the variance. Item communalities were >0.6 in 17 out of 20 items. ICC was 0.90, LoA was ±12.6/16.2 points. The correlations of the M-SFS were 0.89 with the original SFS, 0.49 with the Pain Disability Index, -0.37 and -0.33 with the Numeric Rating Scale for actual pain, -0.52 for self-reported disability due to chronic low back pain, and 0.50, 0.56-0.59 with three distinct lifting tests. No a priori defined hypothesis for construct validity was rejected. Conclusions: The M-SFS allows reliable and valid assessment of perceived self-efficacy for work-related tasks and can be recommended for use in patients with chronic MSD. Further research should investigate the proposed M-SFS score of <56 for its predictive validity for non-return to work.
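Measurement error expressed as limits of agreement can be sketched as follows. This assumes the common Bland-Altman form, mean difference ± 1.96 standard deviations of the paired test-retest differences; the abstract does not state how its LoA values were derived, so this is an illustration of the general technique rather than the authors' calculation. The function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest, z=1.96):
    """Return (lower, bias, upper) Bland-Altman limits for paired scores."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need at least two paired observations")
    # Per-participant change between the two administrations.
    diffs = [second - first for first, second in zip(test, retest)]
    bias = mean(diffs)            # systematic shift between sessions
    spread = z * stdev(diffs)     # half-width of the agreement interval
    return bias - spread, bias, bias + spread


# Hypothetical questionnaire totals for six participants on two occasions.
first_session = [50, 62, 71, 38, 55, 66]
second_session = [53, 60, 75, 41, 54, 70]
lower, bias, upper = limits_of_agreement(first_session, second_session)
print(round(lower, 1), round(bias, 1), round(upper, 1))
```

A retest change that falls outside the interval from `lower` to `upper` is larger than what measurement error alone would usually produce.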
Analytic Validation of Immunohistochemistry Assays: New Benchmark Data From a Survey of 1085 Laboratories.

PubMed

Stuart, Lauren N; Volmar, Keith E; Nowak, Jan A; Fatheree, Lisa A; Souers, Rhona J; Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Astles, J Rex; Nakhleh, Raouf E

2017-09-01

Context: A cooperative agreement between the College of American Pathologists (CAP) and the United States Centers for Disease Control and Prevention was undertaken to measure laboratories' awareness and implementation of an evidence-based laboratory practice guideline (LPG) on immunohistochemical (IHC) validation practices published in 2014. Objective: To establish new benchmark data on IHC laboratory practices. Design: A 2015 survey on IHC assay validation practices was sent to laboratories subscribed to specific CAP proficiency testing programs and to additional nonsubscribing laboratories that perform IHC testing. Specific questions were designed to capture laboratory practices not addressed in a 2010 survey. Results: The analysis was based on responses from 1085 laboratories that perform IHC staining. Ninety-six percent (809 of 844) always documented validation of IHC assays. Sixty percent (648 of 1078) had separate procedures for predictive and nonpredictive markers, 42.7% (220 of 515) had procedures for laboratory-developed tests, 50% (349 of 697) had procedures for testing cytologic specimens, and 46.2% (363 of 785) had procedures for testing decalcified specimens. Minimum case numbers were specified by 85.9% (720 of 838) of laboratories for nonpredictive markers and 76% (584 of 768) for predictive markers. Median concordance requirements were 95% for both types. For initial validation, 75.4% (538 of 714) of laboratories adopted the 20-case minimum for nonpredictive markers and 45.9% (266 of 579) adopted the 40-case minimum for predictive markers as outlined in the 2014 LPG. The most common method for validation was correlation with morphology and expected results. Laboratories also reported which assay changes necessitated revalidation and their minimum case requirements. Conclusions: Benchmark data on current IHC validation practices and procedures may help laboratories understand the issues and influence further refinement of LPG recommendations.
