Nelson, Jennifer C.; Marsh, Tracey; Lumley, Thomas; Larson, Eric B.; Jackson, Lisa A.; Jackson, Michael
2014-01-01
Objective Estimates of treatment effectiveness in epidemiologic studies using large observational health care databases may be biased due to inaccurate or incomplete information on important confounders. Study methods that collect and incorporate more comprehensive confounder data on a validation cohort may reduce confounding bias. Study Design and Setting We applied two such methods, imputation and reweighting, to Group Health administrative data (full sample) supplemented by more detailed confounder data from the Adult Changes in Thought study (validation sample). We used influenza vaccination effectiveness (with an unexposed comparator group) as an example and evaluated each method’s ability to reduce bias using the control time period prior to influenza circulation. Results Both methods reduced, but did not completely eliminate, the bias compared with traditional effectiveness estimates that do not utilize the validation sample confounders. Conclusion Although these results support the use of validation sampling methods to improve the accuracy of comparative effectiveness findings from healthcare database studies, they also illustrate that the success of such methods depends on many factors, including the ability to measure important confounders in a representative and large enough validation sample, the comparability of the full sample and validation sample, and the accuracy with which data can be imputed or reweighted using the additional validation sample information. PMID:23849144
ERIC Educational Resources Information Center
St. Clair, Travis; Hallberg, Kelly; Cook, Thomas D.
2016-01-01
We explore the conditions under which short, comparative interrupted time-series (CITS) designs represent valid alternatives to randomized experiments in educational evaluations. To do so, we conduct three within-study comparisons, each of which uses a unique data set to test the validity of the CITS design by comparing its causal estimates to…
Nelson, Jennifer Clark; Marsh, Tracey; Lumley, Thomas; Larson, Eric B; Jackson, Lisa A; Jackson, Michael L
2013-08-01
Estimates of treatment effectiveness in epidemiologic studies using large observational health care databases may be biased owing to inaccurate or incomplete information on important confounders. Study methods that collect and incorporate more comprehensive confounder data on a validation cohort may reduce confounding bias. We applied two such methods, namely imputation and reweighting, to Group Health administrative data (full sample) supplemented by more detailed confounder data from the Adult Changes in Thought study (validation sample). We used influenza vaccination effectiveness (with an unexposed comparator group) as an example and evaluated each method's ability to reduce bias using the control time period before influenza circulation. Both methods reduced, but did not completely eliminate, the bias compared with traditional effectiveness estimates that do not use the validation sample confounders. Although these results support the use of validation sampling methods to improve the accuracy of comparative effectiveness findings from health care database studies, they also illustrate that the success of such methods depends on many factors, including the ability to measure important confounders in a representative and large enough validation sample, the comparability of the full sample and validation sample, and the accuracy with which the data can be imputed or reweighted using the additional validation sample information. Copyright © 2013 Elsevier Inc. All rights reserved.
Comparing current definitions of return to work: a measurement approach.
Steenstra, I A; Lee, H; de Vroome, E M M; Busse, J W; Hogg-Johnson, S J
2012-09-01
Return-to-work (RTW) status is an often-used outcome in work and health research. In low back pain, work is regarded as a normal activity a worker should return to in order to fully recover. Comparing outcomes across studies and even jurisdictions using different definitions of RTW can be challenging for readers in general and when performing a systematic review in particular. In this study, the measurement properties of previously defined RTW outcomes were examined with data from two studies from two countries. Data on RTW in low back pain (LBP) from the Canadian Early Claimant Cohort (ECC), a workers' compensation-based study, and the Dutch Amsterdam Sherbrooke Evaluation (ASE) study were analyzed. Correlations between outcomes, differences in predictive validity when using different outcomes, and construct validity when comparing outcomes to a functional status outcome were analyzed. In the ECC all definitions were highly correlated and performed similarly in predictive validity. When compared to functional status, RTW definitions in the ECC study performed fair to good on all time points. In the ASE study all definitions were highly correlated and performed similarly in predictive validity. The RTW definitions, however, failed to compare or compared poorly with functional status. Only one definition compared fairly on one time point. Differently defined outcomes are highly correlated and give similar results in prediction, but seem to differ in construct validity when compared to functional status, depending on societal context or possibly birth cohort. Comparison of studies using different RTW definitions appears valid as long as RTW status is not considered as a measure of functional status.
2016-08-15
HLA ISSN 2059-2302 A comparative reference study for the validation of HLA-matching algorithms in the search for allogeneic hematopoietic stem cell...from different international donor registries by challenging them with simulated input data and subsequently comparing the output. This experiment...original work is properly cited, the use is non-commercial and no modifications or adaptations are made. Comparative reference validation of HLA
Overby, Nina Cecilie; Johannesen, Elisabeth; Jensen, Grete; Skjaevesland, Anne-Kirsti; Haugen, Margaretha
2014-01-01
The assessment of food intake is challenging and prone to errors; it is therefore important to consider the reliability and validity of the assessment methods. The aim of this study was to analyze the reproducibility and validity of a developed food-frequency questionnaire (FFQ) for use among adolescents. In total, 58 students (aged 13-14) from four different schools in the southern part of Norway participated in the reproducibility study of filling out the FFQ 4 weeks apart. In addition, 93 students participated in the relative validity study where the FFQ was compared to 2×24-hour dietary recalls, while 92 students participated in the absolute validity study where the intakes of fatty acids and vitamin D from the FFQ were compared to fatty acids and 25-hydroxy-vitamin D3 in whole blood. The median Spearman correlation coefficient for all nutrients in the test-retest reliability study was 0.57. The median Spearman correlation for all nutrients in the relative validity study was 0.26, while the correlation coefficients were low in the absolute validity study, with n-3 fatty acid coefficients ranging from 0.05 to 0.25, and absent for vitamin D (r=0.000). The test-retest reproducibility was considered good, the relative validity was considered poor to good, and the absolute validity was considered poor. However, the results are comparable to other studies among adolescents.
Øverby, Nina Cecilie; Johannesen, Elisabeth; Jensen, Grete; Skjaevesland, Anne-Kirsti; Haugen, Margaretha
2014-01-01
Background The assessment of food intake is challenging and prone to errors; it is therefore important to consider the reliability and validity of the assessment methods. Objective The aim of this study was to analyze the reproducibility and validity of a developed food-frequency questionnaire (FFQ) for use among adolescents. Design In total, 58 students (aged 13–14) from four different schools in the southern part of Norway participated in the reproducibility study of filling out the FFQ 4 weeks apart. In addition, 93 students participated in the relative validity study where the FFQ was compared to 2×24-hour dietary recalls, while 92 students participated in the absolute validity study where the intakes of fatty acids and vitamin D from the FFQ were compared to fatty acids and 25-hydroxy-vitamin D3 in whole blood. Results The median Spearman correlation coefficient for all nutrients in the test–retest reliability study was 0.57. The median Spearman correlation for all nutrients in the relative validity study was 0.26, while the correlation coefficients were low in the absolute validity study, with n-3 fatty acid coefficients ranging from 0.05 to 0.25, and absent for vitamin D (r=0.000). Conclusion The test–retest reproducibility was considered good, the relative validity was considered poor to good, and the absolute validity was considered poor. However, the results are comparable to other studies among adolescents. PMID:25371661
Matsuzaki, Mika; Sullivan, Ruth; Ekelund, Ulf; Krishna, K V Radha; Kulkarni, Bharati; Collier, Tim; Ben-Shlomo, Yoav; Kinra, Sanjay; Kuper, Hannah
2016-01-19
There is limited availability of context-specific physical activity questionnaires in low- and middle-income countries. The aim of this study was to develop and examine the validity of a new Indian physical activity questionnaire, the Andhra Pradesh Children and Parent Study Physical Activity Questionnaire (APCAPS-PAQ). The current study was conducted with the cohort from the Hyderabad DXA Study (n = 2321), recruited in 2009-2010. Criterion validity (n = 245) was examined by comparing the APCAPS-PAQ to a combined heart rate and motion sensor worn for 8 days. Construct validity (n = 2321) was assessed with linear regression, comparing APCAPS-PAQ against BMI, percent body fat, and pulse rate. The APCAPS-PAQ criterion validity was variable depending on the PA intensity groups (ρ = 0.26, 0.07, 0.39; κ = 0.14, 0.04, 0.16 for sedentary, light, and moderate/vigorous physical activity (MVPA), respectively). Sedentary and light intensity activities from the questionnaire were underestimated when compared to the criterion data, while MVPA in APCAPS-PAQ was overestimated. Higher time spent in sedentary activity in APCAPS-PAQ was associated with higher BMI and percent body fat, suggesting construct validity. The APCAPS-PAQ validity is comparable to other physical activity questionnaires. This tool is able to assess sedentary behavior, moderate/vigorous activity and physical activity energy expenditure on a group level with reasonable validity. This new questionnaire may be used for ranking individuals according to their sedentary time and physical activity in southern India.
Ban, Jong-Wook; Emparanza, José Ignacio; Urreta, Iratxe; Burls, Amanda
2016-01-01
Background Many new clinical prediction rules are derived and validated, but the design and reporting quality of clinical prediction research has been less than optimal. We aimed to assess whether design characteristics of validation studies were associated with the overestimation of clinical prediction rules’ performance. We also aimed to evaluate whether validation studies clearly reported important methodological characteristics. Methods Electronic databases were searched for systematic reviews of clinical prediction rule studies published between 2006 and 2010. Data were extracted from the eligible validation studies included in the systematic reviews. A meta-analytic meta-epidemiological approach was used to assess the influence of design characteristics on predictive performance. For each validation study, it was assessed whether 7 design and 7 reporting characteristics were properly described. Results A total of 287 validation studies of clinical prediction rules were collected from 15 systematic reviews (31 meta-analyses). Validation studies using a case-control design produced a summary diagnostic odds ratio (DOR) 2.2 times (95% CI: 1.2–4.3) larger than validation studies using a cohort design or an unclear design. When differential verification was used, the summary DOR was overestimated twofold (95% CI: 1.2–3.1) compared to complete, partial and unclear verification. The summary relative DOR (RDOR) of validation studies with inadequate sample size was 1.9 (95% CI: 1.2–3.1) compared to studies with adequate sample size. Study site, reliability, and clinical prediction rule were adequately described in 10.1%, 9.4%, and 7.0% of validation studies, respectively. Conclusion Validation studies with design shortcomings may overestimate the performance of clinical prediction rules. The quality of reporting among studies validating clinical prediction rules needs to be improved. PMID:26730980
Ban, Jong-Wook; Emparanza, José Ignacio; Urreta, Iratxe; Burls, Amanda
2016-01-01
Many new clinical prediction rules are derived and validated, but the design and reporting quality of clinical prediction research has been less than optimal. We aimed to assess whether design characteristics of validation studies were associated with the overestimation of clinical prediction rules' performance. We also aimed to evaluate whether validation studies clearly reported important methodological characteristics. Electronic databases were searched for systematic reviews of clinical prediction rule studies published between 2006 and 2010. Data were extracted from the eligible validation studies included in the systematic reviews. A meta-analytic meta-epidemiological approach was used to assess the influence of design characteristics on predictive performance. For each validation study, it was assessed whether 7 design and 7 reporting characteristics were properly described. A total of 287 validation studies of clinical prediction rules were collected from 15 systematic reviews (31 meta-analyses). Validation studies using a case-control design produced a summary diagnostic odds ratio (DOR) 2.2 times (95% CI: 1.2-4.3) larger than validation studies using a cohort design or an unclear design. When differential verification was used, the summary DOR was overestimated twofold (95% CI: 1.2-3.1) compared to complete, partial and unclear verification. The summary relative DOR (RDOR) of validation studies with inadequate sample size was 1.9 (95% CI: 1.2-3.1) compared to studies with adequate sample size. Study site, reliability, and clinical prediction rule were adequately described in 10.1%, 9.4%, and 7.0% of validation studies, respectively. Validation studies with design shortcomings may overestimate the performance of clinical prediction rules. The quality of reporting among studies validating clinical prediction rules needs to be improved.
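As a reading aid for the DOR and RDOR figures quoted above, the following minimal Python sketch computes a diagnostic odds ratio from a 2×2 table and the ratio of two such DORs. The function names and the example counts are hypothetical and chosen only for illustration; the abstract does not describe how the authors pooled DORs across studies, so this shows the basic definitions only, not their meta-analytic model.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN) for a single 2x2 accuracy table."""
    if fp == 0 or fn == 0:
        # Assumption: no continuity correction is applied in this sketch.
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def relative_dor(dor_with_feature, dor_without_feature):
    """RDOR: ratio of two DORs, e.g. studies with vs. without a design feature."""
    return dor_with_feature / dor_without_feature


# Hypothetical counts, not taken from the review.
dor_case_control = diagnostic_odds_ratio(tp=90, fp=20, fn=10, tn=80)
dor_cohort = diagnostic_odds_ratio(tp=80, fp=20, fn=20, tn=90)
print(dor_case_control, dor_cohort, relative_dor(dor_case_control, dor_cohort))
```

An RDOR above 1 in this sketch means the first group of studies reports better apparent discrimination than the second, which is the direction of the overestimation the abstract describes.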
A Comparative Study of Adolescent Risk Assessment Instruments: Predictive and Incremental Validity
ERIC Educational Resources Information Center
Welsh, Jennifer L.; Schmidt, Fred; McKinnon, Lauren; Chattha, H. K.; Meyers, Joanna R.
2008-01-01
Promising new adolescent risk assessment tools are being incorporated into clinical practice but currently possess limited evidence of predictive validity regarding their individual and/or combined use in risk assessments. The current study compares three structured adolescent risk instruments, Youth Level of Service/Case Management Inventory…
Dynamic Time Warping compared to established methods for validation of musculoskeletal models.
Gaspar, Martin; Welke, Bastian; Seehaus, Frank; Hurschler, Christof; Schwarze, Michael
2017-04-11
By means of multi-body musculoskeletal simulation, important variables such as internal joint forces and moments, which cannot be measured directly, can be estimated. Validation can be performed by qualitative or by quantitative methods. Especially when comparing time-dependent signals, many methods do not perform well and validation is often limited to qualitative approaches. The aim of the present study was to investigate the capabilities of the Dynamic Time Warping (DTW) algorithm for comparing time series, which can quantify phase as well as amplitude errors. We contrast the sensitivity of DTW with other established metrics: the Pearson correlation coefficient, cross-correlation, the metric according to Geers, RMSE and normalized RMSE (nRMSE). This study is based on two data sets, where one data set represents direct validation (DV) and the other represents indirect validation. Direct validation was performed in the context of clinical gait analysis on trans-femoral amputees fitted with a 6-component force-moment sensor. Measured forces and moments from the amputees' socket-prosthesis were compared to simulated forces and moments. Indirect validation was performed in the context of surface EMG measurements on a cohort of healthy subjects, with measurements taken of seven muscles of the leg, which were compared to simulated muscle activations. Regarding direct validation, a positive linear relation between results of RMSE and nRMSE to DTW can be seen. For indirect validation, a negative linear relation exists between Pearson correlation and cross-correlation. We propose the DTW algorithm for use in both direct and indirect quantitative validation as it correlates well with methods that are most suitable for one of the tasks. However, in DV it should be used together with methods resulting in a dimensional error value, in order to be able to interpret results more comprehensibly. Copyright © 2017 Elsevier Ltd. All rights reserved.
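To make the contrast between DTW and RMSE concrete, here is a minimal Python sketch of the textbook dynamic-programming DTW distance alongside RMSE. It is an illustration under stated assumptions: the absolute-difference local cost, the unconstrained warping window, and the toy signals are all choices made here, and the abstract does not say which DTW variant or normalization the authors used.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance with |x - y| local cost and no window constraint."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rmse(a, b):
    """Root-mean-square error between two equally long, sample-aligned signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical toy signals: the "simulated" curve is the "measured" one delayed by one sample.
measured = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
simulated = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
print(dtw_distance(measured, simulated), rmse(measured, simulated))
```

In this toy case the plain DTW distance absorbs the one-sample delay entirely, while RMSE penalizes it as if it were an amplitude error. Quantifying the phase error itself would require inspecting the warping path, which this sketch does not return.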
Montoya, A; Llopis, N; Gilaberte, I
2011-12-01
DISCERN is an instrument designed to help patients assess the reliability of written information on treatment choices. Originally created in English, it has no validated Spanish version. This study seeks to validate the Spanish translation of the DISCERN instrument used as a primary measure in a multicenter study that aimed to assess the reliability of web-based information on treatment choices for attention deficit/hyperactivity disorder (ADHD). We used a modified version of a method for validating translated instruments in which the original source-language version is formally compared with the back-translated source-language version. Each item was ranked in terms of comparability of language, similarity of interpretability, and degree of understandability. Responses used Likert scales ranging from 1 to 7, where 1 indicates the best interpretability, language and understandability, and 7 indicates the worst. Assessments were performed by 20 raters fluent in the source language. The Spanish translation of DISCERN, based on ratings of comparability, interpretability and degree of understandability (mean score (SD): 1.8 (1.1), 1.4 (0.9) and 1.6 (1.1), respectively), was considered extremely comparable. All items received a score of less than three; therefore, no further revision of the translation was needed. The validation process showed that the quality of the DISCERN translation was high, supporting the linguistic comparability of the translated tool for assessing written information on treatment choices for ADHD.
Kastner, Rebecca M; Sellbom, Martin; Lilienfeld, Scott O
2012-03-01
The Psychopathic Personality Inventory (PPI) has shown promising construct validity as a measure of psychopathy. Because of its relative efficiency, a short-form version of the PPI (PPI-SF) was developed and has proven useful in many psychopathy studies. The validity of the PPI-SF, however, has not been thoroughly examined, and no studies have directly compared the validity of the short form with that of the full-length version. The current study was designed to compare the psychometric properties of both PPI versions, with an emphasis on convergent and discriminant validity in predicting external criteria conceptually relevant to psychopathy. We used both prison (n = 558) and college samples (n = 322) for this investigation. PPI scale scores were more reliable and more strongly correlated with the conceptually relevant criterion measures compared with the PPI-SF, particularly in the prison sample. There were no differences in relative discriminant validity. Thus, overall, the PPI full-length version showed more evidence of construct validity than did the short form, and the consequences of this psychometric difference should be considered when evaluating the clinical utility of each measure.
ERIC Educational Resources Information Center
Isonio, Steven
In May 1991, Golden West College (California) conducted a validation study of the English portion of the Assessment and Placement Services for Community Colleges (APS), followed by a predictive validity study in July 1991. The initial study was designed to aid in the implementation of the new test at GWC by comparing data on APS use at other…
Assessing Discriminative Performance at External Validation of Clinical Prediction Models
Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.
2016-01-01
Introduction External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
Assessing Discriminative Performance at External Validation of Clinical Prediction Models.
Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W
2016-01-01
External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.
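For readers unfamiliar with the c-statistic discussed in the two records above, a minimal Python sketch of its pairwise definition follows: the proportion of event/non-event pairs in which the event received the higher predicted risk, with ties counted as one half. The function name and the toy data are hypothetical, and the sketch covers neither the permutation test nor the benchmark-value framework evaluated in the paper.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance (c) statistic for a binary outcome coded 1 = event, 0 = non-event."""
    events = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes.
risks = [0.10, 0.40, 0.35, 0.80]
observed = [0, 0, 1, 1]
print(c_statistic(risks, observed))
```

Because the statistic depends only on how widely the predicted risks are spread between events and non-events, a narrower case-mix in a validation set can lower it even when the regression coefficients are correct, which is the interpretive problem the abstract raises.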
VALUE - A Framework to Validate Downscaling Approaches for Climate Change Studies
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.
2015-04-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. Here, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
VALUE: A framework to validate downscaling approaches for climate change studies
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.
2015-01-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. In this paper, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
Majumdar, Subhabrata; Basak, Subhash C
2018-04-26
Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has high value of p but small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out, K-fold, external and multi-split validation, using statistical models built using the LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
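The difference between leave-one-out (LOO) and single-split external validation described above can be sketched in a few lines of Python. This is an illustration only: a one-predictor ordinary least-squares fit stands in for the LASSO models used in the paper, and the function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (stand-in for LASSO)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def holdout_mse(xs, ys, test_idx):
    """External validation: fit on all samples not in test_idx, score on test_idx."""
    held_out = set(test_idx)
    train_x = [x for i, x in enumerate(xs) if i not in held_out]
    train_y = [y for i, y in enumerate(ys) if i not in held_out]
    slope, intercept = fit_line(train_x, train_y)
    errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
    return sum(errors) / len(errors)


def loo_mse(xs, ys):
    """Leave-one-out: every sample is held out exactly once, then errors are averaged."""
    return sum(holdout_mse(xs, ys, [i]) for i in range(len(xs))) / len(xs)


# Hypothetical small-sample data with noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 2.9, 5.1, 6.8, 9.3, 10.9]
print(loo_mse(xs, ys), holdout_mse(xs, ys, [0, 1]), holdout_mse(xs, ys, [4, 5]))
```

LOO yields one error estimate that uses every sample as a test point, whereas each external split yields a different estimate depending on which samples happen to be held out. That split-to-split variation is the instability the abstract reports for small-n data.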
Benni, Paul B; MacLeod, David; Ikeda, Keita; Lin, Hung-Mo
2018-04-01
We describe the validation methodology for the NIRS-based FORE-SIGHT ELITE® (CAS Medical Systems, Inc., Branford, CT, USA) tissue oximeter for cerebral and somatic tissue oxygen saturation (StO2) measurements in adult subjects, as submitted to the United States Food and Drug Administration (FDA) to obtain clearance for clinical use. This validation methodology evolved from a history of NIRS validations in the literature and FDA-recommended use of Deming regression and bootstrapping statistical validation methods. For cerebral validation, forehead cerebral StO2 measurements were compared to a weighted 70:30 reference (REF CX B) of co-oximeter internal jugular venous and arterial blood saturation of healthy adult subjects during a controlled hypoxia sequence, with a sensor placed on the forehead. For somatic validation, somatic StO2 measurements were compared to a weighted 70:30 reference (REF CX S) of co-oximetry central venous and arterial saturation values following a similar protocol, with sensors placed on the flank, quadriceps muscle, and calf muscle. With informed consent, 25 subjects successfully completed the cerebral validation study. The bias and precision (1 SD) of cerebral StO2 compared to REF CX B was -0.14 ± 3.07%. With informed consent, 24 subjects successfully completed the somatic validation study. The bias and precision of somatic StO2 compared to REF CX S was 0.04 ± 4.22% from the average of flank, quadriceps, and calf StO2 measurements to best represent the global whole body REF CX S. The NIRS validation methods presented potentially provide a reliable means to test NIRS monitors and qualify them for clinical use.
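The "bias ± precision (1 SD)" summary and the 70:30 weighted reference mentioned above can be illustrated with a short Python sketch. Several points are assumptions made here rather than statements from the abstract: that the 70% weight applies to the venous saturation, that precision is the sample standard deviation of the paired differences, and all function names and numbers. The Deming regression and bootstrapping steps are not shown.

```python
import statistics


def weighted_reference(venous_sat, arterial_sat, venous_weight=0.7):
    """Weighted blood reference; the 70% venous weighting is an assumption of this sketch."""
    return venous_weight * venous_sat + (1.0 - venous_weight) * arterial_sat


def bias_and_precision(device_values, reference_values):
    """Bias = mean of paired differences; precision = their sample SD (assumed definition)."""
    differences = [d - r for d, r in zip(device_values, reference_values)]
    return statistics.mean(differences), statistics.stdev(differences)


# Hypothetical paired readings in percent saturation.
device = [70.0, 65.0, 80.0, 75.0]
reference = [71.0, 64.0, 78.0, 77.0]
bias, precision = bias_and_precision(device, reference)
print(weighted_reference(60.0, 98.0), bias, precision)
```

A bias near zero with a small SD indicates that device readings track the reference without systematic offset, which is how figures such as -0.14 ± 3.07% are read.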
Vasak, Christoph; Strbac, Georg D; Huber, Christian D; Lettner, Stefan; Gahleitner, André; Zechner, Werner
2015-02-01
The study aims to evaluate the accuracy of the NobelGuide™ (Medicim/Nobel Biocare, Göteborg, Sweden) concept while maximally reducing the influence of clinical and surgical parameters. Moreover, the study aimed to compare and validate two validation procedures against a reference method. Overall, 60 implants were placed in 10 artificial edentulous mandibles according to the NobelGuide™ protocol. For merging the pre- and postoperative DICOM data sets, three different fusion methods (Triple Scan Technique, NobelGuide™ Validation software, and AMIRA® software [VSG - Visualization Sciences Group, Burlington, MA, USA] as reference) were applied. Discrepancies between the virtual and the actual implant positions were measured. The mean deviations measured with AMIRA® were 0.49 mm (implant shoulder), 0.69 mm (implant apex), and 1.98° (implant axis). The Triple Scan Technique as well as the NobelGuide™ Validation software revealed similar deviations compared with the reference method. A significant correlation between angular and apical deviations was seen (r = 0.53; p < .001). A greater implant diameter was associated with greater deviations (p = .03). The Triple Scan Technique as a system-independent validation procedure as well as the NobelGuide™ Validation software are in accordance with the AMIRA® software. The NobelGuide™ system showed similar or smaller spatial and angular deviations compared with others. © 2013 Wiley Periodicals, Inc.
ERIC Educational Resources Information Center
Burdick, Hal; Swartz, Carl W.; Stenner, A. Jackson; Fitzgerald, Jill; Burdick, Don; Hanlon, Sean T.
2013-01-01
The purpose of the study was to explore the validity of a novel computer-analytic developmental scale, the Writing Ability Developmental Scale. On the whole, collective results supported the validity of the scale. It was sensitive to writing ability differences across grades and sensitive to within-grade variability as compared to human-rated…
Assessing the Validity of an Annual Survey for Measuring the Enacted Literacy Curriculum
ERIC Educational Resources Information Center
Camburn, Eric M.; Han, Seong Won; Sebastian, James
2017-01-01
Surveys are frequently used to inform consequential decisions about teachers, policies, and programs. Consequently, it is important to understand the validity of these instruments. This study assesses the validity of measures of instruction captured by an annual survey by comparing survey data with those of a validated daily log. The two…
ERIC Educational Resources Information Center
Tolin, David F.; Steenkamp, Maria M.; Marx, Brian P.; Litz, Brett T.
2010-01-01
Although validity scales of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; J. N. Butcher, W. G. Dahlstrom, J. R. Graham, A. Tellegen, & B. Kaemmer, 1989) have proven useful in the detection of symptom exaggeration in criterion-group validation (CGV) studies, usually comparing instructed feigners with known patient groups, the…
Dutch translation and cross-cultural validation of the Adult Social Care Outcomes Toolkit (ASCOT).
van Leeuwen, Karen M; Bosmans, Judith E; Jansen, Aaltje Pd; Rand, Stacey E; Towers, Ann-Marie; Smith, Nick; Razik, Kamilla; Trukeschitz, Birgit; van Tulder, Maurits W; van der Horst, Henriette E; Ostelo, Raymond W
2015-05-13
The Adult Social Care Outcomes Toolkit was developed to measure outcomes of social care in England. In this study, we translated the four level self-completion version (SCT-4) of the ASCOT for use in the Netherlands and performed a cross-cultural validation. The ASCOT SCT-4 was translated into Dutch following international guidelines, including two forward and back translations. The resulting version was pilot tested among frail older adults using think-aloud interviews. Furthermore, using a subsample of the Dutch ACT-study, we investigated test-retest reliability and construct validity and compared response distributions with data from a comparable English study. The pilot tests showed that translated items were in general understood as intended, that most items were reliable, and that the response distributions of the Dutch translation and associations with other measures were comparable to the original English version. Based on the results of the pilot tests, some small modifications and a revision of the Dignity items were proposed for the final translation, which were approved by the ASCOT development team. The complete original English version and the final Dutch translation can be obtained after registration on the ASCOT website ( http://www.pssru.ac.uk/ascot ). This study provides preliminary evidence that the Dutch translation of the ASCOT is valid, reliable and comparable to the original English version. We recommend further research to confirm the validity of the modified Dutch ASCOT translation.
Binge Eating Disorder: Reliability and Validity of a New Diagnostic Category.
ERIC Educational Resources Information Center
Brody, Michelle L.; And Others
1994-01-01
Examined reliability and validity of binge eating disorder (BED), proposed for inclusion in Diagnostic and Statistical Manual of Mental Disorders (DSM), fourth edition. Interrater reliability of BED diagnosis compared favorably with that of most diagnoses in DSM revised third edition. Study comparing obese individuals with and without BED and…
Sullivan, Ruth; Kinra, Sanjay; Ekelund, Ulf; Bharathi, A V; Vaz, Mario; Kurpad, Anura; Collier, Tim; Reddy, K Srinath; Prabhakaran, Dorairaj; Ebrahim, Shah; Kuper, Hannah
2012-02-09
Socio-cultural differences for country-specific activities are rarely addressed in physical activity questionnaires. We examined the reliability and validity of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ) in urban and rural groups in India. A sub-sample of IMS participants (n = 479) was used to examine short term (≤ 1 month [n = 158]) and long term (> 1 month [n = 321]) IMS-PAQ reliability for levels of total, sedentary, light and moderate/vigorous activity (MVPA) intensity using intraclass correlation (ICC) and kappa coefficients (k). Criterion validity (n = 157) was examined by comparing the IMS-PAQ to a uniaxial accelerometer (ACC) worn ≥ 4 days, via Spearman's rank correlations (ρ) and k, using Bland-Altman plots to check for systematic bias. Construct validity (n = 7,000) was established using linear regression, comparing IMS-PAQ against theoretical constructs associated with physical activity (PA): BMI [kg/m2], percent body fat and pulse rate. IMS-PAQ reliability ranged from ICC 0.42-0.88 and k = 0.37-0.61 (≤ 1 month) and ICC 0.26 to 0.62; kappa 0.17 to 0.45 (> 1 month). Criterion validity was ρ = 0.18-0.48; k = 0.08-0.34. Light activity was underestimated and MVPA consistently and substantially overestimated for the IMS-PAQ vs. the accelerometer. Criterion validity was moderate for total activity and MVPA. Reliability and validity were comparable for urban and rural participants but lower in women than men. Increasing time spent in total activity or MVPA, and decreasing time in sedentary activity were associated with decreasing BMI, percent body fat and pulse rate, thereby demonstrating construct validity. IMS-PAQ reliability and validity is similar to comparable self-reported instruments. It is an appropriate tool for ranking PA of individuals in India. Some refinements may be required for sedentary populations and women in India.
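The kappa coefficients (k) reported above measure agreement between two categorical ratings beyond what chance would produce. A minimal sketch of unweighted Cohen's kappa follows; the abstract does not say whether a weighted variant was used, the activity categories and ratings are invented, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the two sets of marginal category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented activity categories from a questionnaire and from a reference device.
questionnaire = ["low", "low", "mid", "high", "high", "mid"]
accelerometer = ["low", "mid", "mid", "high", "mid", "mid"]

kappa = cohens_kappa(questionnaire, accelerometer)
print(f"kappa = {kappa:.2f}")
```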
2012-01-01
Background Socio-cultural differences for country-specific activities are rarely addressed in physical activity questionnaires. We examined the reliability and validity of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ) in urban and rural groups in India. Methods A sub-sample of IMS participants (n = 479) was used to examine short term (≤1 month [n = 158]) and long term (> 1 month [n = 321]) IMS-PAQ reliability for levels of total, sedentary, light and moderate/vigorous activity (MVPA) intensity using intraclass correlation (ICC) and kappa coefficients (k). Criterion validity (n = 157) was examined by comparing the IMS-PAQ to a uniaxial accelerometer (ACC) worn ≥4 days, via Spearman's rank correlations (ρ) and k, using Bland-Altman plots to check for systematic bias. Construct validity (n = 7,000) was established using linear regression, comparing IMS-PAQ against theoretical constructs associated with physical activity (PA): BMI [kg/m2], percent body fat and pulse rate. Results IMS-PAQ reliability ranged from ICC 0.42-0.88 and k = 0.37-0.61 (≤1 month) and ICC 0.26 to 0.62; kappa 0.17 to 0.45 (> 1 month). Criterion validity was ρ = 0.18-0.48; k = 0.08-0.34. Light activity was underestimated and MVPA consistently and substantially overestimated for the IMS-PAQ vs. the accelerometer. Criterion validity was moderate for total activity and MVPA. Reliability and validity were comparable for urban and rural participants but lower in women than men. Increasing time spent in total activity or MVPA, and decreasing time in sedentary activity were associated with decreasing BMI, percent body fat and pulse rate, thereby demonstrating construct validity. Conclusion IMS-PAQ reliability and validity is similar to comparable self-reported instruments. It is an appropriate tool for ranking PA of individuals in India. Some refinements may be required for sedentary populations and women in India. PMID:22321669
ERIC Educational Resources Information Center
Randles, Clint; Muhonen, Sari
2015-01-01
The purpose of this study was to validate a measure of creative identity with a population of pre-service teachers in the USA, to further validate the measure with a Finnish population, and to compare both populations regarding their perceptions of themselves as creative musicians. The researcher developed a tool, the "Creative Identity…
Face Validity of Test and Acceptance of Generalized Personality Interpretations
ERIC Educational Resources Information Center
Delprato, Dennis J.
1975-01-01
The degree to which variations in the face validity of psychological tests affected students' willingness to accept personality interpretations was studied. Acceptance of personality interpretations was compared for four types of tests which varied in face validity. The relationship between judged accuracy and rated likability of the…
The Impact of Overreporting on MMPI-2-RF Substantive Scale Score Validity
ERIC Educational Resources Information Center
Burchett, Danielle L.; Ben-Porath, Yossef S.
2010-01-01
This study examined the impact of overreporting on the validity of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) substantive scale scores by comparing correlations with relevant external criteria (i.e., validity coefficients) of individuals who completed the instrument under instructions to (a) feign psychopathology…
Murumkar, Prashant R; Giridhar, Rajani; Yadav, Mange Ram
2008-04-01
A set of 29 benzothiadiazepine hydroxamates having selective tumor necrosis factor-alpha converting enzyme inhibitory activity were used to compare the quality and predictive power of 3D-quantitative structure-activity relationship, comparative molecular field analysis, and comparative molecular similarity indices models for the atom-based, centroid/atom-based, data-based, and docked conformer-based alignment. Removal of two outliers from the initial training set of molecules improved the predictivity of models. Among the 3D-quantitative structure-activity relationship models developed using the above four alignments, the database alignment provided the optimal predictive comparative molecular field analysis model for the training set with cross-validated r(2) (q(2)) = 0.510, non-cross-validated r(2) = 0.972, standard error of estimates (s) = 0.098, and F = 215.44 and the optimal comparative molecular similarity indices model with cross-validated r(2) (q(2)) = 0.556, non-cross-validated r(2) = 0.946, standard error of estimates (s) = 0.163, and F = 99.785. These models also showed the best test set prediction for six compounds with predictive r(2) values of 0.460 and 0.535, respectively. The contour maps obtained from 3D-quantitative structure-activity relationship studies were appraised for activity trends for the molecules analyzed. The comparative molecular similarity indices models exhibited good external predictivity as compared with that of comparative molecular field analysis models. The data generated from the present study helped us to further design and report some novel and potent tumor necrosis factor-alpha converting enzyme inhibitors.
Singh, Devita; Deogracias, Joseph J; Johnson, Laurel L; Bradley, Susan J; Kibblewhite, Sarah J; Owen-Anderson, Allison; Peterson-Badali, Michele; Meyer-Bahlburg, Heino F L; Zucker, Kenneth J
2010-01-01
This study aimed to provide further validity evidence for the dimensional measurement of gender identity and gender dysphoria in both adolescents and adults. Adolescents and adults with gender identity disorder (GID) were compared to clinical control (CC) adolescents and adults on the Gender Identity/Gender Dysphoria Questionnaire for Adolescents and Adults (GIDYQ-AA), a 27-item scale originally developed by Deogracias et al. (2007). In Study 1, adolescents with GID (n = 44) were compared to CC adolescents (n = 98); and in Study 2, adults with GID (n = 41) were compared to CC adults (n = 94). In both studies, clients with GID self-reported significantly more gender dysphoria than did the CCs, with excellent sensitivity and specificity rates. In both studies, degree of self-reported gender dysphoria was significantly correlated with recall of cross-gender behavior in childhood, a test of convergent validity. The research and clinical utility of the GIDYQ-AA is discussed, including directions for further research in distinct clinical populations.
ERIC Educational Resources Information Center
Hildebrand, Myrene; Hoover, H. D.
This study compared the reliability and validity of two different measures of reading ability, the Degrees of Reading Power (DRP) and the Iowa Tests of Basic Skills (ITBS) Reading test and the ITBS Vocabulary test. The data consisted of scores of 377 grade 5 and grade 6 students on these tests, along with their assigned reading levels in the…
Willis, Brian H; Riley, Richard D
2017-09-20
An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice, that is, does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity, where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Comparative Validity of the Shedler and Westen Assessment Procedure-200
ERIC Educational Resources Information Center
Mullins-Sweatt, Stephanie N.; Widiger, Thomas A.
2008-01-01
A predominant dimensional model of general personality structure is the five-factor model (FFM). Quite a number of alternative instruments have been developed to assess the domains of the FFM. The current study compares the validity of 2 alternative versions of the Shedler and Westen Assessment Procedure (SWAP-200) FFM scales, 1 that was developed…
[Validation of the German version of the Oxford Elbow Score : A cross-sectional study].
Marquardt, J; Schöttker-Königer, T; Schäfer, A
2016-08-01
Elbow complaints are complex problems leading to severe consequences for affected people and the healthcare system. The German version of the Oxford Elbow Score (OES) is the first German-language instrument that specifically measures elbow complaints from the patient's perspective and changes of their health status. The aim of this study is the validation of the German version of the OES. In this context the internal consistency and the construct validity were investigated. 59 patients with elbow complaints completed the German version of the OES, the DASH and the SF-36 in a cross-sectional study. The internal consistency was calculated with Cronbach's alpha coefficients. Spearman's correlation coefficients were used to confirm construct validity. Cronbach's alpha for pain, function and psychological subscales was 0.88, 0.81 and 0.90, respectively. The whole questionnaire had a Cronbach's alpha value of 0.93. Convergent construct validity was confirmed with correlation coefficients of -0.84, -0.77 and -0.82 compared to DASH and values ranging from 0.41 to 0.80 compared with the physical domains of the SF-36. The divergent construct validity presented values ranging from 0.07 to 0.20 with the SF-36 domains of "general health perception" and "mental health". The German OES is an internally consistent instrument with good convergent and divergent construct validity. Other aspects of the validity, the reliability and the responsiveness should be confirmed through further studies.
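Cronbach's alpha, used above for internal consistency, compares the sum of the item variances with the variance of the total score. The sketch below applies the standard formula to invented item scores; the function name and data are hypothetical and unrelated to the OES data set.

```python
import statistics


def cronbach_alpha(responses):
    """responses: one row per respondent, each row a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    item_columns = list(zip(*responses))
    item_variance_sum = sum(statistics.variance(col) for col in item_columns)
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented scores: 5 respondents answering a 3-item subscale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```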
Barrero, Lope H; Katz, Jeffrey N; Dennerlein, Jack T
2012-01-01
Objectives To describe the relation of the measured validity of self-reported mechanical demands (self-reports) with the quality of validity assessments and the variability of the assessed exposure in the study population. Methods We searched for original articles, published between 1990 and 2008, reporting the validity of self-reports in three major databases: EBSCOhost, Web of Science, and PubMed. Identified assessments were classified by methodological characteristics (eg, type of self-report and reference method) and exposure dimension was measured. We also classified assessments by the degree of comparability between the self-report and the employed reference method, and the variability of the assessed exposure in the study population. Finally, we examined the association of the published validity (r) with this degree of comparability, as well as with the variability of the exposure variable in the study population. Results Of the 490 assessments identified, 75% used observation-based reference measures and 55% tested self-reports of posture duration and movement frequency. Frequently, validity studies did not report demographic information (eg, education, age, and gender distribution). Among assessments reporting correlations as a measure of validity, studies with a better match between the self-report and the reference method, and studies conducted in more heterogeneous populations tended to report higher correlations [odds ratio (OR) 2.03, 95% confidence interval (95% CI) 0.89–4.65 and OR 1.60, 95% CI 0.96–2.61, respectively]. Conclusions The reported data support the hypothesis that validity depends on study-specific factors often not examined. Experimentally manipulating the testing setting could lead to a better understanding of the capabilities and limitations of self-reported information. PMID:19562235
Bem Sex Role Inventory Validation in the International Mobility in Aging Study.
Ahmed, Tamer; Vafaei, Afshin; Belanger, Emmanuelle; Phillips, Susan P; Zunzunegui, Maria-Victoria
2016-09-01
This study investigated the measurement structure of the Bem Sex Role Inventory (BSRI) with different factor analysis methods. Most previous studies on validity applied exploratory factor analysis (EFA) to examine the BSRI. We aimed to assess the psychometric properties and construct validity of the 12-item short-form BSRI in a sample administered to 1,995 older adults from wave 1 of the International Mobility in Aging Study (IMIAS). We used Cronbach's alpha to assess internal consistency reliability and confirmatory factor analysis (CFA) to assess psychometric properties. EFA revealed a three-factor model, further confirmed by CFA and compared with the original two-factor structure model. Results revealed that a two-factor solution (instrumentality-expressiveness) has satisfactory construct validity and superior fit to data compared to the three-factor solution. The two-factor solution confirms expected gender differences in older adults. The 12-item BSRI provides a brief, psychometrically sound, and reliable instrument in international samples of older adults.
Setyonugroho, Winny; Kropmans, Thomas; Murphy, Ruth; Hayes, Peter; van Dalen, Jan; Kennedy, Kieran M
2018-01-01
Comparing outcomes of clinical skills assessment is challenging. This study proposes reliable and valid comparison of communication skills (CS) assessment as practiced in Objective Structured Clinical Examinations (OSCE). The aim of the present study is to compare CS assessment, as standardized according to the MAAS Global, between stations in a single undergraduate medical year. An OSCE delivered in an Irish undergraduate curriculum was studied. We chose the MAAS-Global as an internationally recognized and validated instrument to calibrate the OSCE station items. The MAAS-Global proportion is the percentage of station checklist items that can be considered as 'true' CS. The reliability of the OSCE was calculated with G-Theory analysis and nested ANOVA was used to compare mean scores of all years. MAAS-Global scores in psychiatry stations were significantly higher than those in other disciplines (p<0.03) and above the initial pass mark of 50%. The higher student scores in psychiatry stations were related to higher MAAS-Global proportions when compared to the general practice stations. Comparison of outcome measurements, using the MAAS Global as a standardization instrument, between interdisciplinary station checklists was valid and reliable. The MAAS-Global was used as a single validated instrument and is suggested as gold standard. Copyright © 2017. Published by Elsevier B.V.
Validity of a digital diet estimation method for use with preschool children
USDA-ARS's Scientific Manuscript database
The validity of using the Remote Food Photography Method (RFPM) for measuring the food intake of minority preschool children is not well documented. The aim of the study was to determine the validity of intake estimations made by human raters using the RFPM compared with those obtained by weigh...
Validity of the Children's Orientation to Book Reading Rating Scale
ERIC Educational Resources Information Center
Kaderavek, Joan N.; Guo, Ying; Justice, Laura M.
2014-01-01
The present study investigates the validity of a 4-point rating scale used to measure the level of preschool children's orientation to literacy during shared book reading. Validity was explored by (a) comparing the children's level of literacy orientation as measured with the "Children's Orientation to Book Reading Rating Scale" (COB)…
Study design elements for rigorous quasi-experimental comparative effectiveness research.
Maciejewski, Matthew L; Curtis, Lesley H; Dowd, Bryan
2013-03-01
Quasi-experiments are likely to be the workhorse study design used to generate evidence about the comparative effectiveness of alternative treatments, because of their feasibility, timeliness, affordability and external validity compared with randomized trials. In this review, we outline potential sources of discordance in results between quasi-experiments and experiments, review study design choices that can improve the internal validity of quasi-experiments, and outline innovative data linkage strategies that may be particularly useful in quasi-experimental comparative effectiveness research. There is an urgent need to resolve the debate about the evidentiary value of quasi-experiments since equal consideration of rigorous quasi-experiments will broaden the base of evidence that can be brought to bear in clinical decision-making and governmental policy-making.
van Ballegooijen, Wouter; Riper, Heleen; Donker, Tara; Martin Abello, Katherina; Marks, Isaac; Cuijpers, Pim
2012-01-01
The advent of web-based treatments for anxiety disorders creates a need for quick and valid online screening instruments, suitable for a range of social groups. This study validates a single-item multimedia screening instrument for agoraphobia, part of the Visual Screener for Common Mental Disorders (VS-CMD), and compares it with the text-based agoraphobia items of the PDSS-SR. The study concerned 85 subjects in an RCT of the effects of web-based therapy for panic symptoms. The VS-CMD item and items 4 and 5 of the PDSS-SR were validated by comparing scores to the outcomes of the CIDI diagnostic interview. Screening for agoraphobia was found moderately valid for both the multimedia item (sensitivity .81, specificity .66, AUC .734) and the text-based items (AUC .607–.697). Single-item multimedia screening for anxiety disorders should be further developed and tested in the general population and in patient, illiterate and immigrant samples. PMID:22844391
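Sensitivity and specificity, as reported above, come from cross-tabulating the screener result against the diagnostic interview. A minimal sketch with invented data follows; the function and variable names are hypothetical and the values are not taken from the study.

```python
def sensitivity_specificity(screen_positive, has_diagnosis):
    """Return (sensitivity, specificity) of a screener against a reference diagnosis."""
    pairs = list(zip(screen_positive, has_diagnosis))
    tp = sum(s and d for s, d in pairs)          # screened positive, diagnosed
    fn = sum((not s) and d for s, d in pairs)    # screened negative, diagnosed
    tn = sum((not s) and (not d) for s, d in pairs)
    fp = sum(s and (not d) for s, d in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Invented results for 8 subjects: screener outcome vs. interview diagnosis.
screen = [True, True, False, True, False, False, True, False]
diagnosis = [True, True, True, False, False, False, True, False]

sens, spec = sensitivity_specificity(screen, diagnosis)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```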
Prabhu, Roshan S; Press, Robert H; Boselli, Danielle M; Miller, Katherine R; Lankford, Scott P; McCammon, Robert J; Moeller, Benjamin J; Heinzerling, John H; Fasola, Carolina E; Patel, Kirtesh R; Asher, Anthony L; Sumrall, Ashley L; Curran, Walter J; Shu, Hui-Kuo G; Burri, Stuart H
2018-03-01
Patients treated with stereotactic radiosurgery (SRS) for brain metastases (BM) are at increased risk of distant brain failure (DBF). Two nomograms have been recently published to predict individualized risk of DBF after SRS. The goal of this study was to assess the external validity of these nomograms in an independent patient cohort. The records of consecutive patients with BM treated with SRS at Levine Cancer Institute and Emory University between 2005 and 2013 were reviewed. Three validation cohorts were generated based on the specific nomogram or recursive partitioning analysis (RPA) entry criteria: Wake Forest nomogram (n = 281), Canadian nomogram (n = 282), and Canadian RPA (n = 303) validation cohorts. Freedom from DBF at 1-year in the Wake Forest study was 30% compared with 50% in the validation cohort. The validation c-index for both the 6-month and 9-month freedom from DBF Wake Forest nomograms was 0.55, indicating poor discrimination ability, and the goodness-of-fit test for both nomograms was highly significant (p < 0.001), indicating poor calibration. The 1-year actuarial DBF in the Canadian nomogram study was 43.9% compared with 50.9% in the validation cohort. The validation c-index for the Canadian 1-year DBF nomogram was 0.56, and the goodness-of-fit test was also highly significant (p < 0.001). The validation accuracy and c-index of the Canadian RPA classification was 53% and 0.61, respectively. The Wake Forest and Canadian nomograms for predicting risk of DBF after SRS were found to have limited predictive ability in an independent bi-institutional validation cohort. These results reinforce the importance of validating predictive models in independent patient cohorts.
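A c-index near 0.5, as found above, means the model ranks patients little better than chance. The sketch below computes the concordance for a binary outcome only: the fraction of (event, non-event) pairs in which the event case received the higher predicted risk. The study's c-index for time-to-event data with censoring would need a survival-specific version, which is not shown; the data and function name here are invented.

```python
from itertools import product


def c_index(predicted_risks, events):
    """Concordance for a binary outcome; tied predictions count as half-concordant."""
    with_event = [r for r, e in zip(predicted_risks, events) if e]
    without_event = [r for r, e in zip(predicted_risks, events) if not e]
    score = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(with_event, without_event)
    )
    return score / (len(with_event) * len(without_event))


# Invented predicted risks and observed outcomes (1 = event occurred).
risks = [0.9, 0.8, 0.4, 0.3, 0.2]
events = [1, 0, 1, 0, 0]

c = c_index(risks, events)
print(f"c-index = {c:.2f}")
```

A model that assigns every patient the same risk scores exactly 0.5 under this definition, which is the reference point for reading values such as 0.55 or 0.56.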
Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Souers, Rhona J; Fatheree, Lisa A; Volmar, Keith E; Stuart, Lauren N; Nowak, Jan A; Astles, J Rex; Nakhleh, Raouf E
2017-09-01
Laboratories must demonstrate analytic validity before any test can be used clinically, but studies have shown inconsistent practices in immunohistochemical assay validation. The objective was to assess changes in immunohistochemistry analytic validation practices after publication of an evidence-based laboratory practice guideline. A survey on current immunohistochemistry assay validation practices and on the awareness and adoption of a recently published guideline was sent to subscribers enrolled in one of 3 relevant College of American Pathologists proficiency testing programs and to additional nonsubscribing laboratories that perform immunohistochemical testing. The results were compared with an earlier survey of validation practices. Analysis was based on responses from 1085 laboratories that perform immunohistochemical staining. Of 1057 responses, 65.4% (691) were aware of the guideline recommendations before this survey was sent and 79.9% (550 of 688) of those have already adopted some or all of the recommendations. Compared with the 2010 survey, a significant number of laboratories now have written validation procedures for both predictive and nonpredictive marker assays and specifications for the minimum numbers of cases needed for validation. There was also significant improvement in compliance with validation requirements, with 99% (100 of 102) having validated their most recently introduced predictive marker assay, compared with 74.9% (326 of 435) in 2010. The difficulty in finding validation cases for rare antigens and resource limitations were cited as the biggest challenges in implementing the guideline. Dissemination of the 2014 evidence-based guideline validation practices had a positive impact on laboratory performance; some or all of the recommendations have been adopted by nearly 80% of respondents.
An empirical assessment of validation practices for molecular classifiers
Castaldi, Peter J.; Dahabreh, Issa J.
2011-01-01
Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21–61%) and 29% (IQR, 15–65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04–5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n = 758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice. PMID:21300697
Jefford, Elaine; Hollins Martin, Caroline J; Martin, Colin R
2018-02-01
The 10-item Birth Satisfaction Scale-Revised (BSS-R) has recently been endorsed by international expert consensus for global use as the birth satisfaction outcome measure of choice. English-language versions of the tool include validated UK and US versions; however, the instrument has not, to date, been contextualised and validated in an Australian English-language version. The current investigation sought to develop and validate an English-language version of the tool for use within the Australian context. A two-stage study. Following review and modification by expert panel, the Australian BSS-R (A-BSS-R) was (Stage 1) evaluated for factor structure, internal consistency, known-groups discriminant validity and divergent validity. Stage 2 directly compared the A-BSS-R data set with the original UK data set to determine the invariance characteristics of the new instrument. Participants were a purposive sample of Australian postnatal women (n = 198). The A-BSS-R offered a good fit to data consistent with the BSS-R tridimensional measurement model and was found to be conceptually and measurement equivalent to the UK version. The A-BSS-R demonstrated excellent known-groups discriminant validity, generally good divergent validity and overall good internal consistency. The A-BSS-R represents a robust and valid measure of the birth satisfaction concept suitable for use within Australia and appropriate for application to International comparative studies.
Heritage, Brody; Gilbert, Jessica M.; Roberts, Lynne D.
2016-01-01
Job embeddedness is a construct that describes the manner in which employees can be enmeshed in their jobs, reducing their turnover intentions. Recent questions regarding the properties of quantitative job embeddedness measures, and their predictive utility, have been raised. Our study compared two competing reflective measures of job embeddedness, examining their convergent, criterion, and incremental validity, as a means of addressing these questions. Cross-sectional quantitative data from 246 Australian university employees (146 academic; 100 professional) were gathered. Our findings indicated that the two compared measures of job embeddedness were convergent when total scale scores were examined. Additionally, job embeddedness was capable of demonstrating criterion and incremental validity, predicting unique variance in turnover intention. However, this finding was not readily apparent with one of the compared job embeddedness measures, which demonstrated comparatively weaker evidence of validity. We discuss the theoretical and applied implications of these findings, noting that job embeddedness has a complementary place among established determinants of turnover intention. PMID:27199817
Riley, Richard D.
2017-01-01
An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
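For readers unfamiliar with the Q statistic that Vn is compared against, the following is a minimal sketch of Cochran's Q for a set of study effect estimates. It does not reproduce Vn, whose definition is not given in the abstract. The function name cochran_q and the example numbers are illustrative assumptions, not values from the study.

```python
def cochran_q(effects, variances):
    """Return the inverse-variance pooled estimate and Cochran's Q.

    effects   -- one effect estimate per study
    variances -- the within-study variance of each estimate
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, q


# Two made-up studies with equal precision.
pooled, q = cochran_q([0.2, 0.4], [0.01, 0.01])
print(pooled, q)
```

Under homogeneity, Q is usually referred to a chi-square distribution with (number of studies - 1) degrees of freedom; a large Q suggests between-study heterogeneity.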
USDA-ARS's Scientific Manuscript database
The authors evaluated the validity of a 152-item semiquantitative food frequency questionnaire (SFFQ) by comparing it with two 7-day dietary records (7DDRs) or up to 4 automated self-administered 24-hour recalls (ASA24s) over a 1-year period in the women's Lifestyle Validation Study (2010-2012), con...
ERIC Educational Resources Information Center
Shriver, Mark D.; Frerichs, Lynae J.; Williams, Melissa; Lancaster, Blake M.
2013-01-01
Direct observation is often considered the "gold standard" for assessing the function, frequency, and intensity of problem behavior. Currently, the literature investigating the construct validity of direct observation conducted in the clinic setting reveals conflicting results. Previous studies on the construct validity of clinic-based…
Crary, Michael A.; Carnaby, Giselle D.; Sia, Isaac
2017-01-01
Background The aim of this study was to compare spontaneous swallow frequency analysis (SFA) with clinical screening protocols for identification of dysphagia in acute stroke. Methods In all, 62 patients with acute stroke were evaluated for spontaneous swallow frequency rates using a validated acoustic analysis technique. Independent of SFA, these same patients received a routine nurse-administered clinical dysphagia screening as part of standard stroke care. Both screening tools were compared against a validated clinical assessment of dysphagia for acute stroke. In addition, psychometric properties of SFA were compared against published, validated clinical screening protocols. Results Spontaneous SFA differentiates patients with versus without dysphagia after acute stroke. Using a previously identified cut point based on swallows per minute, spontaneous SFA demonstrated superior ability to identify dysphagia cases compared with a nurse-administered clinical screening tool. In addition, spontaneous SFA demonstrated equal or superior psychometric properties to 4 validated, published clinical dysphagia screening tools. Conclusions Spontaneous SFA has high potential to identify dysphagia in acute stroke with psychometric properties equal or superior to clinical screening protocols. PMID:25088166
Crary, Michael A; Carnaby, Giselle D; Sia, Isaac
2014-09-01
The aim of this study was to compare spontaneous swallow frequency analysis (SFA) with clinical screening protocols for identification of dysphagia in acute stroke. In all, 62 patients with acute stroke were evaluated for spontaneous swallow frequency rates using a validated acoustic analysis technique. Independent of SFA, these same patients received a routine nurse-administered clinical dysphagia screening as part of standard stroke care. Both screening tools were compared against a validated clinical assessment of dysphagia for acute stroke. In addition, psychometric properties of SFA were compared against published, validated clinical screening protocols. Spontaneous SFA differentiates patients with versus without dysphagia after acute stroke. Using a previously identified cut point based on swallows per minute, spontaneous SFA demonstrated superior ability to identify dysphagia cases compared with a nurse-administered clinical screening tool. In addition, spontaneous SFA demonstrated equal or superior psychometric properties to 4 validated, published clinical dysphagia screening tools. Spontaneous SFA has high potential to identify dysphagia in acute stroke with psychometric properties equal or superior to clinical screening protocols. Copyright © 2014 National Stroke Association. Published by Elsevier Inc. All rights reserved.
Using standardised patients to measure physicians' practice: validation study using audio recordings
Luck, Jeff; Peabody, John W
2002-01-01
Objective To assess the validity of standardised patients to measure the quality of physicians' practice. Design Validation study of standardised patients' assessments. Physicians saw unannounced standardised patients presenting with common outpatient conditions. The standardised patients covertly tape recorded their visit and completed a checklist of quality criteria immediately afterwards. Their assessments were compared against independent assessments of the recordings by a trained medical records abstractor. Setting Four general internal medicine primary care clinics in California. Participants 144 randomly selected consenting physicians. Main outcome measures Rates of agreement between the patients' assessments and independent assessment. Results 40 visits, one per standardised patient, were recorded. The overall rate of agreement between the standardised patients' checklists and the independent assessment of the audio transcripts was 91% (κ=0.81). Disaggregating the data by medical condition, site, level of physicians' training, and domain (stage of the consultation) gave similar rates of agreement. Sensitivity of the standardised patients' assessments was 95%, and specificity was 85%. The area under the receiver operator characteristic curve was 90%. Conclusions Standardised patients' assessments seem to be a valid measure of the quality of physicians' care for a variety of common medical conditions in actual outpatient settings. Properly trained standardised patients compare well with independent assessment of recordings of the consultations and may justify their use as a “gold standard” in comparing the quality of care across sites or evaluating data obtained from other sources, such as medical records and clinical vignettes. 
What is already known on this topic: Standardised patients are valid and reliable reporters of physicians' practice in the medical education setting. However, validating standardised patients' measurements of quality of care in actual primary practice is more difficult and has not been done in a prospective study. What this study adds: Reports of physicians' quality of care by unannounced standardised patients compare well with independent assessment of the consultations. PMID:12351358
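The agreement rate and kappa reported in this abstract can be illustrated with a short sketch for two raters scoring yes/no checklist items. The function name and the counts below are hypothetical and do not reproduce the study's 91% agreement or κ=0.81; they only show how the two quantities relate.

```python
def agreement_and_kappa(both_yes, a_only, b_only, both_no):
    """Observed agreement and Cohen's kappa for two raters on a yes/no item.

    The four arguments are the cells of the 2x2 cross-classification
    (hypothetical counts; rater A vs. rater B).
    """
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    # Chance agreement expected from each rater's marginal "yes" rate.
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up checklist counts: 40 agreed "yes", 50 agreed "no", 10 disagreements.
observed, kappa = agreement_and_kappa(40, 5, 5, 50)
print(round(observed, 3), round(kappa, 3))
```

Kappa is lower than raw agreement because it discounts the agreement two raters would reach by chance alone.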
Validation of the Female Sexual Function Index (FSFI) for web-based administration.
Crisp, Catrina C; Fellner, Angela N; Pauls, Rachel N
2015-02-01
Web-based questionnaires are becoming increasingly valuable for clinical research. The Female Sexual Function Index (FSFI) is the gold standard for evaluating female sexual function; yet, it has not been validated in this format. We sought to validate the Female Sexual Function Index (FSFI) for web-based administration. Subjects enrolled in a web-based research survey of sexual function from the general population were invited to participate in this validation study. The first 151 respondents were included. Validation participants completed the web-based version of the FSFI followed by a mailed paper-based version. Demographic data were collected for all subjects. Scores were compared using the paired t test and the intraclass correlation coefficient. One hundred fifty-one subjects completed both web- and paper-based versions of the FSFI. Those subjects participating in the validation study did not differ in demographics or FSFI scores from the remaining subjects in the general population study. Total web-based and paper-based FSFI scores were not significantly different (mean 20.31 and 20.29 respectively, p = 0.931). The six domains or subscales of the FSFI were similar when comparing web and paper scores. Finally, intraclass correlation analysis revealed a high degree of correlation between total and subscale scores, r = 0.848-0.943, p < 0.001. Web-based administration of the FSFI is a valid alternative to the paper-based version.
Design and validation of a comprehensive fecal incontinence questionnaire.
Macmillan, Alexandra K; Merrie, Arend E H; Marshall, Roger J; Parry, Bryan R
2008-10-01
Fecal incontinence can have a profound effect on quality of life. Its prevalence remains uncertain because of stigma, lack of consistent definition, and dearth of validated measures. This study was designed to develop a valid clinical and epidemiologic questionnaire, building on current literature and expertise. Patients and experts undertook face validity testing. Construct validity, criterion validity, and test-retest reliability were assessed. Construct validity comprised factor analysis and internal consistency of the quality of life scale. The validity of known groups was tested against 77 control subjects by using regression models. Questionnaire results were compared with a stool diary for criterion validity. Test-retest reliability was calculated from repeated questionnaire completion. The questionnaire achieved good face validity. It was completed by 104 patients. The quality of life scale had four underlying traits (factor analysis) and high internal consistency (overall Cronbach alpha = 0.97). Patients and control subjects answered the questionnaire significantly differently (P < 0.01) in known-groups validity testing. Criterion validity assessment found mean differences close to zero. Median reliability for the whole questionnaire was 0.79 (range, 0.35-1). This questionnaire compares favorably with other available instruments, although the interpretation of stool consistency requires further research. Its sensitivity to treatment still needs to be investigated.
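Internal consistency of the kind reported here is commonly summarised with Cronbach's alpha. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the tiny data set are assumptions for illustration and are unrelated to the questionnaire data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score table.

    rows -- one list of item scores per respondent, all the same length
    (hypothetical helper; the total scores must not all be identical).
    """
    k = len(rows[0])
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Three made-up respondents answering two items.
print(round(cronbach_alpha([[1, 1], [2, 3], [3, 2]]), 3))
```

Values close to 1 indicate that the items vary together across respondents.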
Yılmaz, Emel; Eser, Erhan; Şekuri, Cevad; Kültürsay, Hakan
2011-08-01
The purpose of this study was to describe the psychometric properties of the Myocardial Infarction Dimensional Assessment Scale (MIDAS). This is a methodological cultural adaptation study. The MIDAS consists of 35 items covering seven domains (physical activity, insecurity, emotional reaction, dependency, diet, concerns over medication, and side effects), rated on a five-point Likert scale from 1 (never) to 5 (always). The highest MIDAS score is 100. Quality of life (QOL) decreases as the scale score increases. Overall, 185 myocardial infarction (MI) patients were enrolled in this study. Cronbach alpha was used for the reliability analysis. Criterion validity, structural validity, and sensitivity analyses were used for the validity analysis. The New York Heart Association (NYHA) and the Canadian Cardiovascular Society Functional Classifications (CCSFC) were used to test criterion validity, and the SF-36 was used to test the construct validity of the Turkish version of the MIDAS. Cronbach alpha values ranged from 0.79 to 0.90 across the seven domains of the scale. No problematic items were observed for the entire scale. Medication-related domains of the MIDAS showed considerable floor effects (22.7%-35.7%). Confirmatory factor analysis indicators [Comparative Fit Index (CFI) = 0.95 and Root Mean Square Error of Approximation (RMSEA) = 0.075] supported the construct validity of the MIDAS. Convergent validity of the MIDAS was confirmed by correlations with the SF-36 scales where appropriate. Criterion validity results were also satisfactory when different stages of the NYHA and the CCSFC were compared (p<0.05). Overall, the results revealed that the Turkish version of the MIDAS is a reliable and valid instrument.
ERIC Educational Resources Information Center
Somers, Marie-Andrée; Zhu, Pei; Jacob, Robin; Bloom, Howard
2013-01-01
In this paper, we examine the validity and precision of two nonexperimental study designs (NXDs) that can be used in educational evaluation: the comparative interrupted time series (CITS) design and the difference-in-difference (DD) design. In a CITS design, program impacts are evaluated by looking at whether the treatment group deviates from its…
ERIC Educational Resources Information Center
Kardanova, Elena; Loyalka, Prashant; Chirikov, Igor; Liu, Lydia; Li, Guirong; Wang, Huan; Enchikova, Ekaterina; Shi, Henry; Johnson, Natalie
2016-01-01
Relatively little is known about differences in the quality of engineering education within and across countries because of the lack of valid instruments that allow for the assessment and comparison of engineering students' skill gains. The purpose of our study is to develop and validate instruments that can be used to compare student skill gains…
Loudon, Kirsty; Zwarenstein, Merrick; Sullivan, Frank; Donnan, Peter; Treweek, Shaun
2013-04-27
If you want to know which of two or more healthcare interventions is most effective, the randomised controlled trial is the design of choice. Randomisation, however, does not itself promote the applicability of the results to situations other than the one in which the trial was done. A tool published in 2009, PRECIS (PRagmatic Explanatory Continuum Indicator Summaries) aimed to help trialists design trials that produced results matched to the aim of the trial, be that supporting clinical decision-making, or increasing knowledge of how an intervention works. Though generally positive, groups evaluating the tool have also found weaknesses, mainly that its inter-rater reliability is not clear, that it needs a scoring system and that some new domains might be needed. The aim of the study is to: Produce an improved and validated version of the PRECIS tool. Use this tool to compare the internal validity of, and effect estimates from, a set of explanatory and pragmatic trials matched by intervention. The study has four phases. Phase 1 involves brainstorming and a two-round Delphi survey of authors who cited PRECIS. In Phase 2, the Delphi results will then be discussed and alternative versions of PRECIS-2 developed and user-tested by experienced trialists. Phase 3 will evaluate the validity and reliability of the most promising PRECIS-2 candidate using a sample of 15 to 20 trials rated by 15 international trialists. We will assess inter-rater reliability, and raters' subjective global ratings of pragmatism compared to PRECIS-2 to assess convergent and face validity. Phase 4, to determine if pragmatic trials sacrifice internal validity in order to achieve applicability, will compare the internal validity and effect estimates of matched explanatory and pragmatic trials of the same intervention, condition and participants. Effect sizes for the trials will then be compared in a meta-regression. 
The Cochrane Risk of Bias scores will be compared with the PRECIS-2 scores of pragmatism. We have concrete suggestions for improving PRECIS and a growing list of enthusiastic individuals interested in contributing to this work. By early 2014 we expect to have a validated PRECIS-2.
Comparative study between EDXRF and ASTM E572 methods using two-way ANOVA
NASA Astrophysics Data System (ADS)
Krummenauer, A.; Veit, H. M.; Zoppas-Ferreira, J.
2018-03-01
Comparison with a reference method is one of the requirements for the validation of non-standard methods. This comparison was made using the experiment planning technique with two-way ANOVA. In the ANOVA, the results obtained using the EDXRF method, which was to be validated, were compared with the results obtained using the ASTM E572-13 standard test method. Fisher's tests (F-tests) were used to compare the methods for the elements molybdenum, niobium, copper, nickel, manganese, chromium and vanadium. For all elements, the F-tests did not reject the null hypothesis (H0). As a result, there is no significant difference between the methods compared. Therefore, according to this study, the EDXRF method met this method-comparison requirement.
Hubert, C; Houari, S; Rozet, E; Lebrun, P; Hubert, Ph
2015-05-22
When using an analytical method, defining an analytical target profile (ATP) focused on quantitative performance represents a key input, and this will drive the method development process. In this context, two case studies were selected in order to demonstrate the potential of a quality-by-design (QbD) strategy when applied to two specific phases of the method lifecycle: the pre-validation study and the validation step. The first case study focused on the improvement of a liquid chromatography (LC) coupled to mass spectrometry (MS) stability-indicating method by the means of the QbD concept. The design of experiments (DoE) conducted during the optimization step (i.e. determination of the qualitative design space (DS)) was performed a posteriori. Additional experiments were performed in order to simultaneously conduct the pre-validation study to assist in defining the DoE to be conducted during the formal validation step. This predicted protocol was compared to the one used during the formal validation. A second case study based on the LC/MS-MS determination of glucosamine and galactosamine in human plasma was considered in order to illustrate an innovative strategy allowing the QbD methodology to be incorporated during the validation phase. An operational space, defined by the qualitative DS, was considered during the validation process rather than a specific set of working conditions as conventionally performed. Results of all the validation parameters conventionally studied were compared to those obtained with this innovative approach for glucosamine and galactosamine. Using this strategy, qualitative and quantitative information were obtained. Consequently, an analyst using this approach would be able to select with great confidence several working conditions within the operational space rather than a given condition for the routine use of the method. This innovative strategy combines both a learning process and a thorough assessment of the risk involved. 
Copyright © 2015 Elsevier B.V. All rights reserved.
Comparison of seven fall risk assessment tools in community-dwelling Korean older women.
Kim, Taekyoung; Xiong, Shuping
2017-03-01
This study aimed to compare seven widely used fall risk assessment tools in terms of validity and practicality, and to provide a guideline for choosing appropriate fall risk assessment tools for elderly Koreans. Sixty community-dwelling Korean older women (30 fallers and 30 matched non-fallers) were evaluated. Performance measures of all tools were compared between the faller and non-faller groups through two sample t-tests. Receiver Operating Characteristic curves were generated with odds ratios for discriminant analysis. Results showed that four tools had significant discriminative power, and the shortened version of Falls Efficacy Scale (SFES) showed excellent discriminant validity, followed by Berg Balance Scale (BBS) with acceptable discriminant validity. The Mini Balance Evaluation System Test and Timed Up and Go, however, had limited discriminant validities. In terms of practicality, SFES was also excellent. These findings suggest that SFES is the most suitable tool for assessing the fall risks of community-dwelling Korean older women, followed by BBS. Practitioner Summary: There is no general guideline on which fall risk assessment tools are suitable for community-dwelling Korean older women. This study compared seven widely used assessment tools in terms of validity and practicality. Results suggested that the short Falls Efficacy Scale is the most suitable tool, followed by Berg Balance Scale.
The CPT Reading Comprehension Test: A Validity Study.
ERIC Educational Resources Information Center
Napoli, Anthony R.; Raymond, Lanette A.; Coffey, Cheryl A.; Bosco, Diane M.
1998-01-01
Describes a study done at Suffolk County Community College (New York) that assessed the validity of the College Board's Computerized Placement Test in Reading Comprehension (CPT-R) by comparing test results of 1,154 freshmen with the results of the Degree of Power Reading Test. Results confirmed the CPT-R's reliability in identifying basic…
Cross-Cultural Validation of the Counselor Burnout Inventory in Hong Kong
ERIC Educational Resources Information Center
Shin, Hyojung; Yuen, Mantak; Lee, Jayoung; Lee, Sang Min
2013-01-01
This study investigated the cross-cultural validation of the Chinese translation of the Counselor Burnout Inventory (CBI) with a sample of school counselors in Hong Kong. Specifically, this study examined the CBI's factor structure using confirmatory factor analysis and calculated the effect size, to compare burnout scores among the counselors of…
Comparability of the Eating Disorder Inventory-2 Between Women and Men
ERIC Educational Resources Information Center
Spillane, Nichea S.; Boerner, Laura M.; Anderson, Kristen G.; Smith, Gregory T.
2004-01-01
Researchers studying eating disorders in men often use eating-disorder risk and symptom measures that have been validated only on women. Using a sample of 215 college women and 214 college men, this article reports on the validity of the Eating Disorder Inventory-2 (EDI-2), one of the best-validated among women and the most widely used risk and…
Fuermaier, Anselm B M; Tucha, Oliver; Koerts, Janneke; Lange, Klaus W; Weisbrod, Matthias; Aschenbrenner, Steffen; Tucha, Lara
2017-12-01
The assessment of performance validity is an essential part of the neuropsychological evaluation of adults with attention-deficit/hyperactivity disorder (ADHD). Most available tools, however, are inaccurate regarding the identification of noncredible performance. This study describes the development of a visuospatial working memory test, including a validity indicator for noncredible cognitive performance of adults with ADHD. Visuospatial working memory of adults with ADHD (n = 48) was first compared to the test performance of healthy individuals (n = 48). Furthermore, a simulation design was performed including 252 individuals who were randomly assigned to either a control group (n = 48) or to 1 of 3 simulation groups who were requested to feign ADHD (n = 204). Additional samples of 27 adults with ADHD and 69 instructed simulators were included to cross-validate findings from the first samples. Adults with ADHD showed impaired visuospatial working memory performance of medium size as compared to healthy individuals. Simulation groups committed significantly more errors and had shorter response times as compared to patients with ADHD. Moreover, binary logistic regression analysis was carried out to derive a validity index that optimally differentiates between true and feigned ADHD. ROC analysis demonstrated high classification rates of the validity index, as shown in excellent specificity (95.8%) and adequate sensitivity (60.3%). The visuospatial working memory test as presented in this study therefore appears sensitive in indicating cognitive impairment of adults with ADHD. Furthermore, the embedded validity index revealed promising results concerning the detection of noncredible cognitive performance of adults with ADHD. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Validity of Bioelectrical Impedance Analysis to Estimation Fat-Free Mass in the Army Cadets.
Langer, Raquel D; Borges, Juliano H; Pascoa, Mauro A; Cirolini, Vagner X; Guerra-Júnior, Gil; Gonçalves, Ezequiel M
2016-03-11
Bioelectrical Impedance Analysis (BIA) is a fast, practical, non-invasive, and frequently used method for fat-free mass (FFM) estimation. The aims of this study were to validate predictive BIA equations for FFM estimation in Army cadets and to develop and validate a specific BIA equation for this population. A total of 396 male Brazilian Army cadets, aged 17-24 years, were included. The study used eight published predictive BIA equations, a specific equation for FFM estimation, and dual-energy X-ray absorptiometry (DXA) as a reference method. Student's t-test (for paired samples), linear regression analysis, and the Bland-Altman method were used to test the validity of the BIA equations. Predictive BIA equations showed significant differences in FFM compared to DXA (p < 0.05) and large limits of agreement by Bland-Altman. Predictive BIA equations explained 68% to 88% of FFM variance. Specific BIA equations showed no significant differences in FFM compared to DXA values. Published BIA predictive equations showed poor accuracy in this sample. The specific BIA equations developed in this study demonstrated validity for this sample, although they should be used with caution in samples with a large range of FFM.
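The Bland-Altman method mentioned above summarises agreement between two measurement methods by the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. A minimal sketch follows; the function name and the paired values are made up for illustration and are not BIA or DXA data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    method_a, method_b -- equal-length lists of measurements on the same
    subjects (hypothetical inputs; units must match).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up paired estimates from two methods on four subjects.
bias, lower, upper = bland_altman([10, 12, 14, 16], [9, 12, 13, 16])
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

Wide limits of agreement, as the abstract reports for the published equations, mean that individual estimates can differ substantially between methods even when the average bias is small.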
How to test validity in orthodontic research: a mixed dentition analysis example.
Donatelli, Richard E; Lee, Shin-Jae
2015-02-01
The data used to test the validity of a prediction method should be different from the data used to generate the prediction model. In this study, we explored whether an independent data set is mandatory for testing the validity of a new prediction method and how validity can be tested without independent new data. Several validation methods were compared in an example using the data from a mixed dentition analysis with a regression model. The validation errors of real mixed dentition analysis data and simulation data were analyzed for increasingly large data sets. The validation results of both the real and the simulation studies demonstrated that the leave-1-out cross-validation method had the smallest errors. The largest errors occurred in the traditional simple validation method. The differences between the validation methods diminished as the sample size increased. The leave-1-out cross-validation method seems to be an optimal validation method for improving the prediction accuracy in a data set with limited sample sizes. Copyright © 2015 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.
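As a rough illustration of leave-1-out cross-validation for a regression model, the sketch below refits a simple least-squares line with each observation held out in turn and averages the squared prediction errors. The helper names and data are assumptions; the abstract does not specify the regression model or the exact form of the "traditional simple validation method", so this is one plausible reading rather than the authors' procedure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loocv_mse(xs, ys):
    """Mean squared error when each point is predicted by a model
    fitted to all the other points (xs and ys must be lists)."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return sum(errors) / len(errors)


# Made-up data: a noisy linear trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(loocv_mse(xs, ys))
```

Because every held-out point is predicted by a model that never saw it, this error estimate uses no independent data set yet is not flattered by evaluating the model on its own training data.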
Larsen, Lisbeth Runge; Jørgensen, Martin Grønbech; Junge, Tina; Juul-Kristensen, Birgit; Wedderkopp, Niels
2014-06-10
Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children's movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Fifty-four 10-14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. Concurrent validity of NWB compared with AMTI was satisfactory. 
Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB.
2014-01-01
Background Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children’s movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Methods Fifty-four 10–14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Results Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Conclusion Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. 
Concurrent validity of NWB compared with AMTI was satisfactory. Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB. PMID:24913461
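The agreement statistics named in this abstract (Bland-Altman limits of agreement and the concordance correlation coefficient) can be illustrated with a minimal sketch. The function names and the sample values are hypothetical and are not taken from the study; the CCC is computed in Lin's population-variance form, which is an assumption about the variant used.

```python
import numpy as np

def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) of agreement between two methods."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def concordance_ccc(a, b):
    """Lin's concordance correlation coefficient (population-variance form)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    return 2 * cov / (a.var() + b.var() + (a.mean() - b.mean()) ** 2)

# Hypothetical centre-of-pressure path lengths (cm); not data from the study.
nwb = [41.2, 55.0, 38.7, 62.3, 47.9, 51.4]
amti = [43.0, 53.1, 40.2, 60.8, 49.5, 50.0]

bias, lower, upper = bland_altman_limits(nwb, amti)
ccc = concordance_ccc(nwb, amti)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f}), CCC={ccc:.3f}")
```

Unlike a Pearson correlation, the CCC is penalised by a systematic offset between the two devices, which is why it suits a device-comparison question like this one.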
Poljak, Mario; Oštrbenk, Anja
2013-01-01
Human papillomavirus (HPV) testing has become an essential part of current clinical practice in the management of cervical cancer and precancerous lesions. We reviewed the most important validation studies of a next-generation real-time polymerase chain reaction (PCR)-based assay, the RealTime High Risk HPV test (RealTime) (Abbott Molecular, Des Plaines, IL, USA), for triage in referral population settings and for use in primary cervical cancer screening in women 30 years and older published in peer-reviewed journals from 2009 to 2013. RealTime is designed to detect 14 high-risk HPV genotypes with concurrent distinction of HPV-16 and HPV-18 from 12 other HPV genotypes. The test was launched on the European market in January 2009 and is currently used in many laboratories worldwide for routine detection of HPV. Eight validation studies of RealTime in referral settings showed its consistently high absolute clinical sensitivity for both CIN2+ (range 88.3-100%) and CIN3+ (range 93.0-100%), as well as comparative clinical sensitivity relative to the currently most widely used HPV test: the Qiagen/Digene Hybrid Capture 2 HPV DNA Test (HC2). Due to the significantly different composition of the referral populations, RealTime absolute clinical specificity for CIN2+ and CIN3+ varied greatly across studies, but was comparable relative to HC2. Four validation studies of RealTime performance in cervical cancer screening settings showed its consistently high absolute clinical sensitivity for both CIN2+ and CIN3+, as well as comparative clinical sensitivity and specificity relative to HC2 and GP5+/6+ PCR. RealTime has been extensively evaluated in the last 4 years. RealTime can be considered clinically validated for triage in referral population settings and for use in primary cervical cancer screening in women 30 years and older.
Impact of External Cue Validity on Driving Performance in Parkinson's Disease
Scally, Karen; Charlton, Judith L.; Iansek, Robert; Bradshaw, John L.; Moss, Simon; Georgiou-Karistianis, Nellie
2011-01-01
This study sought to investigate the impact of external cue validity on simulated driving performance in 19 Parkinson's disease (PD) patients and 19 healthy age-matched controls. Braking points and distance between deceleration point and braking point were analysed for red traffic signals preceded by Valid Cues (correctly predicting signal), Invalid Cues (incorrectly predicting signal), or No Cues. Results showed that PD drivers braked significantly later and travelled significantly further between deceleration and braking points compared with controls for Invalid and No-Cue conditions. No significant group differences were observed for driving performance in response to Valid Cues. The benefit of Valid Cues relative to Invalid Cues and No Cues was significantly greater for PD drivers compared with controls. Trail Making Test (B-A) scores correlated with driving performance for PD drivers only. These results highlight the importance of external cues and higher cognitive functioning for driving performance in mild to moderate PD. PMID:21789275
Reliability and Validity of the Greek Migraine Disability Assessment (MIDAS) Questionnaire.
Oikonomidi, Theodora; Vikelis, Michail; Artemiadis, Artemios; Chrousos, George P; Darviri, Christina
2018-03-01
The Migraine Disability Assessment (MIDAS) Questionnaire is a reliable and valid instrument for migraine-related disability. Such a tool is needed to quantify migraine-related disability in the Greek population. This validation study aims to assess the test-retest reliability, internal consistency, item discriminant and convergent validity of the Greek translation of the MIDAS. Adults diagnosed with migraine completed the MIDAS Questionnaire on two occasions 3 weeks apart to assess reliability, and completed the RAND-36 to assess validity. Participants (n = 152) had a median MIDAS score of 24 and mostly severe disability (58% were grade IV). The test-retest reliability analysis (N = 59) revealed excellent reliability for the total score. Internal consistency was α = 0.71 for initial and α = 0.82 for retest completion. For item discriminant validity, the correlations between each question and the total score were significant, with high correlations for questions 2-5 (range 0.67 ≤ r ≤ 0.79; p < 0.01). For convergent validity, there was a significant negative correlation between the total score and all RAND-36 subscales except for 'emotional wellbeing'. The negative correlation indicates that patients with a lower degree of disability according to their MIDAS score tended to have better wellbeing. Psychometric properties are comparable with those of other published validation studies of the MIDAS and the original. Findings on question 1 show that missing work/school days may be closely related to increased affect issues. The Greek version of the MIDAS Questionnaire has good reliability and validity. This study allowed for cross-cultural comparability of research findings.
Onwujekwe, Obinna
2004-02-01
Contingent valuation question formats that will be used to elicit willingness to pay for goods and services need to be relevant to the area in which they will be used for responses to be valid. A novel contingent valuation question format called the "structured haggling technique" (SH) that resembles the bargaining system in Nigerian markets was designed and its criterion and content validity compared with those of the bidding game (BG) and binary-with-follow-up (BWFU) technique. This was achieved by determining the willingness to pay (WTP) for insecticide-treated nets (ITNs) in Southeast Nigeria. Content validity was determined through observation of actual trading of untreated nets together with interviews with sellers and consumers. Criterion validity was determined by comparing stated and actual WTP. Stated WTP was determined using a questionnaire administered to 810 household heads and actual WTP was determined by offering the nets for sale to all respondents one month later. The phi (correlation) coefficient was used to compare criterion validity across question formats. The phi coefficients were SH (0.60: 95% C.I. 0.50-0.71), BG (0.42: 95% C.I. 0.29-0.54) and the BWFU (0.32: 95% C.I. 0.20-0.44), implying that the BG and SH had similar levels of criterion-validity while the BWFU was the least criterion-valid. However, the SH was the most content-valid. It is necessary to validate the findings in other areas where haggling is common. Future studies should establish the content validity of question formats in the contexts in which they will be used before administering questionnaires.
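As an illustration of the phi coefficient used above, the following minimal sketch computes it from a 2x2 table. It assumes stated and actual willingness to pay are each coded as a binary yes/no outcome, which the abstract does not specify; the function name and the counts are hypothetical.

```python
import math

def phi_coefficient(n11, n10, n01, n00):
    """Phi coefficient for a 2x2 table.

    n11: stated yes, actual yes    n10: stated yes, actual no
    n01: stated no,  actual yes    n00: stated no,  actual no
    """
    num = n11 * n00 - n10 * n01
    den = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if den == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return num / den

# Hypothetical counts for one question format; not data from the study.
print(round(phi_coefficient(120, 40, 30, 80), 3))
```

For two binary variables, phi equals the Pearson correlation of the 0/1 codes, so values nearer 1 indicate that stated purchases track actual purchases more closely.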
Jovanović, Veljko
2016-12-01
The validity of the life satisfaction measures commonly used among adults has been rarely examined in adolescent samples. The present research had two main goals: (1) to evaluate the structural validity of the Satisfaction with Life Scale (SWLS) among adolescents and to test measurement invariance across gender; (2) to compare the criterion and convergent validity of the SWLS and single-item life satisfaction measures among adolescents. Three samples of Serbian adolescents were recruited for the present research. Study 1 (N = 481, M age = 17.01 years) examined the structure of the SWLS via confirmatory factor analysis (CFA) and evaluated measurement invariance of the SWLS across gender by a multi-group CFA. Study 2 (N = 283, M age = 17.34 years) and Study 3 (N = 220, M age = 16.73 years) compared the convergent validity of the SWLS and single-item life satisfaction measures. The results of Study 1 supported the original one-factor model of the SWLS among adolescents and provided evidence for strong measurement invariance of the SWLS across gender. The findings of Study 2 and Study 3 showed that the SWLS and single-item measures were equally valid and strongly associated (r = .734 in Study 2 and r = .668 in Study 3). No substantial differences in correlations with school success and well-being indicators were found between the SWLS and single-item measures. Our findings support the use of the SWLS among adolescents and indicate that single-item life satisfaction measures perform as well as the SWLS in adolescent samples.
NASA Astrophysics Data System (ADS)
Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra
2013-03-01
Summary In regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, the model selection is usually based on some measurements of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed percentage leave out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has widely been adopted in Chemometrics and Econometrics) in RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
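The difference between LOO and MCCV can be sketched as follows. This is a minimal illustration only: it substitutes ordinary least squares for the GLSR used in the paper, and the function names, the 30% test fraction, the number of splits and the synthetic data are all assumptions rather than values from the study.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept (a stand-in for GLSR)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def loo_mse(X, y):
    """Leave-one-out: each observation is held out exactly once."""
    n = len(y)
    errs = []
    for i in range(n):
        keep = np.arange(n) != i
        pred = fit_predict(X[keep], y[keep], X[~keep])
        errs.append((y[~keep] - pred) ** 2)
    return float(np.mean(errs))

def mccv_mse(X, y, n_splits=200, test_frac=0.3, seed=0):
    """Monte Carlo CV: repeated random train/test splits, errors averaged."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_frac * n)))
    errs = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        pred = fit_predict(X[train], y[train], X[test])
        errs.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errs))

# Synthetic catchment-style predictors and response; purely illustrative.
rng = np.random.default_rng(42)
X = rng.normal(size=(40, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=40)
print("LOO MSE: ", loo_mse(X, y))
print("MCCV MSE:", mccv_mse(X, y))
```

Because MCCV trains on a smaller share of the data in each split, it tends to penalise over-parameterised models more heavily than LOO, which is consistent with the parsimony finding reported above.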
Medicinal plants used by the Tamang community in the Makawanpur district of central Nepal
2014-01-01
Background We can conserve cultural heritage and gain extensive knowledge of plant species with pharmacological potential to cure simple to life-threatening diseases by studying the use of plants in indigenous communities. Therefore, it is important to conduct ethnobotanical studies in indigenous communities and to validate the reported uses of plants by comparing ethnobotanical studies with phytochemical and pharmacological studies. Materials and methods This study was conducted in a Tamang community dwelling in the Makawanpur district of central Nepal. We used semi-structured and structured questionnaires during interviews to collect information. We compared use reports with available phytochemical and pharmacological studies for validation. Results A total of 161 plant species belonging to 86 families and 144 genera to cure 89 human ailments were documented. Although 68 plant species were cited as medicinal in previous studies, 55 different uses described by the Tamang people were not found in any of the compared studies. Traditional uses for 60 plant species were consistent with pharmacological and phytochemical studies. Conclusions The Tamang people in Makawanpur are rich in ethnopharmacological understanding. The present study highlights important medicinal plant species by validating their traditional uses. Different plant species can improve local economies through proper harvesting, adequate management and development of modern techniques to maximize their use. PMID:24410808
ERIC Educational Resources Information Center
Shogren, Karrie A.; Wehmeyer, Michael L.; Seo, Hyojeong; Thompson, James R.; Schalock, Robert L.; Hughes, Carolyn; Little, Todd D.; Palmer, Susan B.
2017-01-01
This study compared the reliability, validity, and measurement properties of the "Supports Intensity Scale-Children's Version" (SIS-C) in children with autism and intellectual disability (n = 2,124) and children with intellectual disability only (n = 1,861). The results suggest that SIS-C is a valid and reliable tool in both populations.…
2014-01-01
Background Fatigue is a disabling symptom associated with reduced quality of life in various populations living with chronic illnesses. The transfer of knowledge about fatigue from one group to another is crucial in both research and healthcare. Outcomes should be validly and reliably comparable between groups and should not be unduly influenced by diagnostic variations. The present study evaluates whether the Fatigue Severity Scale 7-item version (FSS-7) demonstrates similar item hierarchy across people with multiple sclerosis, stroke or HIV/AIDS to ensure valid comparisons between groups, and provide further evidence of internal scale validity. Methods A secondary comparative analysis was performed using data from three different studies of three different chronic illnesses: multiple sclerosis, stroke and HIV/AIDS. Each of these studies had previously concluded that the FSS-7 has better psychometric properties than the original FSS for measuring fatigue interference. Data from 224 people with multiple sclerosis, 104 people with stroke and 316 people with HIV/AIDS were examined. Item response theory and a Rasch model were chosen to analyze the similarity of the FSS-7 item hierarchy across the three diagnostic groups. Results Cross-sample differences were found for items #3, #5, #6 and #9 for two of the three samples, which raise questions about item validity across groups. However, disease-specific and disease-generic Rasch measures were similar across samples, indicating that individual fatigue interference measures in these three chronic illnesses might still be reliably comparable using the FSS-7. Conclusions Some items performed differently between the three samples but did not bias person measures, thereby indicating that fatigue interference in these illnesses might still be reliably compared using FSS-7 scores. However, caution is warranted when comparing fatigue raw sum scores directly across diagnostic groups using the FSS-7.
Further studies of the scale are needed in other types of chronic illnesses. PMID:24559076
Educational testing validity and reliability in pharmacy and medical education literature.
Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J
2013-12-16
To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals with that reported in the medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education compared to medical education, reporting of validity and reliability was limited in the pharmacy education literature.
Elaboration Preferences and Differences in Learning Proficiency.
ERIC Educational Resources Information Center
Rohwer, William D., Jr.; Levin, Joel R.
The major emphasis of this study is on the comparative validities of paired-associate learning tests and IQ tests in predicting reading achievement. The study engages in a brief review of earlier research in order to examine the validity of two assumptions--that the construction and/or the use of a tactic that simplifies a learning task is one of…
Concurrent Validity of the Classroom Strategies Scale for Elementary School--Observer Form
ERIC Educational Resources Information Center
Reddy, Linda A.; Fabiano, Gregory A.; Dudek, Christopher M.
2013-01-01
The present study is an initial investigation of the concurrent validity of a new assessment, the Classroom Strategies Scale (CSS version 2.0) for Elementary School--Observer Form. The CSS assesses teachers' use of instructional and behavioral management strategies. In the present study, the CSS is compared to the Classroom Assessment Scoring…
The Validity of Truant Youths' Marijuana Use and Its Impact on Alcohol Use and Sexual Risk Taking
ERIC Educational Resources Information Center
Dembo, Richard; Briones-Robinson, Rhissa; Barrett, Kimberly; Winters, Ken C.; Ungaro, Rocío; Karas, Lora; Belenko, Steven; Wareham, Jennifer
2015-01-01
Few studies investigating the validity of marijuana use have used samples of truant youths. In the current study, self-reports of marijuana use are compared with urine test results for marijuana to identify marijuana underreporting among adolescents participating in a longitudinal brief intervention for drug-involved truant youths. It was…
Defining Surrogate Endpoints for Clinical Trials in Severe Falciparum Malaria
Plewes, Katherine; Maude, Richard J.; Hanson, Josh; Herdman, M. Trent; Leopold, Stije J.; Ngernseng, Thatsanun; Charunwatthana, Prakaykaew; Phu, Nguyen Hoan; Ghose, Aniruddha; Hasan, M. Mahtab Uddin; Fanello, Caterina I.; Faiz, Md Abul; Hien, Tran Tinh; Day, Nicholas P. J.; White, Nicholas J.; Dondorp, Arjen M.
2017-01-01
Background Clinical trials in severe falciparum malaria require a large sample size to detect clinically meaningful differences in mortality. This means few interventions can be evaluated at any time. Using a validated surrogate endpoint for mortality would provide a useful alternative allowing a smaller sample size. Here we evaluate changes in coma score and plasma lactate as surrogate endpoints for mortality in severe falciparum malaria. Methods Three datasets of clinical studies in severe malaria were re-evaluated: studies from Chittagong, Bangladesh (adults), the African ‘AQUAMAT’ trial comparing artesunate and quinine (children), and the Vietnamese ‘AQ’ study (adults) comparing artemether with quinine. The absolute change, relative change, slope of the normalization over time, and time to normalization were derived from sequential measurements of plasma lactate and coma score, and validated for their use as surrogate endpoint, including the proportion of treatment effect on mortality explained (PTE) by these surrogate measures. Results Improvements in lactate concentration or coma scores over the first 24 hours of admission, were strongly prognostic for survival in all datasets. In hyperlactataemic patients in the AQ study (n = 173), lower mortality with artemether compared to quinine closely correlated with faster reduction in plasma lactate concentration, with a high PTE of the relative change in plasma lactate at 8 and 12 hours of 0.81 and 0.75, respectively. In paediatric patients enrolled in the ‘AQUAMAT’ study with cerebral malaria (n = 785), mortality was lower with artesunate compared to quinine, but this was not associated with faster coma recovery. Conclusions The relative changes in plasma lactate concentration assessed at 8 or 12 hours after admission are valid surrogate endpoints for severe malaria studies on antimalarial drugs or adjuvant treatments aiming at improving the microcirculation. 
Measures of coma recovery are not valid surrogate endpoints for mortality. PMID:28052109
Defining Surrogate Endpoints for Clinical Trials in Severe Falciparum Malaria.
Jeeyapant, Atthanee; Kingston, Hugh W; Plewes, Katherine; Maude, Richard J; Hanson, Josh; Herdman, M Trent; Leopold, Stije J; Ngernseng, Thatsanun; Charunwatthana, Prakaykaew; Phu, Nguyen Hoan; Ghose, Aniruddha; Hasan, M Mahtab Uddin; Fanello, Caterina I; Faiz, Md Abul; Hien, Tran Tinh; Day, Nicholas P J; White, Nicholas J; Dondorp, Arjen M
2017-01-01
Clinical trials in severe falciparum malaria require a large sample size to detect clinically meaningful differences in mortality. This means few interventions can be evaluated at any time. Using a validated surrogate endpoint for mortality would provide a useful alternative allowing a smaller sample size. Here we evaluate changes in coma score and plasma lactate as surrogate endpoints for mortality in severe falciparum malaria. Three datasets of clinical studies in severe malaria were re-evaluated: studies from Chittagong, Bangladesh (adults), the African 'AQUAMAT' trial comparing artesunate and quinine (children), and the Vietnamese 'AQ' study (adults) comparing artemether with quinine. The absolute change, relative change, slope of the normalization over time, and time to normalization were derived from sequential measurements of plasma lactate and coma score, and validated for their use as surrogate endpoint, including the proportion of treatment effect on mortality explained (PTE) by these surrogate measures. Improvements in lactate concentration or coma scores over the first 24 hours of admission, were strongly prognostic for survival in all datasets. In hyperlactataemic patients in the AQ study (n = 173), lower mortality with artemether compared to quinine closely correlated with faster reduction in plasma lactate concentration, with a high PTE of the relative change in plasma lactate at 8 and 12 hours of 0.81 and 0.75, respectively. In paediatric patients enrolled in the 'AQUAMAT' study with cerebral malaria (n = 785), mortality was lower with artesunate compared to quinine, but this was not associated with faster coma recovery. The relative changes in plasma lactate concentration assessed at 8 or 12 hours after admission are valid surrogate endpoints for severe malaria studies on antimalarial drugs or adjuvant treatments aiming at improving the microcirculation. Measures of coma recovery are not valid surrogate endpoints for mortality.
Al-Quwaidhi, Abdulkareem J.; Pearce, Mark S.; Sobngwi, Eugene; Critchley, Julia A.; O’Flaherty, Martin
2014-01-01
Aims To compare the estimates and projections of type 2 diabetes mellitus (T2DM) prevalence in Saudi Arabia from a validated Markov model against other modelling estimates, such as those produced by the International Diabetes Federation (IDF) Diabetes Atlas and the Global Burden of Disease (GBD) project. Methods A discrete-state Markov model was developed and validated that integrates data on population, obesity and smoking prevalence trends in adult Saudis aged ≥25 years to estimate the trends in T2DM prevalence (annually from 1992 to 2022). The model was validated by comparing the age- and sex-specific prevalence estimates against a national survey conducted in 2005. Results Prevalence estimates from this new Markov model were consistent with the 2005 national survey and very similar to the GBD study estimates. Prevalence in men and women in 2000 was estimated by the GBD model respectively at 17.5% and 17.7%, compared to 17.7% and 16.4% in this study. The IDF estimates of the total diabetes prevalence were considerably lower at 16.7% in 2011 and 20.8% in 2030, compared with 29.2% in 2011 and 44.1% in 2022 in this study. Conclusion In contrast to other modelling studies, both the Saudi IMPACT Diabetes Forecast Model and the GBD model directly incorporated the trends in obesity prevalence and/or body mass index (BMI) to inform T2DM prevalence estimates. It appears that such a direct incorporation of obesity trends in modelling studies results in higher estimates of the future prevalence of T2DM, at least in countries where obesity has been rapidly increasing. PMID:24447810
López-Ortega, Mariana; Torres-Castro, Sara; Rosas-Carrasco, Oscar
2016-12-09
The Satisfaction with Life Scale (SWLS) has been widely used and has proven to be a valid and reliable instrument for assessing satisfaction with life in diverse population groups; however, research on satisfaction with life and validation of different measuring instruments in Mexican adults is still lacking. The objective was to evaluate the psychometric properties of the SWLS in a representative sample of Mexican adults. This is a methodological study to evaluate a satisfaction with life scale in a sample of 13,220 Mexican adults 50 years of age or older from the 2012 Mexican Health and Aging Study. The scale's reliability (internal consistency) was analysed using Cronbach's alpha and inter-item correlations. An exploratory factor analysis was also performed. Known-groups validity was evaluated comparing good-health and bad-health participants. Comorbidity, perceived financial situation, self-reported general health, depression symptoms, and social support were included to evaluate the validity between these measures and the total score of the scale using Spearman's correlations. The analysis of the scale's reliability showed good internal consistency (α = 0.74). The exploratory factor analysis confirmed the existence of a unique factor structure that explained 54% of the variance. SWLS was related to depression, perceived health, financial situation, and social support, and these relations were all statistically significant (P < .01). There was a significant difference in life satisfaction between the good- and bad-health groups. Results show good internal consistency and construct validity of the SWLS. These results are comparable with results from previous studies. Meeting the study's objective to validate the scale, the results show that the Spanish version of the SWLS is a reliable and valid measure of satisfaction with life in the Mexican context.
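The internal-consistency statistic reported above, Cronbach's alpha, can be computed as in the following minimal sketch. The function name and the response matrix are hypothetical; the five columns merely mimic a five-item scale scored 1 to 7 and are not data from the study.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; rows are respondents, columns are scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical responses from six people to a five-item scale.
responses = [
    [6, 5, 6, 5, 4],
    [3, 3, 4, 2, 3],
    [7, 6, 7, 6, 6],
    [4, 4, 3, 4, 2],
    [5, 5, 5, 6, 5],
    [2, 3, 2, 3, 1],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises when items covary strongly relative to their individual variances, so it reflects how consistently the items move together rather than whether they measure a single factor.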
ERIC Educational Resources Information Center
Beck-Ellsworth, Danielle
2011-01-01
Purpose of study: The purpose of the validity study was to establish known-groups validity of the two measures used for the main study by comparing the responses of the Eating Disorder experts and non-experts. The purpose of the main study was to develop an on-line introductory workshop on eating disorders and investigate the levels of competency…
Campbell, J Q; Coombs, D J; Rao, M; Rullkoetter, P J; Petrella, A J
2016-09-06
The purpose of this study was to seek broad verification and validation of human lumbar spine finite element models created using a previously published automated algorithm. The automated algorithm takes segmented CT scans of lumbar vertebrae, automatically identifies important landmarks and contact surfaces, and creates a finite element model. Mesh convergence was evaluated by examining changes in key output variables in response to mesh density. Semi-direct validation was performed by comparing experimental results for a single specimen to the automated finite element model results for that specimen with calibrated material properties from a prior study. Indirect validation was based on a comparison of results from automated finite element models of 18 individual specimens, all using one set of generalized material properties, to a range of data from the literature. A total of 216 simulations were run and compared to 186 experimental data ranges in all six primary bending modes up to 7.8Nm with follower loads up to 1000N. Mesh convergence results showed less than a 5% difference in key variables when the original mesh density was doubled. The semi-direct validation results showed that the automated method produced results comparable to manual finite element modeling methods. The indirect validation results showed a wide range of outcomes due to variations in the geometry alone. The studies showed that the automated models can be used to reliably evaluate lumbar spine biomechanics, specifically within our intended context of use: in pure bending modes, under relatively low non-injurious simulated in vivo loads, to predict torque rotation response, disc pressures, and facet forces. Copyright © 2016 Elsevier Ltd. All rights reserved.
Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H
2017-10-23
Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was (ICC2,3) = 0.700 (p < 0.05), suggesting a good relationship between the two core stability measures. Test-retest reliability was (ICC3,3) = 0.953 (p < 0.05), indicating excellent consistency between the repeated DNS-HS measurements. Criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. Test-retest reliability data suggest that the DNS-HS test is a reliable measure of core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.
ERIC Educational Resources Information Center
Adeyemo, Emily Oluseyi
2012-01-01
This study examined the impact of publication bias on a meta-analysis of empirical studies on validity of University Matriculation Examinations in Nigeria with a view to determine the level of difference between published and unpublished articles. Specifically, the design was an ex-post facto, a causal comparative design. The sample size consisted…
Leonardi, Matilde; Chatterji, Somnath; Koskinen, Seppo; Ayuso-Mateos, Jose Luis; Haro, Josep Maria; Frisoni, Giovanni; Frattura, Lucilla; Martinuzzi, Andrea; Tobiasz-Adamczyk, Beata; Gmurek, Michal; Serrano, Ramon; Finocchiaro, Carla
2014-01-01
COURAGE in Europe was a 3-year project involving 12 partners from four European countries and the World Health Organization. It was inspired by the pressing need to integrate international studies on disability and ageing in light of an innovative perspective based on a validated data-collection protocol. COURAGE in Europe Project collected data on the determinants of health and disability in an ageing population, with specific tools for the evaluation of the role of the built environment and social networks on health, disability, quality of life and well-being. The main survey was conducted by partners in Finland, Poland and Spain where the survey has been administered to a sample of 10,800 persons, which was completed in March 2012. The newly developed and validated COURAGE Protocol for Ageing Studies has proven to be a valid tool for collecting comparable data in ageing population, and the COURAGE in Europe Project has created valid and reliable scientific evidence, demonstrating cross-country comparability, for disability and ageing research and policy development. It is therefore recommended that future studies exploring determinants of health and disability in ageing use the COURAGE-derived methodology. COURAGE in Europe Project collected data on the determinants of health and disability in an ageing population, with specific tools for the evaluation of the role of built environment and social networks on health, disability quality of life and well-being. The COURAGE Protocol for Ageing Studies has proven to be a valid tool for collecting comparable data in the ageing population. The COURAGE in Europe Consortium recommends that future studies exploring determinants of health and disability in ageing use COURAGE-derived methodology. Copyright © 2013 John Wiley & Sons, Ltd.
Sleep-Wake Evaluation from Whole-Night Non-Contact Audio Recordings of Breathing Sounds
Dafna, Eliran; Tarasiuk, Ariel; Zigel, Yaniv
2015-01-01
Study Objectives To develop and validate a novel non-contact system for whole-night sleep evaluation using breathing sounds analysis (BSA). Design Whole-night breathing sounds (recorded with an ambient microphone) and polysomnography (PSG) were simultaneously collected at a sleep laboratory (mean recording time 7.1 hours). A set of acoustic features quantifying breathing pattern was developed to distinguish between sleep and wake epochs (30 sec segments). Epochs (n = 59,108 design study and n = 68,560 validation study) were classified using an AdaBoost classifier and validated epoch-by-epoch for sensitivity, specificity, positive and negative predictive values, accuracy, and Cohen's kappa. Sleep quality parameters were calculated based on the sleep/wake classifications and compared with PSG for validity. Setting University affiliated sleep-wake disorder center and biomedical signal processing laboratory. Patients One hundred and fifty patients (age 54.0±14.8 years, BMI 31.6±5.5 kg/m², m/f 97/53) referred for PSG were prospectively and consecutively recruited. The system was trained (design study) on 80 subjects; the validation study was blindly performed on the additional 70 subjects. Measurements and Results Epoch-by-epoch accuracy rate for the validation study was 83.3% with sensitivity of 92.2% (sleep as sleep), specificity of 56.6% (awake as awake), and Cohen's kappa of 0.508. Comparing sleep quality parameters of BSA and PSG demonstrated average errors in sleep latency, total sleep time, wake after sleep onset, and sleep efficiency of 16.6 min, 35.8 min, 29.6 min, and 8%, respectively. Conclusions This study provides evidence that sleep-wake activity and sleep quality parameters can be reliably estimated solely using breathing sound analysis. This study highlights the potential of this innovative approach to measure sleep in research and clinical circumstances. PMID:25710495
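The epoch-by-epoch agreement statistics named in this abstract (sensitivity, specificity, predictive values, accuracy, Cohen's kappa) can all be derived from a 2x2 table of classifier output against the reference scoring. The following is a minimal sketch, assuming "sleep" is treated as the positive class; the function name and the counts in the usage line are hypothetical and are not data from the study.

```python
def epoch_metrics(tp, fn, tn, fp):
    """Agreement statistics for a binary sleep/wake classifier.

    tp: sleep epochs scored as sleep, fn: sleep epochs scored as wake,
    tn: wake epochs scored as wake,   fp: wake epochs scored as sleep.
    (Treating sleep as the positive class is an assumption of this sketch.)
    """
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of the 2x2 table.
    p_chance = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


# Hypothetical counts, for illustration only.
result = epoch_metrics(tp=40, fn=10, tn=30, fp=20)
```

Kappa corrects accuracy for chance agreement, which is why a classifier with high sensitivity but modest specificity can show good accuracy and only moderate kappa.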
Fooken, Jonas
2017-03-10
The present study investigates the external validity of emotional value measured in economic laboratory experiments by using a physiological indicator of stress, heart rate variability (HRV). While there is ample evidence supporting the external validity of economic experiments, there is little evidence comparing the magnitude of internal levels of emotional stress during decision making with external stress. The current study addresses this gap by comparing the magnitudes of decision stress experienced in the laboratory with the stress from outside the laboratory. To quantify a large change in HRV, measures observed in the laboratory during decision-making are compared to the difference between HRV during a university exam and other mental activity for the same individuals in and outside of the laboratory. The results outside the laboratory inform about the relevance of laboratory findings in terms of their relative magnitude. Results show that psychologically induced HRV changes observed in the laboratory, particularly in connection with social preferences, correspond to large effects outside. This underscores the external validity of laboratory findings and shows the magnitude of emotional value connected to pro-social economic decisions in the laboratory.
Toth, Anna M.; Bliss, Donna Z.; Savik, Kay; Wyman, Jean F.
2011-01-01
Perineal dermatitis is one of the main complications of incontinence and increases the cost of health care. The Minimum Data Set (MDS) contains data about factors associated with perineal dermatitis identified in a published conceptual model of perineal dermatitis. The purpose of this study was to determine the validity of MDS data related to perineal dermatitis risk factors by comparing them with data in nursing home chart records. Findings indicate that MDS items defining factors associated with perineal dermatitis were valid and supported use of the MDS in further investigation of a significant, costly, and understudied health problem of nursing home residents. PMID:18512629
Cerruto, Maria A; D'Elia, Carolina; Siracusano, Salvatore; Porcaro, Antonio B; Cacciamani, Giovanni; De Marchi, Davide; Niero, Mauro; Lonardi, Cristina; Iafrate, Massimo; Bassi, Pierfrancesco; Belgrano, Emanuele; Imbimbo, Ciro; Racioppi, Marco; Talamini, Renato; Ciciliato, Stefano; Toffoli, Laura; Rizzo, Michele; Visalli, Francesco; Verze, Paolo; Artibani, Walter
2017-07-01
From the most recent systematic review of the literature, an orthotopic neobladder would seem to show marginally better health related quality of life (HR-QoL) scores compared with an ileal conduit. The aim of this study was to review all relevant published studies about the comparison between ileal orthotopic neobladder (IONB) and ileal conduit using validated HR-QoL questionnaires. Studies were identified by searching multiple literature databases. Data were synthesized using meta-analytic methods conforming to the PRISMA statement. The literature search identified 10 papers; pooled effect sizes of combined quality of life outcomes for ileal conduit versus IONB showed a significantly better HR-QoL in patients with IONB (Hedges' g = 0.278; p = 0.000). The present study has an important limitation due to the type of comparative studies analyzed, all of which were retrospective and non-randomized. This meta-analysis of non-randomized, retrospective comparative studies on the impact of ileal conduit versus IONB on HR-QoL showed a significant advantage of IONB subgroups.
Bayesian data analysis in observational comparative effectiveness research: rationale and examples.
Olson, William H; Crivera, Concetta; Ma, Yi-Wen; Panish, Jessica; Mao, Lian; Lynch, Scott M
2013-11-01
Many comparative effectiveness research and patient-centered outcomes research studies will need to be observational for one or both of two reasons: first, randomized trials are expensive and time-consuming; and second, only observational studies can answer some research questions. It is generally recognized that there is a need to increase the scientific validity and efficiency of observational studies. Bayesian methods for the design and analysis of observational studies are scientifically valid and offer many advantages over frequentist methods, including, importantly, the ability to conduct comparative effectiveness research/patient-centered outcomes research more efficiently. Bayesian data analysis is being introduced into outcomes studies that we are conducting. Our purpose here is to describe our view of some of the advantages of Bayesian methods for observational studies and to illustrate both realized and potential advantages by describing studies we are conducting in which various Bayesian methods have been or could be implemented.
Casartelli, Nicola; Müller, Roland; Maffiuletti, Nicola A
2010-11-01
The aim of the present study was to verify the validity and reliability of the Myotest accelerometric system (Myotest SA, Sion, Switzerland) for the assessment of vertical jump height. Forty-four male basketball players (age range: 9-25 years) performed series of squat, countermovement and repeated jumps during 2 identical test sessions separated by 2-15 days. Flight height was simultaneously quantified with the Myotest system and validated photoelectric cells (Optojump). Two calculation methods were used to estimate the jump height from Myotest recordings: flight time (Myotest-T) and vertical takeoff velocity (Myotest-V). Concurrent validity was investigated by comparing Myotest-T and Myotest-V to the criterion method (Optojump), and test-retest reliability was also examined. As regards validity, Myotest-T overestimated jumping height compared to Optojump (p < 0.001) with a systematic bias of approximately 7 cm, even though random errors were low (2.7 cm) and intraclass correlation coefficients (ICCs) were high (>0.98), that is, excellent validity. Myotest-V overestimated jumping height compared to Optojump (p < 0.001), with high random errors (>12 cm), high limits of agreement ratios (>36%), and low ICCs (<0.75), that is, poor validity. As regards reliability, Myotest-T showed high ICCs (range: 0.92-0.96), whereas Myotest-V showed low ICCs (range: 0.56-0.89), and high random errors (>9 cm). In conclusion, Myotest-T is a valid and reliable method for the assessment of vertical jump height, and its use is legitimate for field-based evaluations, whereas Myotest-V is neither valid nor reliable.
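The distinction drawn here between systematic bias and random error is the one a Bland-Altman style comparison makes explicit. The following is a minimal sketch, assuming paired measurements from a test method and a criterion method and the conventional 1.96 SD limits of agreement; the abstract does not state exactly how its random error was computed, and the function name and jump heights below are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method, criterion):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean difference (method minus criterion); the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    diffs = [m - c for m, c in zip(method, criterion)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical jump heights in cm, for illustration only.
bias, lower, upper = bland_altman([36, 42, 48], [30, 35, 40])
```

A method can therefore show a large bias with narrow limits (consistent but offset) or a small bias with wide limits (unbiased but imprecise).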
Devenish, Gemma; Mukhtar, Aqif; Begley, Andrea; Do, Loc; Scott, Jane
2017-11-08
Background: Dental research into early childhood caries is hindered by a lack of suitable dietary assessment tools that have been developed and validated for the population and outcomes of interest. The aim of this study was to develop and investigate the relative validity and reproducibility of the Study of Mothers' and Infants' Life Events Food Frequency Questionnaire (SMILE-FFQ), to assess the total and free sugars intakes of Australian toddlers. Methods: The SMILE-FFQ was designed to capture the leading dietary contributors to dental caries risk in toddlers aged 18-30 months via a proxy report. Ninety-five parents of Australian toddlers completed the questionnaire online before and after providing three 24-h recalls (24HR), collected on non-consecutive days using the multipass method. Total and free sugars were compared between the two SMILE-FFQ administrations and between each SMILE-FFQ and the 24HR using multiple statistical tests and standardised validity criteria. Correlation (Pearson), mean difference (Wilcoxon rank test) and Bland Altman analyses were conducted to compare absolute values, with cross-classification (Chi-Square and Weighted Kappa) used to compare agreement across tertiles. Results: All reproducibility tests showed good agreement except weighted kappa, which showed acceptable agreement. Relative validity tests revealed a mix of good and acceptable agreement, with total sugars performing better at the individual level than free sugars. Compared to the 24HR, the SMILE-FFQ tended to underestimate absolute values at lower levels and overestimate them at higher levels. Conclusions: The combined findings of the various tests indicate that the SMILE-FFQ performs comparably to the 24HR for assessing both total and free sugars among individuals, is most effective for ranking participants rather than determining absolute intakes, and is therefore suitable for use in observational studies of Australian toddlers.
Ganna, Andrea; Lee, Donghwan; Ingelsson, Erik; Pawitan, Yudi
2015-07-01
It is common and advised practice in biomedical research to validate experimental or observational findings in a population different from the one where the findings were initially assessed. This practice increases the generalizability of the results and decreases the likelihood of reporting false-positive findings. Validation becomes critical when dealing with high-throughput experiments, where the large number of tests increases the chance to observe false-positive results. In this article, we review common approaches to determine statistical thresholds for validation and describe the factors influencing the proportion of significant findings from a 'training' sample that are replicated in a 'validation' sample. We refer to this proportion as rediscovery rate (RDR). In high-throughput studies, the RDR is a function of false-positive rate and power in both the training and validation samples. We illustrate the application of the RDR using simulated data and real data examples from metabolomics experiments. We further describe an online tool to calculate the RDR using t-statistics. We foresee two main applications. First, if the validation study has not yet been collected, the RDR can be used to decide the optimal combination between the proportion of findings taken to validation and the size of the validation study. Secondly, if a validation study has already been done, the RDR estimated using the training data can be compared with the observed RDR from the validation data; hence, the success of the validation study can be assessed. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
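The dependence of the rediscovery rate on false-positive rate and power in both samples can be illustrated with a simple mixture calculation. The following is a minimal sketch, not necessarily the authors' exact formulation: it assumes independent training and validation samples, a fraction pi0 of tested features that are truly null, and a single power value per sample for the non-null features. All parameter names are hypothetical.

```python
def expected_rdr(pi0, alpha_train, power_train, alpha_valid, power_valid):
    """Expected share of training-significant findings that are also
    significant in the validation sample, under the stated assumptions."""
    # Probability that a feature is significant in the training sample.
    sig_train = pi0 * alpha_train + (1 - pi0) * power_train
    # Probability that it is significant in both samples (independence assumed).
    sig_both = (pi0 * alpha_train * alpha_valid
                + (1 - pi0) * power_train * power_valid)
    return sig_both / sig_train


# Hypothetical settings: 90% nulls, 5% thresholds, 80% power in both samples.
rdr = expected_rdr(0.9, 0.05, 0.8, 0.05, 0.8)
```

Under these assumed settings only about half of the training findings would be expected to replicate, even though every true effect has 80% power in the validation sample, because a sizeable share of the training findings are false positives.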
Chmielewski, Michael; Zhu, Jiani; Burchett, Danielle; Bury, Alison S; Bagby, R Michael
2017-02-01
The current study expands on past research examining the comparative capacity of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher et al., 2001) and MMPI-2 Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008/2011) overreporting validity scales to detect suspected malingering, as assessed by the Miller Forensic Assessment of Symptoms Test (M-FAST; Miller, 2001), in a sample of public insurance disability claimants (N = 742) who were considered to have potential incentives to malinger. Results provide support for the capacity of both the MMPI-2 and the MMPI-2-RF overreporting validity scales to predict suspected malingering of psychopathology. The MMPI-2-RF overreporting validity scales proved to be modestly better predictors of suspected psychopathology malingering, compared with the MMPI-2 overreporting scales, in dimensional predictive models and categorical classification accuracy analyses. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Debray, Thomas P A; Vergouwe, Yvonne; Koffijberg, Hendrik; Nieboer, Daan; Steyerberg, Ewout W; Moons, Karel G M
2015-03-01
It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. The proposed framework enhances the interpretation of findings at external validation of prediction models. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Wei, Meifen; Russell, Daniel W; Mallinckrodt, Brent; Vogel, David L
2007-04-01
We developed a 12-item, short form of the Experiences in Close Relationship Scale (ECR; Brennan, Clark, & Shaver, 1998) across 6 studies. In Study 1, we examined the reliability and factor structure of the measure. In Studies 2 and 3, we cross-validated the reliability, factor structure, and validity of the short form measure; whereas in Study 4, we examined test-retest reliability over a 1-month period. In Studies 5 and 6, we further assessed the reliability, factor structure, and validity of the short version of the ECR when administered as a stand-alone instrument. Confirmatory factor analyses indicated that 2 factors, labeled Anxiety and Avoidance, provided a good fit to the data after removing the influence of response sets. We found validity to be equivalent for the short and the original versions of the ECR across studies. Finally, the results were comparable when we embedded the short form within the original version of the ECR and when we administered it as a stand-alone measure.
Kim, Hee-Ju; Abraham, Ivo
2017-01-01
Evidence is needed on the clinicometric properties of single-item or short measures as alternatives to comprehensive measures. We examined whether two single-item fatigue measures (i.e., Likert scale, numeric rating scale) or a short fatigue measure were comparable to a comprehensive measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults. For this quantitative study, we selected the Functional Assessment of Chronic Illness Therapy-Fatigue for the comprehensive measure and the Profile of Mood States-Brief, Fatigue subscale for the short measure; and constructed two single-item measures. A total of 368 students from four nursing colleges in South Korea participated. We used Cronbach's alpha and item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability. We assessed Pearson's correlation with a comprehensive measure for convergent validity, with perceived stress level and sleep quality for concurrent validity and the receiver operating characteristic curve for predictive validity. The short measure was comparable to the comprehensive measure in internal consistency reliability (Cronbach's alpha=0.81 vs. 0.88); test-retest reliability (intraclass correlation coefficient=0.66 vs. 0.61); convergent validity (r with comprehensive measure=0.79); concurrent validity (r with perceived stress=0.55, r with sleep quality=0.39) and predictive validity (area under curve=0.88). Single-item measures were not comparable to the comprehensive measure. A short fatigue measure exhibited similar levels of reliability and validity to the comprehensive measure in Korean young adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
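Internal consistency in this abstract is summarized with Cronbach's alpha. The following is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), assuming a complete respondents-by-items score table with no missing values; the function name and the ratings below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings on a 3-item scale, for illustration only.
alpha = cronbach_alpha([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

Because the formula needs at least two items, it cannot be applied to the single-item measures discussed in the abstract.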
Mâsse, Louise C; Fulton, Janet E; Watson, Kathleen B; Tortolero, Susan; Kohl, Harold W; Meyers, Michael C; Blair, Steven N; Wong, William W
2012-02-01
The purpose of this study was to compare the validity of 2 physical activity questionnaire formats, one that lists activities (Checklist questionnaire) and one that assesses overall activities (Global questionnaire) by domain. Two questionnaire formats were validated among 260 African-American and Hispanic women (age 40-70) using 3 validation standards: 1) accelerometers to validate activities of ambulation; 2) diaries to validate physical activity domains (occupation, household, exercise, yard, family, volunteer/church work, and transportation); and 3) doubly-labeled water to validate physical activity energy expenditure (DLW-PAEE). The proportion of total variance explained by the Checklist questionnaire was 38.4% with diaries, 9.0% with accelerometers, and 6.4% with DLW-PAEE. The Global questionnaire explained 17.6% of the total variance with diaries and about 5% with both accelerometers and DLW-PAEE. Overall, associations with the 3 validation standards were slightly better with the Checklist questionnaire. However, agreement with DLW-PAEE was poor with both formats and the Checklist format resulted in greater overestimation. Validity results also indicated the Checklist format was better suited to recall household, family, and transportation activities. Overall, the Checklist format had slightly better measurement properties than the Global format. Both questionnaire formats are better suited to rank individuals.
Validation of the Older Adult Social Evaluative Scale (OASES) as a measure of social anxiety.
Kok, Brian C; Ma, Vanessa K; Gould, Christine E
2018-03-21
Social anxiety disorder (SAD) (formerly called social phobia) is among the most common mental health diagnoses among older adults; however, the research on late-life social anxiety is scarce. A limited number of studies have examined the assessment and diagnosis of social anxiety disorder in this population, and there are few social anxiety measures that are validated for use with older adults. One such measure, the Older Adult Social Evaluative Scale (OASES), was designed for use with this population, but until now has lacked validation against a gold-standard diagnostic interview. Using a sample of 47 community-dwelling older adults (aged 60 years and over) with anxiety, the present study compared OASES performance to that of the Structured Clinical Interview for DSM-5 Disorders (SCID-5), as well as other measures of anxiety and depression. The OASES demonstrated convergent validity with other measures of anxiety, and demonstrated discriminant validity on other measures (e.g. depression, somatic symptoms). Receiver operating characteristic (ROC) analysis revealed that a cut-point of ≥76 optimized sensitivity and specificity compared to SCID-5 derived diagnoses of social anxiety disorder. This study is the first study to provide psychometric validation for the OASES and one of the first to administer the SCID-5 to an older adult sample. In addition to establishing a clinically significant cut-off, this study also describes the clinical utility of the OASES, which can be used to identify distressing situations, track anxiety severity, and monitor behavioral avoidance across a variety of social situations.
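A cut-point that "optimizes sensitivity and specificity" is often chosen by maximizing Youden's J (sensitivity + specificity - 1) over candidate thresholds. The following is a minimal sketch of that approach; the abstract does not say which optimization criterion was used, so Youden's J is an assumption here, and the function name and scores are hypothetical.

```python
def best_cut_point(scores, has_diagnosis):
    """Return (cut, J, sensitivity, specificity) maximizing Youden's J.

    A score >= cut counts as a positive screen. Needs at least one
    diagnosed and one non-diagnosed participant.
    """
    pairs = list(zip(scores, has_diagnosis))
    n_pos = sum(1 for _, d in pairs if d)
    n_neg = len(pairs) - n_pos
    best = None
    for cut in sorted(set(scores)):
        sens = sum(1 for s, d in pairs if d and s >= cut) / n_pos
        spec = sum(1 for s, d in pairs if not d and s < cut) / n_neg
        j = sens + spec - 1
        # Strict comparison keeps the lowest cut when several tie.
        if best is None or j > best[1]:
            best = (cut, j, sens, spec)
    return best


# Hypothetical scale scores and interview diagnoses, for illustration only.
cut, j, sens, spec = best_cut_point(
    [50, 60, 70, 80, 90, 100],
    [False, False, False, True, True, True],
)
```

Other criteria, such as weighting sensitivity more heavily for a screening tool, would generally select a different threshold.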
2013-01-01
Background If you want to know which of two or more healthcare interventions is most effective, the randomised controlled trial is the design of choice. Randomisation, however, does not itself promote the applicability of the results to situations other than the one in which the trial was done. A tool published in 2009, PRECIS (PRagmatic Explanatory Continuum Indicator Summaries) aimed to help trialists design trials that produced results matched to the aim of the trial, be that supporting clinical decision-making, or increasing knowledge of how an intervention works. Though generally positive, groups evaluating the tool have also found weaknesses, mainly that its inter-rater reliability is not clear, that it needs a scoring system and that some new domains might be needed. The aims of the study are to (1) produce an improved and validated version of the PRECIS tool and (2) use this tool to compare the internal validity of, and effect estimates from, a set of explanatory and pragmatic trials matched by intervention. Methods The study has four phases. Phase 1 involves brainstorming and a two-round Delphi survey of authors who cited PRECIS. In Phase 2, the Delphi results will then be discussed and alternative versions of PRECIS-2 developed and user-tested by experienced trialists. Phase 3 will evaluate the validity and reliability of the most promising PRECIS-2 candidate using a sample of 15 to 20 trials rated by 15 international trialists. We will assess inter-rater reliability, and raters’ subjective global ratings of pragmatism compared to PRECIS-2 to assess convergent and face validity. Phase 4, to determine if pragmatic trials sacrifice internal validity in order to achieve applicability, will compare the internal validity and effect estimates of matched explanatory and pragmatic trials of the same intervention, condition and participants. Effect sizes for the trials will then be compared in a meta-regression. The Cochrane Risk of Bias scores will be compared with the PRECIS-2 scores of pragmatism. Discussion We have concrete suggestions for improving PRECIS and a growing list of enthusiastic individuals interested in contributing to this work. By early 2014 we expect to have a validated PRECIS-2. PMID:23782862
Morgan, Patrick; Nissi, Mikko J; Hughes, John; Mortazavi, Shabnam; Ellerman, Jutta
2017-07-01
Objectives The purpose of this study was to validate T2* mapping as an objective, noninvasive method for the prediction of acetabular cartilage damage. Methods This is the second step in the validation of T2*. In a previous study, we established a quantitative predictive model for identifying and grading acetabular cartilage damage. In this study, the model was applied to a second cohort of 27 consecutive hips to validate the model. A clinical 3.0-T imaging protocol with T2* mapping was used. Acetabular regions of interest (ROI) were identified on magnetic resonance and graded using the previously established model. Each ROI was then graded in a blinded fashion by arthroscopy. Accurate surgical location of ROIs was facilitated with a 2-dimensional map projection of the acetabulum. A total of 459 ROIs were studied. Results When T2* mapping and arthroscopic assessment were compared, 82% of ROIs were within 1 Beck group (of a total 6 possible) and 32% of ROIs were classified identically. Disease prediction based on receiver operating characteristic curve analysis demonstrated a sensitivity of 0.713 and a specificity of 0.804. Model stability evaluation required no significant changes to the predictive model produced in the initial study. Conclusions These results validate that T2* mapping provides statistically comparable information regarding acetabular cartilage when compared to arthroscopy. In contrast to arthroscopy, T2* mapping is quantitative, noninvasive, and can be used in follow-up. Unlike research quantitative magnetic resonance protocols, T2* takes little time and does not require a contrast agent. This may facilitate its use in the clinical sphere.
Baron, Emily Claire; Davies, Thandi; Lund, Crick
2017-01-09
The 10-item Centre for Epidemiological Studies Depression Scale (CES-D-10) is a depression screening tool that has been used in the South African National Income Dynamics Study (NIDS), a national household panel study. This screening tool has not yet been validated in South Africa. This study aimed to establish the reliability and validity of the CES-D-10 in Zulu, Xhosa and Afrikaans. The CES-D-10's psychometric properties were also compared to the Patient Health Questionnaire (PHQ-9), a depression screening tool already validated in South Africa. Stratified random samples of Xhosa, Afrikaans and Zulu-speaking participants aged 15 years or older (N = 944) were recruited from Cape Town Metro and Ethekwini districts. Face-to-face interviews included socio-demographic questions, the CES-D-10, Patient Health Questionnaire (PHQ-9), and WHO Disability Assessment Schedule 2.0 (WHODAS). Major depression was determined using the Mini International Neuropsychiatric Interview. All instruments were translated and back-translated to English. Construct validity was examined using exploratory factor analysis with varimax rotation. Receiver Operating Characteristics (ROC) curves were used to investigate the CES-D-10 and PHQ-9's criterion validity, and compared using the DeLong method. Overall, 6.6, 18.0 and 6.9% of the Zulu, Afrikaans and Xhosa samples were diagnosed with depression, respectively. The CES-D-10 had acceptable internal consistency across samples (α = 0.69-0.89), and adequate concurrent validity, when compared to the PHQ-9 and WHODAS. The CES-D-10 area under the Receiver Operator Characteristic curve was good to excellent: 0.81 (95% CI 0.71-0.90) for Zulu, 0.93 (95% CI 0.90-0.96) for Afrikaans, and 0.94 (95% CI 0.89-0.99) for Xhosa. A cut-off of 12, 11 and 13 for Zulu, Afrikaans and Xhosa, respectively, generated the most balanced sensitivity, specificity and positive predictive value (Zulu: 71.4%, 72.6% and 16.1%; Afrikaans: 84.6%, 84.0%, 53.7%; Xhosa: 81.0%, 95.0%, 54.8%). These were slightly higher than those generated for the PHQ-9. The CES-D-10 and PHQ-9 otherwise performed similarly across samples. The CES-D-10 is a valid, reliable screening tool for depression in Zulu, Xhosa and coloured Afrikaans populations.
Validation of the Vanderbilt Holistic Face Processing Test
Wang, Chao-Chih; Ross, David A.; Gauthier, Isabel; Richler, Jennifer J.
2016-01-01
The Vanderbilt Holistic Face Processing Test (VHPT-F) is a new measure of holistic face processing with better psychometric properties relative to prior measures developed for group studies (Richler et al., 2014). In fields where psychologists study individual differences, validation studies are commonplace and the concurrent validity of a new measure is established by comparing it to an older measure with established validity. We follow this approach and test whether the VHPT-F measures the same construct as the composite task, which is a group-based measure at the center of the large literature on holistic face processing. In Experiment 1, we found a significant correlation between holistic processing measured in the VHPT-F and the composite task. Although this correlation was small, it was comparable to the correlation between holistic processing measured in the composite task with the same faces, but different target parts (top or bottom), which represents a reasonable upper limit for correlations between the composite task and another measure of holistic processing. These results confirm the validity of the VHPT-F by demonstrating shared variance with another measure of holistic processing based on the same operational definition. These results were replicated in Experiment 2, but only when the demographic profile of our sample matched that of Experiment 1. PMID:27933014
A French validation study of the Coma Recovery Scale-Revised (CRS-R).
Schnakers, Caroline; Majerus, Steve; Giacino, Joseph; Vanhaudenhuyse, Audrey; Bruno, Marie-Aurelie; Boly, Melanie; Moonen, Gustave; Damas, Pierre; Lambermont, Bernard; Lamy, Maurice; Damas, Francois; Ventura, Manfredi; Laureys, Steven
2008-09-01
The aim of the present study was to explore the concurrent validity, inter-rater agreement and diagnostic sensitivity of a French adaptation of the Coma Recovery Scale-Revised (CRS-R) as compared to other coma scales such as the Glasgow Coma Scale (GCS), the Full Outline of UnResponsiveness scale (FOUR) and the Wessex Head Injury Matrix (WHIM). Multi-centric prospective study. To test concurrent validity and diagnostic sensitivity, the four behavioural scales were administered in a randomized order in 77 vegetative and minimally conscious patients. Twenty-four clinicians with different professional backgrounds, levels of expertise and CRS-R experience were recruited to assess inter-rater agreement. Good concurrent validity was obtained between the CRS-R and the three other standardized behavioural scales. Inter-rater reliability for the CRS-R total score and sub-scores was good, indicating that the scale yields reproducible findings across examiners and does not appear to be systematically biased by profession, level of expertise or CRS-R experience. Finally, the CRS-R demonstrated a significantly higher sensitivity to detect MCS patients, as compared to the GCS, the FOUR and the WHIM. The results show that the French version of the CRS-R is a valid and sensitive scale which can be used in severely brain damaged patients by all members of the medical staff.
Ahmed, Adil; Vairavan, Srinivasan; Akhoundi, Abbasali; Wilson, Gregory; Chiofolo, Caitlyn; Chbat, Nicolas; Cartin-Ceba, Rodrigo; Li, Guangxi; Kashani, Kianoush
2015-10-01
Timely detection of acute kidney injury (AKI) facilitates prevention of its progression and, potentially, therapeutic intervention. The study objective is to develop and validate an electronic surveillance tool (AKI sniffer) to detect AKI in 2 independent retrospective cohorts of intensive care unit (ICU) patients. The primary aim is to compare the sensitivity, specificity, and positive and negative predictive values of AKI sniffer performance against a reference standard. This study is conducted in the ICUs of a tertiary care center. The derivation cohort study subjects were Olmsted County, MN, residents admitted to all Mayo Clinic ICUs from July 1, 2010, through December 31, 2010, and the validation cohort study subjects were all patients admitted to a Mayo Clinic, Rochester, campus medical/surgical ICU from January 12, 2010, through March 23, 2010. All included records were reviewed by 2 independent investigators who adjudicated AKI using the Acute Kidney Injury Network criteria; disagreements were resolved by a third reviewer. This constituted the reference standard. An electronic algorithm was developed; its precision and reliability were assessed in comparison with the reference standard in 2 separate cohorts, derivation and validation. Of 1466 screened patients, a total of 944 patients were included in the study: 482 for derivation and 462 for validation. Compared with the reference standard in the validation cohort, the sensitivity and specificity of the AKI sniffer were 88% and 96%, respectively. The Cohen κ (95% confidence interval) agreement between the electronic and the reference standard was 0.84 (0.78-0.89) and 0.85 (0.80-0.90) in the derivation and validation cohorts. Acute kidney injury can reliably and accurately be detected electronically in ICU patients. The presented method is applicable for both clinical (decision support) and research (enrollment for clinical trials) settings. Prospective validation is required. Copyright © 2015 Elsevier Inc. All rights reserved.
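The accuracy measures named in this abstract (sensitivity, specificity, positive and negative predictive values, and Cohen κ) can all be derived from a 2x2 table of detector result against reference standard. The following is a minimal sketch of that arithmetic, not the study's own code; the class and function names and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 counts of a detector compared with a reference standard."""
    tp: int  # detector positive, reference positive
    fp: int  # detector positive, reference negative
    fn: int  # detector negative, reference positive
    tn: int  # detector negative, reference negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, NPV and Cohen's kappa."""
    n = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the marginal totals.
    p_observed = (c.tp + c.tn) / n
    p_chance = (
        (c.tp + c.fp) * (c.tp + c.fn) + (c.tn + c.fn) * (c.tn + c.fp)
    ) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only; NOT taken from the study.
example = ConfusionCounts(tp=40, fp=10, fn=10, tn=40)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Confidence intervals such as those reported for κ would need an additional step (for example a bootstrap), which this sketch omits.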
DOT National Transportation Integrated Search
2010-08-01
This study was intended to recommend future directions for the development of TxDOT's Mechanistic-Empirical (TexME) design system. For stress predictions, a multi-layer linear elastic system was evaluated and its validity was verified by compar...
Clinical Validation of Heart Rate Apps: Mixed-Methods Evaluation Study
Stans, Jelle; Mortelmans, Christophe; Van Haelst, Ruth; Van Schelvergem, Gertjan; Pelckmans, Caroline; Smeets, Christophe JP; Lanssens, Dorien; De Cannière, Hélène; Storms, Valerie; Thijs, Inge M; Vaes, Bert; Vandervoort, Pieter M
2017-01-01
Background Photoplethysmography (PPG) is a proven way to measure heart rate (HR). This technology is already available in smartphones, which allows measuring HR only by using the smartphone. Given the widespread availability of smartphones, this creates a scalable way to enable mobile HR monitoring. An essential precondition is that these technologies are as reliable and accurate as the current clinical (gold) standards. At this moment, there is no consensus on a gold standard method for the validation of HR apps. This results in different validation processes that do not always reflect the veracious outcome of comparison. Objective The aim of this paper was to investigate and describe the necessary elements in validating and comparing HR apps versus standard technology. Methods The FibriCheck (Qompium) app was used in two separate prospective nonrandomized studies. In the first study, the HR of the FibriCheck app was consecutively compared with 2 different Food and Drug Administration (FDA)-cleared HR devices: the Nonin oximeter and the AliveCor Mobile ECG. In the second study, a next step in validation was performed by comparing the beat-to-beat intervals of the FibriCheck app to a synchronized ECG recording. Results In the first study, the HR (BPM, beats per minute) of 88 random subjects consecutively measured with the 3 devices showed a correlation coefficient of .834 between FibriCheck and Nonin, .88 between FibriCheck and AliveCor, and .897 between Nonin and AliveCor. A one-way analysis of variance (ANOVA; P=.61) was executed to test the hypothesis that there were no significant differences between the HRs as measured by the 3 devices. In the second study, 20,298 (ms) R-R intervals (RRI)–peak-to-peak intervals (PPI) from 229 subjects were analyzed. This resulted in a positive correlation (rs=.993, root mean square error [RMSE]=23.04 ms, and normalized root mean square error [NRMSE]=0.012) between the PPI from FibriCheck and the RRI from the wearable ECG. There was no significant difference (P=.92) between these intervals. Conclusions Our findings suggest that the most suitable method for the validation of an HR app is a simultaneous measurement of the HR by the smartphone app and an ECG system, compared on the basis of beat-to-beat analysis. This approach could lead to more correct assessments of the accuracy of HR apps. PMID:28842392
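The beat-to-beat comparison described above rests on two error measures, RMSE and NRMSE, computed over paired intervals. A minimal sketch follows. The abstract does not state how the NRMSE was normalized, so dividing by the range of the reference intervals is an assumption here; the function names and the interval values are hypothetical.

```python
import math


def rmse(reference, measured):
    """Root mean square error between paired measurements."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - r) ** 2 for r, m in zip(reference, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nrmse_by_range(reference, measured):
    """RMSE divided by the range of the reference values.

    Normalizing by the range is an ASSUMPTION of this sketch; other
    conventions (for example dividing by the mean) also exist.
    """
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference values must not all be identical")
    return rmse(reference, measured) / span


# Hypothetical paired intervals in milliseconds; NOT study data.
rri_ms = [800.0, 820.0, 790.0, 810.0]  # reference ECG R-R intervals
ppi_ms = [803.0, 816.0, 790.0, 810.0]  # app peak-to-peak intervals

print(f"RMSE  = {rmse(rri_ms, ppi_ms):.2f} ms")
print(f"NRMSE = {nrmse_by_range(rri_ms, ppi_ms):.4f}")
```

Because both measures compare intervals pair by pair, the two recordings have to be synchronized first, which is the point the abstract makes about simultaneous measurement.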
Use of Daily Phone Diary to study religiosity and mood: Convergent validity
Szczesniak, Rhonda D.; Zou, Yuanshu; Dimitriou, Sophia M.; Quittner, Alexandra L.; Grossoehme, Daniel H.
2017-01-01
Studies of religious/spiritual behavior frequently rely on self-reported questionnaire data, which is susceptible to bias. The Daily Phone Diary (DPD) was developed to minimize bias in reporting activities and behavior across a 24-hour period. A cross-sectional study of 126 parents of children with cystic fibrosis was used to establish the validity of the DPD to study religious/spiritual behaviors. Longitudinal models were used to determine the odds of improved mood during religious/spiritual activities. Convergent validity was found. Participants had increased odds of improved mood during religious/spiritual activities compared to non-religious/spiritual activities. Associations with gender and religious affiliations were found. The DPD is a valid tool for studying religious/spiritual activities and opens novel avenues for chaplaincy research and the development of chaplaincy interventions incorporating these findings. PMID:27869567
Suzuki, T; Sato, Y; Sotome, S; Arai, H; Arai, A; Yoshida, H
2017-06-01
This study was designed to investigate the reliability and validity of measurements of finger diameters with a ring gauge. A reliability study enrolled two independent samples (50 participants and seven examiners in Study I; 26 participants and 26 examiners in Study II). The sizes of each participant's little fingers were measured twice with a ring gauge by each examiner. To investigate the validity of the measurements, five hand therapists compared the finger size and hand volume of 30 participants with the ring gauge and with a figure-of-eight technique (Study III). The intra-class correlation coefficient for intra-observer reliability ranged from 0.97 to 0.99 in Study I, and 0.90 to 0.97 in Study II. The intra-class correlation coefficient for inter-observer reliability was 0.95 in Study I and 0.94 in Study II. The validity study showed a Pearson product moment correlation coefficient of 0.75. The ring gauge showed high reliability and validity for measurement of finger size. Level of evidence: III, diagnostic.
ERIC Educational Resources Information Center
Tang, Yang; Cook, Thomas D.; Kisbu-Sakarya, Yasemin
2015-01-01
Regression discontinuity design (RD) has been widely used to produce reliable causal estimates. Researchers have validated the accuracy of RD design using within-study comparisons (Cook, Shadish & Wong, 2008; Cook & Steiner, 2010; Shadish et al., 2011). Within-study comparisons examine the validity of a quasi-experiment by comparing its…
Development, linguistic and clinimetric validation of the WOMAC VA3.01 Bangla for Bangladesh Index.
Rabbani, M G; Haq, S A; Bellamy, N; Islam, M N; Choudhury, M R; Naheed, A; Ahmed, S; Shahin, A
2015-06-01
The aim of this study was to develop and to validate a Bengali version of the Western Ontario and McMaster Osteoarthritis (WOMAC) index in Bangladesh. The WOMAC was translated into the local language of Bangladesh (Bengali) and adapted in the local sociocultural context, following the standard guidelines by Beaton et al. Content validity of the preliminary Bengali version was assessed by using the index of content validity (ICV) and floor and ceiling effects. Patients were assessed at the Department of Rheumatology of Bangabandhu Sheikh Mujib Medical University and were diagnosed to have knee OA by American College of Rheumatology criteria and recruited according to the requirements of the validation study. Convergent and divergent validity were measured by comparing with Health Assessment Questionnaire (HAQ) and the Short Form-36 (SF-36), and internal consistency was assessed using Cronbach's alpha coefficient. The questionnaire was readministered to 40 patients within a week for assessing reliability by using intra-class correlation coefficient (ICC) and Spearman's rank correlation coefficient. In addition, factor analysis of Bengali WOMAC questionnaire was performed to examine the number of factors influencing a common set of items. A Bengali version was developed with changes in three items to suit local practices. The ICV of the content validity was 1 for all items. The Bengali WOMAC had similar construct validity when compared to the HAQ (ρ 0.74, n = 70) and SF-36 bodily pain and physical functioning. It had dissimilar construct validity to SF-36 mental health domain except WOMAC pain. Factor analysis revealed five factors with eigenvalues of more than 1.0. Cronbach's alpha and ICC exceeded 0.7 in all domains. In the test-retest reliability testing, Spearman's ρ for all items exceeded 0.4 (n = 40). 
This study has demonstrated that the Bengali version of WOMAC is a valid tool for assessing quality of life of patients with knee osteoarthritis in Bangladesh and is reliable.
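Internal consistency in this abstract is summarized with Cronbach's alpha, which compares the sum of the individual item variances with the variance of the total score. The sketch below shows that calculation in its standard form; it is not the authors' analysis code, and the function name and the response matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one per respondent, each a list of item scores.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least two items and equal-length rows")
    items = list(zip(*scores))  # transpose rows into per-item columns
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses (5 respondents x 3 items); NOT study data.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

The abstract reports only that alpha exceeded 0.7 in all domains; in practice the function would be applied separately to the items of each WOMAC domain.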
Graafland, Maurits; Bok, Kiki; Schreuder, Henk W R; Schijven, Marlies P
2014-06-01
Untrained laparoscopic camera assistants in minimally invasive surgery (MIS) may cause suboptimal view of the operating field, thereby increasing risk for errors. Camera navigation is often performed by the least experienced member of the operating team, such as inexperienced surgical residents, operating room nurses, and medical students. The operating room nurses and medical students are currently not included as key user groups in structured laparoscopic training programs. A new virtual reality laparoscopic camera navigation (LCN) module was specifically developed for these key user groups. This multicenter prospective cohort study assesses face validity and construct validity of the LCN module on the Simendo virtual reality simulator. Face validity was assessed through a questionnaire on resemblance to reality and perceived usability of the instrument among experts and trainees. Construct validity was assessed by comparing scores of groups with different levels of experience on outcome parameters of speed and movement proficiency. The results obtained show uniform and positive evaluation of the LCN module among expert users and trainees, signifying face validity. Experts and intermediate experience groups performed significantly better in task time and camera stability during three repetitions, compared to the less experienced user groups (P < .007). Comparison of learning curves showed significant improvement of proficiency in time and camera stability for all groups during three repetitions (P < .007). The results of this study show face validity and construct validity of the LCN module. The module is suitable for use in training curricula for operating room nurses and novice surgical trainees, aimed at improving team performance in minimally invasive surgery. © The Author(s) 2013.
A systematic review of the quality of homeopathic clinical trials
Jonas, Wayne B; Anderson, Rachel L; Crawford, Cindy C; Lyons, John S
2001-01-01
Background While a number of reviews of homeopathic clinical trials have been done, all have used methods dependent on allopathic diagnostic classifications foreign to homeopathic practice. In addition, no review has used established and validated quality criteria allowing direct comparison of the allopathic and homeopathic literature. Methods In a systematic review, we compared the quality of clinical-trial research in homeopathy to a sample of research on conventional therapies using a validated and system-neutral approach. All clinical trials on homeopathic treatments with parallel treatment groups published between 1945–1995 in English were selected. All were evaluated with an established set of 33 validity criteria previously validated on a broad range of health interventions across differing medical systems. Criteria covered statistical conclusion, internal, construct and external validity. Reliability of criteria application is greater than 0.95. Results 59 studies met the inclusion criteria. Of these, 79% were from peer-reviewed journals, 29% used a placebo control, 51% used random assignment, and 86% failed to consider potentially confounding variables. The main validity problems were in measurement where 96% did not report the proportion of subjects screened, and 64% did not report attrition rate. 17% of subjects dropped out in studies where this was reported. There was practically no replication of or overlap in the conditions studied and most studies were relatively small and done at a single-site. Compared to research on conventional therapies the overall quality of studies in homeopathy was worse and only slightly improved in more recent years. Conclusions Clinical homeopathic research is clearly in its infancy with most studies using poor sampling and measurement techniques, few subjects, single sites and no replication. Many of these problems are correctable even within a "holistic" paradigm given sufficient research expertise, support and methods. 
PMID:11801202
Roy, Tapta Kanchan; Kopysov, Vladimir; Nagornova, Natalia S; Rizzo, Thomas R; Boyarkin, Oleg V; Gerber, R Benny
2015-05-18
Calculated structures of the two most stable conformers of a protonated decapeptide gramicidin S in the gas phase have been validated by comparing the vibrational spectra, calculated from first principles and measured in a wide spectral range using infrared (IR)-UV double resonance cold ion spectroscopy. All the 522 vibrational modes of each conformer were calculated quantum mechanically and compared with the experiment without any recourse to an empirical scaling. The study demonstrates that first-principles calculations, when accounting for vibrational anharmonicity, can reproduce high-resolution experimental spectra well enough for validating structures of molecules as large as 200 atoms. The validated accurate structures of the peptide may serve as templates for in silico drug design and absolute calibration of ion mobility measurements. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bulley, Catherine; Coutts, Fiona; Blyth, Christine; Jack, Wilma; Chetty, Udi; Barber, Matthew; Tan, Chee Wee
2014-04-01
This study aimed to investigate validity of a newly developed Morbidity Screening Tool (MST) to screen for fatigue, pain, swelling (lymphedema) and arm function after breast cancer treatment. A cross-sectional study included women attending reviews after completing treatment (surgery, chemotherapy and radiotherapy), without recurrence, who could read English. They completed the MST and comparator questionnaires: Disability of the Arm, Shoulder and Hand questionnaire (DASH), Chronic Pain Grade Questionnaire (CPGQ), Lymphedema and Breast Cancer Questionnaire (LBCQ) and Functional Assessment of Cancer Therapy questionnaire with subscales for fatigue (FACT F) and breast cancer (FACT B + 4). Bilateral combined shoulder ranges of motion were compared (upward reach; hand behind back) and percentage upper limb volume difference (%LVD ≥10% diagnosed as lymphedema) measured with the vertical perometer (400T). 613 of 617 participants completed questionnaires (mean age 62.3 years, SD 10.0; mean time since treatment 63.0 months, SD 46.6) and 417 completed objective testing. Morbidity prevalence was estimated as 35.8%, 21.9%, 19.8% and 34.4% for fatigue, impaired upper limb function, lymphedema and pain respectively. Comparing those self-reporting the presence or absence of each type of morbidity, statistically significant differences in comparator variables supported validity of the MST. Statistically significant correlations resulted between MST scores focussing on impact of morbidity, and comparator variables that reflect function and quality of life. Analysis supports the validity of all four short-forms of the MST as providing indications of both presence of morbidity and impacts on participants' lives. This may facilitate early and appropriate referral for intervention. Copyright © 2013 Elsevier Ltd. All rights reserved.
Hippisley-Cox, Julia; Coupland, Carol; Brindle, Peter
2014-01-01
Objectives To validate the performance of a set of risk prediction algorithms developed using the QResearch database, in an independent sample from general practices contributing to the Clinical Practice Research Datalink (CPRD). Setting Prospective open cohort study using practices contributing to the CPRD database and practices contributing to the QResearch database. Participants The CPRD validation cohort consisted of 3.3 million patients, aged 25–99 years registered at 357 general practices between 1 Jan 1998 and 31 July 2012. The validation statistics for QResearch were obtained from the original published papers which used a one-third sample of practices separate to those used to derive the score. A cohort from QResearch was used to compare incidence rates and baseline characteristics and consisted of 6.8 million patients from 753 practices registered between 1 Jan 1998 and 31 July 2013. Outcome measures Incident events relating to seven different risk prediction scores: QRISK2 (cardiovascular disease); QStroke (ischaemic stroke); QDiabetes (type 2 diabetes); QFracture (osteoporotic fracture and hip fracture); QKidney (moderate and severe kidney failure); QThrombosis (venous thromboembolism); QBleed (intracranial bleed and upper gastrointestinal haemorrhage). Measures of discrimination and calibration were calculated. Results Overall, the baseline characteristics of the CPRD and QResearch cohorts were similar though QResearch had higher recording levels for ethnicity and family history. The validation statistics for each of the risk prediction scores were very similar in the CPRD cohort compared with the published results from QResearch validation cohorts. For example, in women, the QDiabetes algorithm explained 50% of the variation within CPRD compared with 51% on QResearch and the receiver operator curve value was 0.85 on both databases. The scores were well calibrated in CPRD.
Conclusions Each of the algorithms performed practically as well in the external independent CPRD validation cohorts as they had in the original published QResearch validation cohorts. PMID:25168040
Białek, Michał; Markiewicz, Łukasz; Sawicki, Przemysław
2015-01-01
Delayed lotteries are much more common in everyday life than pure lotteries. Usually, we need to wait to find out the outcome of a risky decision (e.g., investing in a stock market, engaging in a relationship). However, most research has studied time discounting and probability discounting in isolation, using methodologies designed specifically to track changes in one parameter. The most commonly used method is adjusting, but its reported validity and time stability in research on discounting are suboptimal. The goal of this study was to introduce a novel method for analyzing delayed lotteries, conjoint analysis, which hypothetically is more suitable for analyzing individual preferences in this area. A set of two studies compared conjoint analysis with adjusting. The results suggest that individual parameters of discounting strength estimated with conjoint analysis have higher predictive value (Studies 1 and 2) and are more stable over time (Study 2) compared to adjusting. We discuss these findings, despite the exploratory character of the reported studies, by suggesting that future research on delayed lotteries should be cross-validated using both methods.
Validity and reliability of the Diagnostic Adaptive Behaviour Scale.
Tassé, M J; Schalock, R L; Balboni, G; Spreat, S; Navas, P
2016-01-01
The Diagnostic Adaptive Behaviour Scale (DABS) is a new standardised adaptive behaviour measure that provides information for evaluating limitations in adaptive behaviour for the purpose of determining a diagnosis of intellectual disability. This article presents validity evidence and reliability data for the DABS. Validity evidence was based on comparing DABS scores with scores obtained on the Vineland Adaptive Behaviour Scale, second edition. The stability of the test scores was measured using a test and retest, and inter-rater reliability was assessed by computing the inter-respondent concordance. The DABS convergent validity coefficients ranged from 0.70 to 0.84, while the test-retest reliability coefficients ranged from 0.78 to 0.95, and the inter-rater concordance as measured by intraclass correlation coefficients ranged from 0.61 to 0.87. All obtained validity and reliability indicators were strong and comparable with the validity and reliability coefficients of the most commonly used adaptive behaviour instruments. These results and the advantages of the DABS for clinician and researcher use are discussed. © 2015 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Jacob, Robin; Somers, Marie-Andree; Zhu, Pei; Bloom, Howard
2016-06-01
In this article, we examine whether a well-executed comparative interrupted time series (CITS) design can produce valid inferences about the effectiveness of a school-level intervention. This article also explores the trade-off between bias reduction and precision loss across different methods of selecting comparison groups for the CITS design and assesses whether choosing matched comparison schools based only on preintervention test scores is sufficient to produce internally valid impact estimates. We conduct a validation study of the CITS design based on the federal Reading First program as implemented in one state using results from a regression discontinuity design as a causal benchmark. Our results contribute to the growing base of evidence regarding the validity of nonexperimental designs. We demonstrate that the CITS design can, in our example, produce internally valid estimates of program impacts when multiple years of preintervention outcome data (test scores in the present case) are available and when a set of reasonable criteria are used to select comparison organizations (schools in the present case). © The Author(s) 2016.
Concurrent Validity of the Classroom Strategies Scale-Teacher Form: A Preliminary Investigation
ERIC Educational Resources Information Center
Reddy, Linda A.; Dudek, Christopher M.; Rualo, Angelique J.; Fabiano, Gregory A.
2016-01-01
The present study investigated the concurrent validity of the Classroom Strategies Scale-Teacher Form (CSS-T), a multidimensional teacher formative assessment of instructional and behavioral management practices. The CSS-T is compared with the Classroom Assessment Scoring System (CLASS), a well-known teacher assessment of overall classroom…
Comparative Ratings of the Utility of Portfolio Requirements: Toward Content Validity.
ERIC Educational Resources Information Center
McFarland, Jacqueline; Wisniewski, Shirley; Vermette, Paul
While the value of portfolio learning and assessment has gained much support from the educational community, many questions arise as specific implementations are attempted. This study examined one aspect, namely, the content validity of specific requirements, and addressed the question "How do various constituencies (methods students, student…
The Validity of Two Education Requirement Measures
ERIC Educational Resources Information Center
van der Meer, Peter H.
2006-01-01
In this paper we investigate the validity of two education requirement measures. This is important because a key part of the ongoing discussion concerning overeducation is about measurement. Thanks to the Dutch Institute for Labour Studies, we have been given a unique opportunity to compare two education requirement measures: first, Huijgen's…
Liou, Shwu-Ru; Tsai, Hsiu-Min; Cheng, Ching-Yu
2013-01-01
To analyze and compare the psychometric properties and cultural attributes of the Organizational Commitment Questionnaire and the Organizational Commitment Scale to determine their appropriateness for measuring commitment of Asian nurses, the largest group of international nurses. The Organizational Commitment Questionnaire was cross-culturally cross-validated when compared with the Organizational Commitment Scale. Neither instrument was tested on Asian nurses. More studies are needed to validate the cultural properties of the Organizational Commitment Scale. Healthcare administrators can use culturally validated instruments, which concern cultural context, including languages and cultural values, to understand Asian nurses' organizational commitment and further lower turnover behavior among them. © 2013 Wiley Periodicals, Inc.
ERIC Educational Resources Information Center
Sinharay, Sandip; Feng, Ying; Saldivia, Luis; Powers, Donald E.; Ginuta, Anthony; Simpson, Annabelle; Weng, Vincent
2008-01-01
The validity of TOEIC Bridge™ scores as a measure of English language skill was examined from the standpoint of a unified concept of test validity. In this study, more than 6,000 test takers in 3 Latin American countries (Chile, Colombia, and Ecuador) took 1 form of the TOEIC Bridge test, and their scores were compared to additional information…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strons, Philip; Bailey, James L.; Davis, John
2016-03-01
In this work, we apply computational fluid dynamics (CFD) to model airflow and particulate transport. This modeling is then compared to field validation studies to both inform and validate the modeling assumptions. Based on the results of field tests, modeling assumptions and boundary conditions are refined and the process is repeated until the results are found to be reliable with a high level of confidence.
Predictive and concurrent validity of the Braden scale in long-term care: a meta-analysis.
Wilchesky, Machelle; Lungu, Ovidiu
2015-01-01
Pressure ulcer prevention is an important long-term care (LTC) quality indicator. While the Braden Scale is a recommended risk assessment tool, there is a paucity of information specifically pertaining to its validity within the LTC setting. We, therefore, undertook a systematic review and meta-analysis comparing Braden Scale predictive and concurrent validity within this context. We searched the Medline, EMBASE, PsycINFO and PubMed databases from 1985-2014 for studies containing the requisite information to analyze tool validity. Our initial search yielded 3,773 articles. Eleven datasets emanating from nine published studies describing 40,361 residents met all meta-analysis inclusion criteria and were analyzed using random effects models. Pooled sensitivity, specificity, positive predictive value (PPV), and negative predictive values were 86%, 38%, 28%, and 93%, respectively. Specificity was poorer in concurrent samples as compared with predictive samples (38% vs. 72%), while PPV was low in both sample types (25 and 37%). Though random effects model results showed that the Scale had good overall predictive ability [RR, 4.33; 95% CI, 3.28-5.72], none of the concurrent samples were found to have "optimal" sensitivity and specificity. In conclusion, the appropriateness of the Braden Scale in LTC is questionable given its low specificity and PPV, in particular in concurrent validity studies. Future studies should further explore the extent to which the apparent low validity of the Scale in LTC is due to the choice of cutoff point and/or preventive strategies implemented by LTC staff as a matter of course. © 2015 by the Wound Healing Society.
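The low PPV reported here despite high sensitivity follows from how predictive values depend on specificity and on how common the outcome is. A minimal sketch of that relationship via Bayes' rule is given below. The sensitivity and specificity are the pooled values quoted in the abstract; the prevalence values and the function name are hypothetical, since the abstract does not report a pooled prevalence.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    # Expected proportions of the population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled sensitivity and specificity as quoted in the abstract.
SENSITIVITY = 0.86
SPECIFICITY = 0.38

# Hypothetical prevalence values, for illustration only.
for prevalence in (0.05, 0.20, 0.40):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With specificity this low, most positive screens are false positives unless the outcome is common, which is consistent with the abstract's concern about the scale's PPV in LTC.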
Dendukuri, Nandini; McCusker, Jane; Bellavance, François; Cardin, Sylvie; Verdon, Josée; Karp, Igor; Belzile, Eric
2005-03-01
Emergency department (ED) use in Quebec may be measured from varied sources, eg, patient's self-reports, hospital medical charts, and provincial health insurance claims databases. Determining the relative validity of each source is complicated because none is a gold standard. We sought to compare the validity of different measures of ED use without arbitrarily assuming one is perfect. Data were obtained from a nursing liaison intervention study for frail seniors visiting EDs at 4 university-affiliated hospitals in Montreal. The number of ED visits during 2 consecutive follow-up periods of 1 and 4 months after baseline was obtained from patient interviews, from medical charts of participating hospitals, and from the provincial health insurance claims database. Latent class analysis was used to estimate the validity of each source. The impact of the following covariates on validity was evaluated: hospital visited, patient's demographic/clinical characteristics, risk of functional decline, nursing liaison intervention, duration of recall, previous ED use, and previous hospitalization. The patient's self-report was found to be the least accurate (sensitivity: 70%, specificity: 88%). Claims databases had the greatest validity, especially after defining claims made on consecutive days as part of the same ED visit (sensitivity: 98%, specificity: 98%). The validity of the medical chart was intermediate. Lower sensitivity (or under-reporting) on the self-report appeared to be associated with higher age, low comorbidity and shorter length of recall. The claims database is the most valid method of measuring ED use among seniors in Quebec compared with hospital medical charts and patient-reported use.
Construct Validity of Fresh Frozen Human Cadaver as a Training Model in Minimal Access Surgery
Macafee, David; Pranesh, Nagarajan; Horgan, Alan F.
2012-01-01
Background: The construct validity of fresh human cadaver as a training tool has not been established previously. The aims of this study were to investigate the construct validity of fresh frozen human cadaver as a method of training in minimal access surgery and determine if novices can be rapidly trained using this model to a safe level of performance. Methods: Junior surgical trainees, novices (<3 laparoscopic procedures performed) in laparoscopic surgery, performed 10 repetitions of a set of structured laparoscopic tasks on fresh frozen cadavers. Expert laparoscopists (>100 laparoscopic procedures) performed 3 repetitions of identical tasks. Performances were scored using a validated, objective Global Operative Assessment of Laparoscopic Skills scale. Scores for 3 consecutive repetitions were compared between experts and novices to determine construct validity. Furthermore, to determine if the novices reached a safe level, a trimmed mean of the experts' scores was used to define a benchmark. The Mann-Whitney U test was used for construct validity analysis and a 1-sample t test to compare performances of the novice group with the benchmark safe score. Results: Ten novices and 2 experts were recruited. Four out of 5 tasks (nondominant to dominant hand transfer; simulated appendicectomy; intracorporeal and extracorporeal knot tying) showed construct validity. Novices’ scores became comparable to benchmark scores between the eighth and tenth repetition. Conclusion: Minimal access surgical training using fresh frozen human cadavers appears to have construct validity. The laparoscopic skills of novices can be accelerated through to a safe level within 8 to 10 repetitions. PMID:23318058
Body Dysmorphic Disorder in aesthetic rhinoplasty: Validating a new screening tool.
Lekakis, Garyfalia; Picavet, Valerie A; Gabriëls, Loes; Grietens, Jente; Hellings, Peter W
2016-08-01
To validate a new screening tool for body dysmorphic disorder (BDD) in patients seeking aesthetic rhinoplasty. We performed a prospective instrument validation study in an academic rhinology clinic. The Body Dysmorphic Disorder Questionnaire-Aesthetic Surgery (BDDQ-AS) is a seven-item short questionnaire validated in 116 patients undergoing aesthetic rhinoplasty. Screening was positive if the patient acknowledged on the BDDQ-AS that he/she was concerned about their appearance (question 1 = yes) AND preoccupied with these concerns (question 2 = yes) AND that these concerns caused at least moderate distress or impairment in different domains of daily life (question 3 or 4 or 5 or 6 ≥ 3 or question 7 = yes). Construct validity was assessed by comparing the BDDQ-AS to the Sheehan Disability Scale and the Derriford Appearance Scale-59. To determine concurrent validity, the BDDQ-AS was compared to the Yale-Brown Obsessive Compulsive Scale Modified for BDD. Finally, the predictive value of the BDDQ-AS on satisfaction 12 months after rhinoplasty was evaluated using a visual analogue scale and the Rhinoplasty Outcome Evaluation. Reliability of the BDDQ-AS was adequate, with Cronbach alpha = .83 for rhinoplasty patients and .84 for controls. Sensitivity was 89.6% and specificity 81.4%. BDDQ-AS-positive patients (n = 55) were more impaired in daily life and experienced more appearance-related distress and dysfunction compared to BDDQ-AS-negative patients. Moreover, they had more severe BDD symptoms. Finally, BDDQ-AS-positive patients were less satisfied after surgery compared to BDDQ-AS-negative patients. We hereby validated a new screening tool for BDD in an aesthetic rhinoplasty population. Level of Evidence: 3b. Laryngoscope, 126:1739-1745, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
Jurick, S M; Crocker, L D; Keller, A V; Hoffman, S N; Bomyea, J; Jacobson, M W; Jak, A J
2018-05-30
This study examined the Minnesota Multiphasic Personality Inventory-Second Edition-Restructured Form (MMPI-2-RF) to better understand symptom presentation in a sample of treatment-seeking Operation Enduring Freedom/Operation Iraqi Freedom (OEF/OIF) Veterans with self-reported history of mild traumatic brain injury (mTBI). Participants underwent a comprehensive clinical neuropsychological battery including performance and symptom validity measures and self-report measures of depressive, posttraumatic, and post-concussive symptomatology. Those with possible symptom exaggeration (SE+) on the MMPI-2-RF were compared with those without (SE-) with regard to injury, psychiatric, validity, and cognitive variables. Between 50% and 87% of participants demonstrated possible symptom exaggeration on one or more MMPI-2-RF validity scales, and a large majority were elevated on content scales related to cognitive, somatic, and emotional complaints. The SE+ group reported higher depressive, posttraumatic, and post-concussive symptomatology, had higher scores on symptom validity measures, and performed more poorly on neuropsychological measures compared with the SE- group. There were no group differences with regard to injury variables or performance validity measures. Participants were more likely to exhibit possible symptom exaggeration on cognitive/somatic compared with traditional psychopathological validity scales. A sizable portion of treatment-seeking OEF/OIF Veterans demonstrated possible symptom exaggeration on MMPI-2-RF validity scales, which was associated with elevated scores on self-report measures and poorer cognitive performance, but not higher rates of performance validity failure, suggesting symptom and performance validity are distinct concepts. These findings have implications for the interpretation of clinical data in the context of possible symptom exaggeration and treatment in Veterans with persistent post-concussive symptoms.
ERIC Educational Resources Information Center
Makransky, Guido; Havmose, Philip; Vang, Maria Louison; Andersen, Tonny Elmose; Nielsen, Tine
2017-01-01
The aim of this study was to evaluate the predictive validity of a two-step admissions procedure that included a cognitive ability test followed by multiple mini-interviews (MMIs) used to assess non-cognitive skills, compared to grade-based admissions relative to subsequent drop-out rates and academic achievement after one and two years of study.…
ERIC Educational Resources Information Center
Fulmer, Gavin W.
2015-01-01
This study examines the validity of 2 proposed learning progressions on the force concept when tested using items from the Force Concept Inventory (FCI). This is the first study to compare students' performance with respect to learning progressions both for force and motion and for Newton's third law in parallel. It is also among the first studies…
Whitty, Jennifer A; Oliveira Gonçalves, Ana Sofia
2018-06-01
The aim of this study was to compare the acceptability, validity and concordance of discrete choice experiment (DCE) and best-worst scaling (BWS) stated preference approaches in health. A systematic search of EMBASE, Medline, AMED, PubMed, CINAHL, Cochrane Library and EconLit databases was undertaken in October to December 2016 without date restriction. Studies were included if they were published in English, presented empirical data related to the administration or findings of traditional format DCE and object-, profile- or multiprofile-case BWS, and were related to health. Study quality was assessed using the PREFS checklist. Fourteen articles describing 12 studies were included, comparing DCE with profile-case BWS (9 studies), DCE and multiprofile-case BWS (1 study), and profile- and multiprofile-case BWS (2 studies). Although limited and inconsistent, the balance of evidence suggests that preferences derived from DCE and profile-case BWS may not be concordant, regardless of the decision context. Preferences estimated from DCE and multiprofile-case BWS may be concordant (single study). Profile- and multiprofile-case BWS appear more statistically efficient than DCE, but no evidence is available to suggest they have a greater response efficiency. Little evidence suggests superior validity for one format over another. Participant acceptability may favour DCE, which had a lower self-reported task difficulty and was preferred over profile-case BWS in a priority setting but not necessarily in other decision contexts. DCE and profile-case BWS may be of equal validity but give different preference estimates regardless of the health context; thus, they may be measuring different constructs. Therefore, choice between methods is likely to be based on normative considerations related to coherence with theoretical frameworks and on pragmatic considerations related to ease of data collection.
Tang, Jonathan; Mandrusiak, Allison; Russell, Trevor
2012-01-01
Pulmonary rehabilitation is an effective treatment for people with chronic obstructive pulmonary disease. However, access to these services is limited especially in rural and remote areas. Telerehabilitation has the potential to deliver pulmonary rehabilitation programs to these communities. The aim of this study was threefold: to establish the technical feasibility of transmitting real-time pulse oximetry data, determine the validity of remote measurements compared to conventional face-to-face measures, and evaluate the participants' perception of the usability of the technology. Thirty-seven healthy individuals participated in a single remote pulmonary rehabilitation exercise session, conducted using the eHAB telerehabilitation system. Validity was assessed by comparing the participant's oxygen saturation and heart rate with the data set received at the therapist's remote location. There was an 80% exact agreement between participant and therapist data sets. The mean absolute difference and Bland and Altman's limits of agreement fell within the minimum clinically important difference for both oxygen saturation and heart rate values. Participants found the system easy to use and felt confident that they would be able to use it at home. Remote measurement of pulse oximetry data for a pulmonary rehabilitation exercise session was feasible and valid when compared to conventional face-to-face methods. PMID:23049549
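The agreement statistics named in the abstract above (exact agreement, mean absolute difference, and Bland and Altman's limits of agreement) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and readings are hypothetical, and the conventional 1.96 standard deviation limits are assumed because the abstract does not state the multiplier used.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are taken as bias +/- 1.96 SD of the paired
    differences (assumed convention, not stated in the abstract).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def exact_agreement(a, b):
    """Proportion of paired readings that are identical."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_absolute_difference(a, b):
    """Average size of the paired differences, ignoring sign."""
    return mean(abs(x - y) for x, y in zip(a, b))


# Hypothetical oxygen saturation readings (%), for illustration only.
participant_side = [98, 97, 99, 96, 98]
therapist_side = [98, 97, 98, 96, 99]

bias, lower, upper = bland_altman(participant_side, therapist_side)
agreement = exact_agreement(participant_side, therapist_side)
mad = mean_absolute_difference(participant_side, therapist_side)
print(bias, lower, upper, agreement, mad)
```

Remote measurement would be judged valid in the abstract's sense when both the mean absolute difference and the limits of agreement fall inside the minimum clinically important difference.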
Aalbers, Teun; Baars, Maria A E; Olde Rikkert, Marcel G M; Kessels, Roy P C
2013-12-03
Online interventions are aiming increasingly at cognitive outcome measures but so far no easy and fast self-monitors for cognition have been validated or proven reliable and feasible. This study examines a new instrument called the Brain Aging Monitor-Cognitive Assessment Battery (BAM-COG) for its alternate forms reliability, face and content validity, and convergent and divergent validity. Also, reference values are provided. The BAM-COG consists of four easily accessible, short, yet challenging puzzle games that have been developed to measure working memory ("Conveyer Belt"), visuospatial short-term memory ("Sunshine"), episodic recognition memory ("Viewpoint"), and planning ("Papyrinth"). A total of 641 participants were recruited for this study. Of these, 397 adults, 40 years and older (mean 54.9, SD 9.6), were eligible for analysis. Study participants played all games three times with 14 days in between sets. Face and content validity were based on expert opinion. Alternate forms reliability (AFR) was measured by comparing scores on different versions of the BAM-COG and expressed with an intraclass correlation (ICC: two-way mixed; consistency at 95%). Convergent validity (CV) was provided by comparing BAM-COG scores to gold-standard paper-and-pencil and computer-assisted cognitive assessment. Divergent validity (DV) was measured by comparing BAM-COG scores to the National Adult Reading Test IQ (NART-IQ) estimate. Both CV and DV are expressed as Spearman rho correlation coefficients. Three out of four games showed adequate results on AFR, CV, and DV measures. The games Conveyer Belt, Sunshine, and Papyrinth have AFR ICCs of .420, .426, and .645 respectively. Also, these games had good to very good CV correlations: rho=.577 (P=.001), rho=.669 (P<.001), and rho=.400 (P=.04), respectively. Last, as expected, DV correlations were low: rho=-.029 (P=.44), rho=-.029 (P=.45), and rho=-.134 (P=.28) respectively. 
The game Viewpoint provided less desirable results with an AFR ICC of .167, CV rho=.202 (P=.15), and DV rho=-.162 (P=.21). This study provides evidence for the use of the BAM-COG test battery as a feasible, reliable, and valid tool to monitor cognitive performance in healthy adults in an online setting. Three out of four games have good psychometric characteristics to measure working memory, visuospatial short-term memory, and planning capacity.
Predictors of validity and reliability of a physical activity record in adolescents
2013-01-01
Background Poor to moderate validity of self-reported physical activity instruments is commonly observed in young people in low- and middle-income countries. However, the reasons for such low validity have not been examined in detail. We tested the validity of a self-administered daily physical activity record in adolescents and assessed if personal characteristics or the convenience level of reporting physical activity modified the validity estimates. Methods The study comprised a total of 302 adolescents from an urban and rural area in Ecuador. Validity was evaluated by comparing the record with accelerometer recordings for seven consecutive days. Test-retest reliability was examined by comparing registrations from two records administered three weeks apart. Time spent on sedentary (SED), low (LPA), moderate (MPA) and vigorous (VPA) intensity physical activity was estimated. Bland Altman plots were used to evaluate measurement agreement. We assessed if age, sex, urban or rural setting, anthropometry and convenience of completing the record explained differences in validity estimates using a linear mixed model. Results Although the record provided higher estimates for SED and VPA and lower estimates for LPA and MPA compared to the accelerometer, it showed an overall fair measurement agreement for validity. There was modest reliability for assessing physical activity in each intensity level. Validity was associated with adolescents’ personal characteristics: sex (SED: P = 0.007; LPA: P = 0.001; VPA: P = 0.009) and setting (LPA: P = 0.000; MPA: P = 0.047). Reliability was associated with the convenience of completing the physical activity record for LPA (low convenience: P = 0.014; high convenience: P = 0.045). Conclusions The physical activity record provided acceptable estimates for reliability and validity on a group level. 
Sex and setting were associated with validity estimates, whereas convenience to fill out the record was associated with better reliability estimates for LPA. This tendency of improved reliability estimates for adolescents reporting higher convenience merits further consideration. PMID:24289296
Reliability and validity of an audio signal modified shuttle walk test.
Singla, Rupak; Rai, Richa; Faye, Abhishek Anil; Jain, Anil Kumar; Chowdhury, Ranadip; Bandyopadhyay, Debdutta
2017-01-01
The audio signal in the conventionally accepted protocol of the shuttle walk test (SWT) is not well understood by patients, and modification of the audio signal may improve the performance of the test. The aim of this study is to assess the validity and reliability of an audio signal modified SWT, called the Singla-Richa modified SWT (SWTSR), in healthy normal adults. In SWTSR, the audio signal was modified with the addition of reverse counting to it. A total of 54 healthy normal adults underwent conventional SWT (CSWT) once and SWTSR twice on the same day. The validity was assessed by comparing outcomes of the SWTSR to outcomes of CSWT using the Pearson correlation coefficient and Bland-Altman plot. Test-retest reliability of SWTSR was assessed using the intraclass correlation coefficient (ICC). The acceptability of the modified test in comparison to the conventional test was assessed using a Likert scale. The distance walked (mean ± standard deviation) in the CSWT and SWTSR test was 853.33 ± 217.33 m and 857.22 ± 219.56 m, respectively (Pearson correlation coefficient = 0.98; P < 0.001), indicating SWTSR to be a valid test. The SWTSR was found to be a reliable test with an ICC of 0.98 (95% confidence interval: 0.97-0.99). The acceptability of SWTSR was significantly higher than that of CSWT. The SWTSR, with its audio signal modified by reverse counting, is a reliable as well as a valid test when compared with CSWT in healthy normal adults. It is better understood by subjects compared to CSWT.
Computational fluid dynamics modeling of laboratory flames and an industrial flare.
Singh, Kanwar Devesh; Gangadharan, Preeti; Chen, Daniel H; Lou, Helen H; Li, Xianchang; Richmond, Peyton
2014-11-01
A computational fluid dynamics (CFD) methodology for simulating the combustion process has been validated with experimental results. Three different types of experimental setups were used to validate the CFD model. These setups include an industrial-scale flare setup and two lab-scale flames. The CFD study also involved three different fuels: C3H6/CH/Air/N2, C2H4/O2/Ar and CH4/Air. In the first setup, flare efficiency data from the Texas Commission on Environmental Quality (TCEQ) 2010 field tests were used to validate the CFD model. In the second setup, a McKenna burner with flat flames was simulated. Temperature and mass fractions of important species were compared with the experimental data. Finally, results of an experimental study done at Sandia National Laboratories to generate a lifted jet flame were used for the purpose of validation. The reduced 50 species mechanism, LU 1.1, the realizable k-epsilon turbulence model, and the EDC turbulence-chemistry interaction model were used for this work. Flare efficiency, axial profiles of temperature, and mass fractions of various intermediate species obtained in the simulation were compared with experimental data and a good agreement between the profiles was clearly observed. In particular, the simulation match with the TCEQ 2010 flare tests has been significantly improved (within 5% of the data) compared to the results reported by Singh et al. in 2012. Validation of the speciated flat flame data supports the view that flares can be a primary source of formaldehyde emission.
Cerruto, Maria A.; D'Elia, Carolina; Siracusano, Salvatore; Porcaro, Antonio B.; Cacciamani, Giovanni; De Marchi, Davide; Niero, Mauro; Lonardi, Cristina; Iafrate, Massimo; Bassi, Pierfrancesco; Belgrano, Emanuele; Imbimbo, Ciro; Racioppi, Marco; Talamini, Renato; Ciciliato, Stefano; Toffoli, Laura; Rizzo, Michele; Visalli, Francesco; Verze, Paolo; Artibani, Walter
2017-01-01
Introduction From the most recent systematic review of the literature, an orthotopic neobladder would seem to show marginally better health related quality of life (HR-QoL) scores compared with an ileal conduit. The aim of this study was to review all relevant published studies about the comparison between ileal orthotopic neobladder (IONB) and ileal conduit using validated HR-QoL questionnaires. Materials and Methods Studies were identified by searching multiple literature databases. Data were synthesized using meta-analytic methods conforming to the PRISMA statement. Results The literature search identified 10 papers; pooled effect sizes of combined quality of life outcomes for ileal conduit versus IONB showed a significantly better HR-QoL in patients with IONB (Hedges' g = 0.278; p = 0.000). The present study has an important limitation due to the type of the analyzed comparative studies, all retrospective and not randomized. Conclusion This meta-analysis of non-randomized, retrospective comparative studies on the impact of ileal conduit versus IONB on HR-QoL showed a significant advantage of IONB subgroups. PMID:28785189
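The effect size reported above, Hedges' g, is a standardized mean difference with a small-sample correction. The sketch below uses the common textbook definition (pooled standard deviation and the approximate correction factor 1 - 3/(4(n1 + n2) - 9)); the abstract does not say exactly how g was computed or pooled, so this is an assumption, and the group summaries are hypothetical.

```python
from math import sqrt


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference with small-sample bias correction."""
    # Pooled standard deviation across the two groups.
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / pooled_sd           # Cohen's d
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # approximate correction factor J
    return correction * d


# Hypothetical HR-QoL summaries for two groups, for illustration only.
g = hedges_g(mean1=70.0, sd1=10.0, n1=20, mean2=65.0, sd2=10.0, n2=20)
print(round(g, 3))
```

A positive g here means the first group scored higher; in a meta-analysis each study's g would then be weighted and pooled, which this sketch does not attempt.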
Bentzen, S M R; Knudsen, V K; Christiensen, T; Ewers, B
2016-01-01
Background: Diet has an important role in the management of diabetes. However, little is known about dietary intake in Danish diabetes patients. A food frequency questionnaire (FFQ) focusing on most relevant nutrients in diabetes including carbohydrates, dietary fibres and simple sugars was developed and validated. Objectives: To examine the relative validity of nutrients calculated by a web-based food frequency questionnaire for patients with diabetes. Design: The FFQ was validated against a 4-day pre-coded food diary (FD). Intakes of nutrients were calculated. Means of intake were compared and cross-classifications of individuals according to intake were performed. To assess the agreement between the two methods, Pearson and Spearman's correlation coefficients and weighted kappa coefficients were calculated. Subjects: Ninety patients (64 with type 1 diabetes and 26 with type 2 diabetes) accepted to participate in the study. Twenty-six were excluded from the final study population. Setting: 64 volunteer diabetes patients at the Steno Diabetes Center. Results: Intakes of carbohydrates, simple sugars, dietary fibres and total energy were higher according to the FFQ compared with the FD. However, intakes of nutrients were grossly classified in the same or adjacent quartiles with an average of 82% of the selected nutrients when comparing the two methods. In general, moderate agreement between the two methods was found. Conclusion: The FFQ was validated for assessment of a range of nutrients. Comparing the intakes of selected nutrients (carbohydrates, dietary fibres and simple sugars), patients were classified correctly according to low and high intakes. The FFQ is a reliable dietary assessment tool to use in research and evaluation of patient education for patients with diabetes. PMID:27669176
Commissioning and validation of COMPASS system for VMAT patient specific quality assurance
NASA Astrophysics Data System (ADS)
Pimthong, J.; Kakanaporn, C.; Tuntipumiamorn, L.; Laojunun, P.; Iampongpaiboon, P.
2016-03-01
Pre-treatment patient-specific quality assurance (QA) of advanced treatment techniques such as volumetric modulated arc therapy (VMAT) is an important part of QA in radiotherapy, and a fast and reliable dosimetric device is required. The objective of this study is to commission and validate the performance of the COMPASS system for dose verification of the VMAT technique. The COMPASS system is composed of an array of ionization detectors (MatriXX) mounted to the gantry using a custom holder and software for the analysis and visualization of QA results. We validated the COMPASS software for basic and advanced clinical applications. For the basic clinical study, simple open fields of various sizes were validated in a homogeneous phantom. For the advanced clinical application, fifteen prostate and fifteen nasopharyngeal cancer VMAT plans were chosen for study. The treatment plans were measured with the MatriXX. The doses and dose-volume histograms (DVHs) reconstructed from the fluence measurements were compared to the TPS-calculated plans. In addition, the doses and DVHs computed using the collapsed cone convolution (CCC) algorithm were compared with Eclipse TPS plans calculated using the Analytical Anisotropic Algorithm (AAA), according to the dose specified in ICRU 83 for the PTV.
2013-01-01
Background In recent years response rates on telephone surveys have been declining. Rates for the behavioral risk factor surveillance system (BRFSS) have also declined, prompting the use of new methods of weighting and the inclusion of cell phone sampling frames. A number of scholars and researchers have conducted studies of the reliability and validity of the BRFSS estimates in the context of these changes. As the BRFSS makes changes in its methods of sampling and weighting, a review of reliability and validity studies of the BRFSS is needed. Methods In order to assess the reliability and validity of prevalence estimates taken from the BRFSS, scholarship published from 2004–2011 dealing with tests of reliability and validity of BRFSS measures was compiled and presented by topics of health risk behavior. Assessments of the quality of each publication were undertaken using a categorical rubric. Higher rankings were achieved by authors who conducted reliability tests using repeated test/retest measures, or who conducted tests using multiple samples. A similar rubric was used to rank validity assessments. Validity tests which compared the BRFSS to physical measures were ranked higher than those comparing the BRFSS to other self-reported data. Literature which undertook more sophisticated statistical comparisons was also ranked higher. Results Overall findings indicated that BRFSS prevalence rates were comparable to other national surveys which rely on self-reports, although specific differences are noted for some categories of response. BRFSS prevalence rates were less similar to surveys which utilize physical measures in addition to self-reported data. There is very little research on reliability and validity for some health topics, but a great deal of information supporting the validity of the BRFSS data for others. 
Conclusions Limitations of the examination of the BRFSS were due to question differences among surveys used as comparisons, as well as mode of data collection differences. As the BRFSS moves to incorporating cell phone data and changing weighting methods, a review of reliability and validity research indicated that past BRFSS landline only data were reliable and valid as measured against other surveys. New analyses and comparisons of BRFSS data which include the new methodologies and cell phone data will be needed to ascertain the impact of these changes on estimates in the future. PMID:23522349
NASA Astrophysics Data System (ADS)
Nelson, B. A.; Akcay, C.; Glasser, A. H.; Hansen, C. J.; Jarboe, T. R.; Marklin, G. J.; Milroy, R. D.; Morgan, K. D.; Norgaard, P. C.; Shumlak, U.; Sutherland, D. A.; Victor, B. S.; Sovinec, C. R.; O'Bryan, J. B.; Held, E. D.; Ji, J.-Y.; Lukin, V. S.
2014-10-01
The Plasma Science and Innovation Center (PSI-Center - http://www.psicenter.org) supports collaborating validation platform experiments with 3D extended MHD simulations using the NIMROD, HiFi, and PSI-TET codes. Collaborators include the Bellan Plasma Group (Caltech), CTH (Auburn U), HBT-EP (Columbia), HIT-SI (U Wash-UW), LTX (PPPL), MAST (Culham), Pegasus (U Wisc-Madison), SSX (Swarthmore College), TCSU (UW), and ZaP/ZaP-HD (UW). The PSI-Center is exploring application of validation metrics between experimental data and simulation results. Biorthogonal decomposition (BOD) is used to compare experiments with simulations. BOD separates data sets into spatial and temporal structures, giving greater weight to dominant structures. Several BOD metrics are being formulated with the goal of quantitative validation. Results from these simulation and validation studies, as well as an overview of the PSI-Center status, will be presented.
Lee, Chun Fan; Ng, Raymond; Luo, Nan; Wong, Nan Soon; Yap, Yoon Sim; Lo, Soo Kien; Chia, Whay Kuang; Yee, Alethea; Krishna, Lalit; Wong, Celest; Goh, Cynthia; Cheung, Yin Bun
2013-01-01
To examine the measurement properties of and comparability between the English and Chinese versions of the five-level EuroQoL Group's five-dimension questionnaire (EQ-5D) in breast cancer patients in Singapore. This is an observational study of 269 patients. Known-group validity and responsiveness of the EQ-5D utility index and visual analog scale (VAS) were assessed in relation to various clinical characteristics and longitudinal change in performance status, respectively. Convergent and divergent validity was examined by correlation coefficients between the EQ-5D and a breast cancer-specific instrument. Test-retest reliability was evaluated. The two language versions were compared by multiple regression analyses. For both English and Chinese versions, the EQ-5D utility index and VAS demonstrated known-group validity and convergent and divergent validity, and presented sufficient test-retest reliability (intraclass correlation = 0.72 to 0.83). The English version was responsive to changes in performance status. The Chinese version was responsive to decline in performance status, but there was no conclusive evidence about its responsiveness to improvement in performance status. In the comparison analyses of the utility index and VAS between the two language versions, borderline results were obtained, and equivalence cannot be definitely confirmed. The five-level EQ-5D is valid, responsive, and reliable in assessing health outcome of breast cancer patients. The English and Chinese versions provide comparable measurement results.
Comparison of C5 and C6 Aqua-MODIS Dark Target Aerosol Validation
NASA Technical Reports Server (NTRS)
Munchak, Leigh A.; Levy, Robert C.; Mattoo, Shana
2014-01-01
We compare C5 and C6 validation in order to evaluate the C6 10 km aerosol product against the well-validated and trusted C5 aerosol product on global and regional scales. Only the 10 km aerosol product is evaluated in this study; validation of the new C6 3 km aerosol product still needs to be performed. Not all of the time series has been processed yet for C5 or C6, and the years processed for the two products are not exactly the same (this work is preliminary). To reduce the impact of outlier observations, MODIS is spatially averaged within 27.5 km of the AERONET site, and AERONET is temporally averaged within 30 minutes of the MODIS overpass time. Only high-quality (QA = 3 over land, QA greater than 0 over ocean) pixels are included in the mean.
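The collocation rule described above (spatial averaging of satellite pixels within 27.5 km of the ground site, temporal averaging of ground observations within 30 minutes of the overpass, and a quality filter) can be sketched as follows. This is an illustrative sketch, not the authors' processing code: the function names, tuple layouts, the haversine distance, the Earth radius, and all sample values are assumptions.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import mean

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def collocate(site_lat, site_lon, overpass_min, pixels, ground_obs,
              over_land=True, radius_km=27.5, window_min=30.0):
    """Return (satellite mean, ground mean), or None if either side is empty.

    pixels:     iterable of (lat, lon, aod, qa) satellite retrievals
    ground_obs: iterable of (time_min, aod) ground observations
    """
    # Quality filter from the abstract: QA = 3 over land, QA > 0 over ocean.
    qa_ok = (lambda qa: qa == 3) if over_land else (lambda qa: qa > 0)
    sat = [aod for lat, lon, aod, qa in pixels
           if qa_ok(qa) and haversine_km(site_lat, site_lon, lat, lon) <= radius_km]
    ground = [aod for t, aod in ground_obs if abs(t - overpass_min) <= window_min]
    if not sat or not ground:
        return None
    return mean(sat), mean(ground)


# Hypothetical retrievals around a site at (0, 0); values are illustrative only.
pixels = [(0.0, 0.1, 0.2, 3), (0.0, 0.2, 0.4, 3), (0.0, 1.0, 0.9, 3), (0.0, 0.05, 0.8, 1)]
ground_obs = [(100, 0.25), (120, 0.35), (200, 0.9)]
result = collocate(0.0, 0.0, overpass_min=110, pixels=pixels, ground_obs=ground_obs)
print(result)
```

In this example the third pixel is dropped for distance, the fourth for quality, and the third ground observation for time, leaving two values on each side of the comparison.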
Brewin, James; Tang, Jessica; Dasgupta, Prokar; Khan, Muhammad S; Ahmed, Kamran; Bello, Fernando; Kneebone, Roger; Jaye, Peter
2015-07-01
To evaluate the face, content and construct validity of the distributed simulation (DS) environment for technical and non-technical skills training in endourology. To evaluate the educational impact of DS for urology training. DS offers a portable, low-cost simulated operating room environment that can be set up in any open space. A prospective mixed methods design using established validation methodology was conducted in this simulated environment with 10 experienced and 10 trainee urologists. All participants performed a simulated prostate resection in the DS environment. Outcome measures included surveys to evaluate the DS, as well as comparative analyses of experienced and trainee urologists' performance using real-time and 'blinded' video analysis and validated performance metrics. Non-parametric statistical methods were used to compare differences between groups. The DS environment demonstrated face, content and construct validity for both non-technical and technical skills. Kirkpatrick level 1 evidence for the educational impact of the DS environment was shown. Further studies are needed to evaluate the effect of simulated operating room training on real operating room performance. This study has shown the validity of the DS environment for non-technical, as well as technical skills training. DS-based simulation appears to be a valuable addition to traditional classroom-based simulation training. © 2014 The Authors BJU International © 2014 BJU International Published by John Wiley & Sons Ltd.
Measuring striving for understanding and learning value of geometry: a validity study
NASA Astrophysics Data System (ADS)
Ubuz, Behiye; Aydınyer, Yurdagül
2017-11-01
The current study aimed to construct a questionnaire that measures students' personality traits related to striving for understanding and learning value of geometry and then examine its psychometric properties. Through the use of multiple methods on two independent samples of 402 and 521 middle school students, two studies were performed to address this issue to provide support for its validity. In Study 1, exploratory factor analysis indicated the two-factor model. In Study 2, confirmatory factor analysis indicated the better fit of two-factor model compared to one or three-factor model. Convergent and discriminant validity evidence provided insight into the distinctiveness of the two factors. Subgroup validity evidence revealed gender differences for striving for understanding geometry trait favouring girls and grade level differences for learning value of geometry trait favouring the sixth- and seventh-grade students. Predictive validity evidence demonstrated that the striving for understanding geometry trait but not learning value of geometry trait was significantly correlated with prior mathematics achievement. In both studies, each factor and the entire questionnaire showed satisfactory reliability. In conclusion, the questionnaire was psychometrically sound.
External validation of preexisting first trimester preeclampsia prediction models.
Allen, Rebecca E; Zamora, Javier; Arroyo-Manzano, David; Velauthar, Luxmilar; Allotey, John; Thangaratinam, Shakila; Aquilina, Joseph
2017-10-01
To validate the increasing number of prognostic models being developed for preeclampsia using our own prospective study. A systematic review of literature that assessed biomarkers, uterine artery Doppler and maternal characteristics in the first trimester for the prediction of preeclampsia was performed and models were selected based on predefined criteria. Validation was performed by applying the regression coefficients that were published in the different derivation studies to our cohort. We assessed the models' discrimination ability and calibration. Twenty models were identified for validation. The discrimination ability observed in derivation studies (area under the curve, AUC) ranged from 0.70 to 0.96. When these models were validated against the validation cohort, the AUCs varied considerably, ranging from 0.504 to 0.833. Comparing the AUCs obtained in the derivation studies with those in the validation cohort, we found statistically significant differences in several studies. There currently isn't a definitive prediction model with adequate ability to discriminate for preeclampsia, which performs as well when applied to a different population and can differentiate well between the highest and lowest risk groups within the tested population. The pre-existing large number of models limits the value of further model development and future research should be focussed on further attempts to validate existing models and assessing whether implementation of these improves patient care. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
Johnson, Sheena Joanne; Guediri, Sara M; Kilkenny, Caroline; Clough, Peter J
2011-12-01
This study developed and validated a virtual reality (VR) simulator for use by interventional radiologists. Research in the area of skill acquisition reports practice as essential to become a task expert. Studies on simulation show skills learned in VR can be successfully transferred to a real-world task. Recently, with improvements in technology, VR simulators have been developed to allow complex medical procedures to be practiced without risking the patient. Three studies are reported. In Study 1, 35 consultant interventional radiologists took part in a cognitive task analysis to empirically establish the key competencies of the Seldinger procedure. In Study 2, 62 participants performed one simulated procedure, and their performance was compared by expertise. In Study 3, the transferability of simulator training to a real-world procedure was assessed with 14 trainees. Study 1 produced 23 key competencies that were implemented as performance measures in the simulator. Study 2 showed the simulator had both face and construct validity, although some issues were identified. Study 3 showed the group that had undergone simulator training received significantly higher mean performance ratings on a subsequent patient procedure. The findings of this study support the centrality of validation in the successful design of simulators and show the utility of simulators as a training device. The studies show the key elements of a validation program for a simulator. In addition to task analysis and face and construct validities, the authors highlight the importance of transfer of training in validation studies.
Wakefield, Jerome C; Schmitz, Mark F
2017-04-01
"Complicated" subthreshold depression (CsD) includes at least one of six pathosuggestive "complicated" symptoms: >6 months duration, marked role impairment, sense of worthlessness, suicidal ideation, psychotic ideation, and psychomotor retardation. "Uncomplicated" subthreshold depression (UsD) has no complicated features. Whereas studies show that complicated (CMDD) versus uncomplicated (UMDD) major depression differ substantially in severity and prognosis, UsD and CsD severity has not been previously compared. This study evaluates UsD and CsD pathology validator levels and examines whether the complicated/uncomplicated distinction offers incremental concurrent validity over the standard number-of-symptoms dimension as a depression severity measure. Using nationally representative community data from the National Comorbidity Survey, seven depression lifetime history subgroups were identified: one MDD screener symptom (n=1432); UsD (n=430); CsD (n=611); UMDD (n=182); and CMDD with 5-6 symptoms (n=518), 7 symptoms (n=217), and 8-9 symptoms (n=291). Severity was evaluated using five concurrent pathology validators: suicide attempt, interference with life, help seeking, hospitalization, and generalized anxiety disorder. CsD validator levels are substantially higher than both UsD and UMDD levels, and similar to mild CMDD, disconfirming the "monotonicity thesis" that severity increase with symptom number. Complicated/uncomplicated status predicts severity, and when complicatedness is controlled, number of symptoms no longer predicts validator levels. Diagnoses were based on respondents' fallible retrospective symptom reports during a lay-administered structured interview, which may not yield diagnoses comparable to clinicians' assessments. CsD is more severe than UsD and comparable to mild MDD. Complicated status more validly indicates depression severity than the standard number-of-symptoms measure. Copyright © 2017 Elsevier B.V. All rights reserved.
Predictive Validity of DSM-IV and ICD-10 Criteria for ADHD and Hyperkinetic Disorder
ERIC Educational Resources Information Center
Lee, Soyoung I.; Schachar, Russell J.; Chen, Shirley X.; Ornstein, Tisha J.; Charach, Alice; Barr, Cathy; Ickowicz, Abel
2008-01-01
Background: The goal of this study was to compare the predictive validity of the two main diagnostic schemata for childhood hyperactivity--attention-deficit hyperactivity disorder (ADHD; "Diagnostic and Statistical Manual"-IV) and hyperkinetic disorder (HKD; "International Classification of Diseases"-10th Edition). Methods: Diagnostic criteria for…
Training Objectives, Transfer, Validation and Evaluation: A Sri Lankan Study
ERIC Educational Resources Information Center
Wickramasinghe, Vathsala M.
2006-01-01
Using a stratified random sample, this paper examines the training practices of setting objectives, transfer, validation and evaluation in Sri Lanka. The paper further sets out to compare those practices across local, foreign and joint-venture companies based on the assumption that there may be significant differences across companies of different…
A Comparison between SRSS-IE and SSiS-PSG Scores: Examining Convergent Validity
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Common, Eric Alan; Zorigian, Kris; Brunsting, Nelson C.; Schatschneider, Christopher
2015-01-01
We report findings of a validation study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE, an adapted version of the Student Risk Screening Scale) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG). Participants included 458 kindergarten through fifth-grade…
Additional Evidence of Convergent Validity between SRSS-IE and SSiS-PSG Scores
ERIC Educational Resources Information Center
Lane, Kathleen Lynne; Oakes, Wendy Peia; Ennis, Robin Parks; Royer, David James
2015-01-01
We report findings of a validity study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG; Elliott & Gresham, 2007). Participants were 1,680 kindergarten through sixth-grade elementary students from three…
USDA-ARS's Scientific Manuscript database
The aim of this study was to determine the validity of energy intake (EI) estimations made using the remote food photography method (RFPM) compared to the doubly labeled water (DLW) method in minority preschool children in a free-living environment. Seven days of food intake and spot urine samples...
Cocited Author Mapping as a Valid Representation of Intellectual Structure.
ERIC Educational Resources Information Center
McCain, Katherine W.
1986-01-01
To test validity of cocitation studies as representations of intellectual structure, five-six years of aggregate cocitation data for 41 authors in macroeconomics and 49 authors in genetics of fruit flies were compared with independent judgments of interauthor similarity collected from 14 macroeconomists and 15 geneticists via a card-sorting…
Maternal attitudes, depression, and anxiety in pregnant and postpartum multiparous women.
Sockol, Laura E; Battle, Cynthia L
2015-08-01
The Attitudes Toward Motherhood (AToM) Scale was developed to assess women's beliefs about motherhood, a specific risk factor for emotional distress in perinatal populations. As the measure was initially developed and validated for use among first-time mothers, this study assessed the reliability and validity of the AToM Scale in a sample of multiparous women. Maternal attitudes were significantly associated with symptoms of depression, even after controlling for demographic, cognitive, and interpersonal risk factors. Maternal attitudes were also associated with symptoms of anxiety after controlling for demographic risk factors, but this association was not significant after accounting for cognitive and interpersonal risk factors. Compared to primiparous women from the initial validation study of the AToM Scale, multiparous women reported lower levels of social support and marital satisfaction. The relationships between cognitive and interpersonal risk factors and symptoms of depression and anxiety were comparable between multiparous and primiparous women.
Ethinylestradiol and levonorgestrel preparations on the Belgian market: a comparative study.
Vanheusden, V; De Braekeleer, K; Corthout, J
2012-03-01
Preparations formulated as coated or film-coated tablets, containing levonorgestrel and the combination ethinylestradiol/levonorgestrel, were evaluated in a comparative study. This study comprised in vitro dissolution, assay and content uniformity. The analytical methods were previously validated according to international guidelines. All examined products complied with the postulated requirements.
A Spatio-Temporal Approach for Global Validation and Analysis of MODIS Aerosol Products
NASA Technical Reports Server (NTRS)
Ichoku, Charles; Chu, D. Allen; Mattoo, Shana; Kaufman, Yoram J.; Remer, Lorraine A.; Tanre, Didier; Slutsker, Ilya; Holben, Brent N.; Lau, William K. M. (Technical Monitor)
2001-01-01
With the launch of the MODIS sensor on the Terra spacecraft, new data sets of the global distribution and properties of aerosol are being retrieved, and need to be validated and analyzed. A system has been put in place to generate spatial statistics (mean, standard deviation, direction and rate of spatial variation, and spatial correlation coefficient) of the MODIS aerosol parameters over more than 100 validation sites spread around the globe. Corresponding statistics are also computed from temporal subsets of AERONET-derived aerosol data. The means and standard deviations of identical parameters from MODIS and AERONET are compared. Although their means compare favorably, their standard deviations reveal some influence of surface effects on the MODIS aerosol retrievals over land, especially at low aerosol loading. The direction and rate of spatial variation from MODIS are used to study the spatial distribution of aerosols at various locations either individually or comparatively. This paper introduces the methodology for generating and analyzing the data sets used by the two MODIS aerosol validation papers in this issue.
Setoguchi, Soko; Zhu, Ying; Jalbert, Jessica J; Williams, Lauren A; Chen, Chih-Ying
2014-05-01
Linking patient registries with administrative databases can enhance the utility of the databases for epidemiological and comparative effectiveness research. However, registries often lack direct personal identifiers, and the validity of record linkage using multiple indirect personal identifiers is not well understood. Using a large contemporary national cardiovascular device registry and 100% Medicare inpatient data, we linked hospitalization-level records. The main outcomes were the validity measures of several deterministic linkage rules using multiple indirect personal identifiers compared with rules using both direct and indirect personal identifiers. Linkage rules using 2 or 3 indirect, patient-level identifiers (ie, date of birth, sex, admission date) and hospital ID produced linkages with sensitivity of 95% and specificity of 98% compared with a gold standard linkage rule using a combination of both direct and indirect identifiers. Ours is the first large-scale study to validate the performance of deterministic linkage rules without direct personal identifiers. When linking hospitalization-level records in the absence of direct personal identifiers, provider information is necessary for successful linkage. © 2014 American Heart Association, Inc.
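The linkage approach described above can be illustrated with a minimal sketch. It assumes hypothetical field names (`date_of_birth`, `sex`, `admission_date`, `hospital_id`) and toy data; it is not the authors' implementation, only an example of an exact-match deterministic rule and of how sensitivity and specificity could be computed against a gold standard linkage.

```python
def links(registry_record, claims_record,
          keys=("date_of_birth", "sex", "admission_date", "hospital_id")):
    """Deterministic rule: link two records only if every key matches exactly.

    The key names are hypothetical placeholders for indirect identifiers.
    """
    return all(registry_record[k] == claims_record[k] for k in keys)


def sensitivity_specificity(predicted, gold):
    """Compare predicted link decisions with gold standard decisions.

    Both arguments are equal-length sequences of booleans, one per candidate pair.
    """
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)          # linked, truly a match
    fn = sum(1 for p, g in pairs if not p and g)      # missed true match
    tn = sum(1 for p, g in pairs if not p and not g)  # correctly not linked
    fp = sum(1 for p, g in pairs if p and not g)      # false link
    return tp / (tp + fn), tn / (tn + fp)
```

Dropping `hospital_id` from `keys` gives a rule without provider information, which could be scored the same way to compare rules.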
Failure mode and effects analysis outputs: are they valid?
Shebl, Nada Atef; Franklin, Bryony Dean; Barber, Nick
2012-06-10
Failure Mode and Effects Analysis (FMEA) is a prospective risk assessment tool that has been widely used within the aerospace and automotive industries and has been utilised within healthcare since the early 1990s. The aim of this study was to explore the validity of FMEA outputs within a hospital setting in the United Kingdom. Two multidisciplinary teams each conducted an FMEA for the use of vancomycin and gentamicin. Four different validity tests were conducted: Face validity: by comparing the FMEA participants' mapped processes with observational work. Content validity: by presenting the FMEA findings to other healthcare professionals. Criterion validity: by comparing the FMEA findings with data reported on the trust's incident report database. Construct validity: by exploring the relevant mathematical theories involved in calculating the FMEA risk priority number. Face validity was positive as the researcher documented the same processes of care as mapped by the FMEA participants. However, other healthcare professionals identified potential failures missed by the FMEA teams. Furthermore, the FMEA groups failed to include failures related to omitted doses; yet these were the failures most commonly reported in the trust's incident database. Calculating the RPN by multiplying severity, probability and detectability scores was deemed invalid because it is based on calculations that breach the mathematical properties of the scales used. There are significant methodological challenges in validating FMEA. It is a useful tool to aid multidisciplinary groups in mapping and understanding a process of care; however, the results of our study cast doubt on its validity. FMEA teams are likely to need different sources of information, besides their personal experience and knowledge, to identify potential failures. 
As for FMEA's methodology for scoring failures, there were discrepancies between the teams' estimates and similar incidents reported on the trust's incident database. Furthermore, the concept of multiplying ordinal scales to prioritise failures is mathematically flawed. Until FMEA's validity is further explored, healthcare organisations should not solely depend on their FMEA results to prioritise patient safety issues.
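The concern about multiplying ordinal scores can be illustrated with a small sketch. The scores below are invented for illustration and are not taken from the study. An ordinal scale carries only order information, so any order-preserving relabelling of the scores should leave a valid prioritisation unchanged; in this example, ranking by the product does not have that property.

```python
def rpn(severity, probability, detectability):
    """Risk priority number: the product of the three FMEA scores."""
    return severity * probability * detectability


def shift(scores, offset):
    """Order-preserving relabelling of ordinal scores (adds a constant)."""
    return tuple(s + offset for s in scores)


# Two hypothetical failure modes: (severity, probability, detectability).
failure_a = (9, 2, 3)
failure_b = (3, 5, 4)

# With the original labels, B outranks A (54 versus 60).
original_ranking = rpn(*failure_a) < rpn(*failure_b)

# After relabelling every score as score + 5, A outranks B (784 versus 720).
shifted_ranking = rpn(*shift(failure_a, 5)) < rpn(*shift(failure_b, 5))
```

Here `original_ranking` and `shifted_ranking` disagree even though the relabelling preserved the order of every individual score.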
Mitchell, Travis D.; Urli, Kristina E.; Breitenbach, Jacques; Yelverton, Chris
2007-01-01
Objective: This study aimed to evaluate the validity of the sacral base pressure test in diagnosing sacroiliac joint dysfunction. It also determined the predictive powers of the test in determining which type of sacroiliac joint dysfunction was present. Methods: This was a double-blind experimental study with 62 participants. The results from the sacral base pressure test were compared against a cluster of previously validated tests of sacroiliac joint dysfunction to determine its validity and predictive powers. The external rotation of the feet, occurring during the sacral base pressure test, was measured using a digital inclinometer. Results: There was no statistically significant difference in the results of the sacral base pressure test between the types of sacroiliac joint dysfunction. In terms of the results of validity, the sacral base pressure test was useful in identifying positive values of sacroiliac joint dysfunction. It was fairly helpful in correctly diagnosing patients with negative test results; however, it had only a “slight” agreement with the diagnosis for κ interpretation. Conclusions: In this study, the sacral base pressure test was not a valid test for determining the presence of sacroiliac joint dysfunction or the type of dysfunction present. Further research comparing the agreement of the sacral base pressure test or other sacroiliac joint dysfunction tests with a criterion standard of diagnosis is necessary. PMID:19674694
The risk of bias in systematic reviews tool showed fair reliability and good construct validity.
Bühn, Stefanie; Mathes, Tim; Prengel, Peggy; Wegewitz, Uta; Ostermann, Thomas; Robens, Sibylle; Pieper, Dawid
2017-11-01
There is a movement from generic quality checklists toward a more domain-based approach in critical appraisal tools. This study aimed to report on a first experience with the newly developed risk of bias in systematic reviews (ROBIS) tool and compare it with A Measurement Tool to Assess Systematic Reviews (AMSTAR), the most commonly used tool to assess the methodological quality of systematic reviews, while assessing validity, reliability, and applicability. Validation study with four reviewers based on 16 systematic reviews in the field of occupational health. Interrater reliability (IRR) of all four raters was highest for domain 2 (Fleiss' kappa κ = 0.56) and lowest for domain 4 (κ = 0.04). For ROBIS, median IRR was κ = 0.52 (range 0.13-0.88) for the experienced pair of raters compared to κ = 0.32 (range 0.12-0.76) for the less experienced pair of raters. The percentage of "yes" scores of each review of ROBIS ratings was strongly correlated with the AMSTAR ratings (r_s = 0.76; P = 0.01). ROBIS has fair reliability and good construct validity to assess the risk of bias in systematic reviews. More validation studies are needed to investigate reliability and applicability, in particular. Copyright © 2017 Elsevier Inc. All rights reserved.
The Development of Learning Management System Using Edmodo
NASA Astrophysics Data System (ADS)
Joko; Septia Wulandari, Gayuh
2018-04-01
A Learning Management System (LMS) can serve as an online learning medium that helps the teacher deliver material and assign tasks. This study aims to: 1) determine the validity of learning devices using an LMS with Edmodo, 2) determine students' responses to LMS implementation using Edmodo, and 3) determine the difference in learning outcomes between students taught using an LMS with Edmodo and students taught with the Direct Learning Model (DLM). The research method is quasi-experimental, using a control group pretest-posttest design. The study population was the students at SMKN 1 Sidoarjo. The research sample was the X TITL 1 class as the control group and the X TITL 2 class as the experimental group. The researchers used a rating scale to analyze the validity data and the students' responses, and a t-test was used to examine the difference in learning outcomes at a significance level of 0.05. The results show: 1) the average validity of the learning devices using Edmodo is 88.14%, lesson plan validity is 92.45%, pretest-posttest validity is 89.15%, learning material validity is 84.64%, and the validity of the affective and psychomotor-portfolio observation sheets is 86.33, all of which meet the very good criterion, that is, very suitable for use in research; 2) the students' response questionnaire after being taught using the LMS with Edmodo scored 86.03%, in the very good category, and students agreed that Edmodo can be used in learning; and 3) for learning outcomes with the LMS using Edmodo versus DLM: a) there is a significant difference in cognitive learning outcomes between students taught using Edmodo and students taught using DLM. The average learning outcome of students taught with the LMS using Edmodo is 81.69, compared with 76.39 for students taught with DLM; b) there is a difference in affective learning outcomes between students taught with the LMS using Edmodo and students taught with DLM. The average affective learning outcome of students taught with the LMS using Edmodo is 83.50, compared with 80.34 for students taught with DLM; and c) there is a difference in psychomotor learning outcomes between students taught with the LMS using Edmodo and students taught with DLM. The average learning outcome of students taught with the LMS using Edmodo is 85.60, compared with 82.31 for students taught with DLM.
DOT National Transportation Integrated Search
2016-12-01
This research project is a continuation of a previous NITC-funded study. The first study compared the MacArthur Park TOD in Los Angeles to the Fruitvale Village TOD in Oakland. The findings from this new study further validate the key findings from...
Validation of the TTM processes of change measure for physical activity in an adult French sample.
Bernard, Paquito; Romain, Ahmed-Jérôme; Trouillet, Raphael; Gernigon, Christophe; Nigg, Claudio; Ninot, Gregory
2014-04-01
Processes of change (POC) are constructs from the transtheoretical model that propose to examine how people engage in a behavior. However, there is no consensus about a leading model explaining POC and there is no validated French POC scale in physical activity. This study aimed to compare the different existing models to validate a French POC scale. Three studies, with 748 subjects included, were carried out to translate the items and evaluate their clarity (study 1, n = 77), to assess the factorial validity (n = 200) and invariance/equivalence (study 2, n = 471), and to analyze the concurrent validity by stage × process analyses (study 3, n = 671). Two models displayed adequate fit to the data; however, based on the Akaike information criterion, the fully correlated five-factor model appeared as the most appropriate to measure POC in physical activity. The invariance/equivalence was also confirmed across genders and student status. Four of the five existing factors discriminated pre-action and post-action stages. These data support the validation of the POC questionnaire in physical activity among a French sample. More research is needed to explore the longitudinal properties of this scale.
Piette, Elizabeth R; Moore, Jason H
2018-01-01
Machine learning methods and conventions are increasingly employed for the analysis of large, complex biomedical data sets, including genome-wide association studies (GWAS). Reproducibility of machine learning analyses of GWAS can be hampered by biological and statistical factors, particularly so for the investigation of non-additive genetic interactions. Application of traditional cross validation to a GWAS data set may result in poor consistency between the training and testing data set splits due to an imbalance of the interaction genotypes relative to the data as a whole. We propose a new cross validation method, proportional instance cross validation (PICV), that preserves the original distribution of an independent variable when splitting the data set into training and testing partitions. We apply PICV to simulated GWAS data with epistatic interactions of varying minor allele frequencies and prevalences and compare performance to that of a traditional cross validation procedure in which individuals are randomly allocated to training and testing partitions. Sensitivity and positive predictive value are significantly improved across all tested scenarios for PICV compared to traditional cross validation. We also apply PICV to GWAS data from a study of primary open-angle glaucoma to investigate a previously-reported interaction, which fails to significantly replicate; PICV however improves the consistency of testing and training results. Application of traditional machine learning procedures to biomedical data may require modifications to better suit intrinsic characteristics of the data, such as the potential for highly imbalanced genotype distributions in the case of epistasis detection. The reproducibility of genetic interaction findings can be improved by considering this variable imbalance in cross validation implementation, such as with PICV. This approach may be extended to problems in other domains in which imbalanced variable distributions are a concern.
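The idea of preserving a variable's distribution when splitting data can be sketched as follows. This is a generic proportional (stratified) split written for illustration, with a hypothetical function name and toy genotype labels; it is not the authors' PICV implementation, whose details are not given in the abstract.

```python
import random
from collections import defaultdict


def proportional_split(values, test_fraction=0.2, seed=0):
    """Split row indices so each level of `values` keeps roughly the same
    share in the training and testing partitions."""
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for index, value in enumerate(values):
        by_level[value].append(index)

    train, test = [], []
    for indices in by_level.values():
        rng.shuffle(indices)  # random allocation within each level
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy data: an imbalanced genotype variable (80 / 15 / 5 individuals).
genotypes = ["AA"] * 80 + ["Aa"] * 15 + ["aa"] * 5
train_idx, test_idx = proportional_split(genotypes, test_fraction=0.2)
```

A purely random split of the same data could place all five rare `aa` individuals in one partition, which is the kind of imbalance the abstract describes.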
ERIC Educational Resources Information Center
Watkins, Nicholas; Rapp, John T.
2013-01-01
Only a few studies have compared the convergent validity of the Questions About Behavioral Function (QABF) scale to the results of functional analyses (FA). In the current study, six participants who emitted problem behaviors participated in either a brief, or a no-interaction-series FA, while each participant's parent completed the QABF. The…
Boer, Annemarie; Dutmer, Alisa L; Schiphorst Preuper, Henrica R; van der Woude, Lucas H V; Stewart, Roy E; Deyo, Richard A; Reneman, Michiel F; Soer, Remko
2017-10-01
Validation study with cross-sectional and longitudinal measurements. To translate the US National Institutes of Health (NIH)-minimal dataset for clinical research on chronic low back pain into the Dutch language and to test its validity and reliability among people with chronic low back pain. The NIH developed a minimal dataset to encourage more complete and consistent reporting of clinical research and to be able to compare studies across countries in patients with low back pain. In the Netherlands, the NIH-minimal dataset has not been translated before and measurement properties are unknown. Cross-cultural validity was tested by a formal forward-backward translation. Structural validity was tested with exploratory factor analyses (comparative fit index, Tucker-Lewis index, and root mean square error of approximation). Hypothesis testing was performed to compare subscales of the NIH dataset with the Pain Disability Index and the EuroQol-5D (Pearson correlation coefficients). Internal consistency was tested with Cronbach α and test-retest reliability at 2 weeks was calculated in a subsample of patients with Intraclass Correlation Coefficients and weighted Kappa (κω). In total, 452 patients were included, of which 52 were included for the test-retest study. Factor analysis for structural validity pointed in the direction of a seven-factor model (Cronbach α = 0.78). Factors and total score of the NIH-minimal dataset showed fair to good correlations with Pain Disability Index (r = 0.43-0.70) and EuroQol-5D (r = -0.41 to -0.64). Reliability: test-retest reliability per item showed substantial agreement (κω=0.65). Test-retest reliability per factor was moderate to good (Intraclass Correlation Coefficient = 0.71). The measurement properties of the Dutch language version of the NIH-minimal dataset were satisfactory.
Validation of a multiplex electrochemiluminescent immunoassay platform in human and mouse samples
Bastarache, J.A.; Koyama, T.; Wickersham, N.E; Ware, L.B.
2014-01-01
Despite the widespread use of multiplex immunoassays, there are very few scientific reports that test the accuracy and reliability of a platform prior to publication of experimental data. Our laboratory has previously demonstrated the need for new assay platform validation prior to use of biologic samples from large studies in order to optimize sample handling and assay performance. In this study, our goal was to test the accuracy and reproducibility of an electrochemiluminescent multiplex immunoassay platform (Meso Scale Discovery, MSD®) and compare this platform to validated, singleplex immunoassays (R&D Systems®) using actual study subject (human plasma and mouse bronchoalveolar lavage fluid (BALF) and plasma) samples. We found that the MSD platform performed well on intra- and inter-assay comparisons, spike and recovery and cross-platform comparisons. The mean intra-assay CV% and range for MSD was 3.49 (0.0-10.4) for IL-6 and 2.04 (0.1-7.9) for IL-8. The correlation between values for identical samples measured on both MSD and R&D was R=0.97 for both analytes. The mouse MSD assay had a broader range of CV% with means ranging from 9.5-28.5 depending on the analyte. The range of mean CV% was similar for singleplex ELISAs at 4.3-23.7 depending on the analyte. Regardless of species or sample type, CV% was more variable at lower protein concentrations. In conclusion, we validated a multiplex electrochemiluminescent assay system and found that it has superior test characteristics in human plasma compared to mouse BALF and plasma. Both human and mouse MSD assays compared favorably to well-validated singleplex ELISAs. PMID:24768796
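The CV% figures reported above follow the usual definition of the coefficient of variation: the standard deviation of replicate measurements divided by their mean, expressed as a percentage. A minimal sketch is shown below; the replicate values in the usage line are invented, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
from statistics import mean, stdev


def cv_percent(replicates):
    """Coefficient of variation (%) of replicate measurements of one sample.

    Uses the sample standard deviation (n - 1 denominator), an assumption here.
    """
    return 100.0 * stdev(replicates) / mean(replicates)


# Hypothetical triplicate readings of one sample, in arbitrary units.
example_cv = cv_percent([9.0, 10.0, 11.0])
```

Because the mean is the denominator, the same absolute scatter produces a larger CV% at low concentrations, consistent with the abstract's remark that CV% was more variable at lower protein concentrations.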
Errors in reporting on dissolution research: methodological and statistical implications.
Jasińska-Stroschein, Magdalena; Kurczewska, Urszula; Orszulak-Michalak, Daria
2017-02-01
In vitro dissolution testing provides useful information at clinical and preclinical stages of the drug development process. The study includes pharmaceutical papers on dissolution research published in Polish journals between 2010 and 2015. They were analyzed with regard to information provided by authors about chosen methods, performed validation, statistical reporting or assumptions used to properly compare release profiles considering the present guideline documents addressed to dissolution methodology and its validation. Of all the papers included in the study, 23.86% presented at least one set of validation parameters, 63.64% gave the results of the weight uniformity test, 55.68% content determination, 97.73% dissolution testing conditions, and 50% discussed a comparison of release profiles. The assumptions for methods used to compare dissolution profiles were discussed in 6.82% of papers. By means of example analyses, we demonstrate that the outcome can be influenced by the violation of several assumptions or selection of an improper method to compare dissolution profiles. A clearer description of the procedures would undoubtedly increase the quality of papers in this area.
ERIC Educational Resources Information Center
Wade, Ros; Corbett, Mark; Eastwood, Alison
2013-01-01
Assessing the quality of included studies is a vital step in undertaking a systematic review. The recently revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool (QUADAS-2), which is the only validated quality assessment tool for diagnostic accuracy studies, does not include specific criteria for assessing comparative studies. As…
[Development of a Japanese version of a short form of the Profile of Emotional Competence].
Nozaki, Yuki; Koyasu, Masuo
2015-06-01
Emotional competence refers to individual differences in the ability to appropriately identify, understand, express, regulate, and utilize one's own emotions and those of others. This study developed a Japanese version of a short form of the Profile of Emotional Competence, a measure that allows the comprehensive assessment of intra- and interpersonal emotional competence with fewer items, and investigated its reliability and validity. In Study 1, we selected items for a short version and compared it with the full scale in terms of scores, internal consistency, and validity. In Study 2, we examined the short form's test-retest reliability. Results supported the original two-factor model and the measure had adequate reliability and validity. We discuss the construct validity and practical applicability of the short form of the Profile of Emotional Competence.
Translation and validation of the German version of the Bournemouth Questionnaire for Neck Pain.
Soklic, Marina; Peterson, Cynthia; Humphreys, B Kim
2012-01-25
Clinical outcome measures are important tools to monitor patient improvement during treatment as well as to document changes for research purposes. The short-form Bournemouth questionnaire for neck pain patients (BQN) was developed from the biopsychosocial model and measures pain, disability, cognitive and affective domains. It has been shown to be a valid and reliable outcome measure in English, French and Dutch and more sensitive to change compared to other questionnaires. The purpose of this study was to translate and validate a German version of the Bournemouth questionnaire for neck pain patients. German translation and back translation into English of the BQN was done independently by four persons and overseen by an expert committee. Face validity of the German BQN was tested on 30 neck pain patients in a single chiropractic practice. Test-retest reliability was evaluated on 31 medical students and chiropractors before and after a lecture. The German BQN was then assessed on 102 first time neck pain patients at two chiropractic practices for internal consistency, external construct validity, external longitudinal construct validity and sensitivity to change compared to the German versions of the Neck Disability Index (NDI) and the Neck Pain and Disability Scale (NPAD). Face validity testing led to minor changes to the German BQN. The Intraclass Correlation Coefficient for the test-retest reliability was 0.99. The internal consistency was strong for all 7 items of the BQN with Cronbach α's of .79 and .80 for the pre and post-treatment total scores. External construct validity and external longitudinal construct validity using Pearson's correlation coefficient showed statistically significant correlations for all 7 scales of the BQN with the other questionnaires. The German BQN showed greater responsiveness compared to the other questionnaires for all scales.
The German BQN is a valid and reliable outcome measure that has been successfully translated and culturally adapted. It is shorter, easier to use, and more responsive to change than the NDI and NPAD.
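Note on the internal consistency statistic reported above: Cronbach's α can be computed from item-level scores as k/(k-1) × (1 − sum of item variances / variance of total scores). The following is a minimal sketch, assuming sample variances and a hypothetical score matrix; the function name and the data are illustrative and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical scores: 4 respondents x 3 items (not study data).
scores = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 5, 4]]
print(round(cronbach_alpha(scores), 3))
```

When every item ranks respondents identically, the statistic reaches 1; values near .80, as reported for the BQN total scores, are conventionally read as good internal consistency.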
Clark, Ross A; Paterson, Kade; Ritchie, Callan; Blundell, Simon; Bryant, Adam L
2011-03-01
Commercial timing light systems (CTLS) provide precise measurement of athletes' running velocity; however, they are often expensive and difficult to transport. In this study an inexpensive, wireless and portable timing light system was created using the infrared camera in Nintendo Wii hand controllers (NWHC). System creation with gold-standard validation. A Windows-based software program using NWHC to replicate a dual-beam timing gate was created. Firstly, data collected during 2m walking and running trials were validated against a 3D kinematic system. Secondly, data recorded during 5m running trials at various intensities from standing or flying starts were compared to a single beam CTLS and the independent and average scores of three handheld stopwatch (HS) operators. Intraclass correlation coefficient and Bland-Altman plots were used to assess validity. Absolute error quartiles and percentage of trials in absolute error threshold ranges were used to determine accuracy. The NWHC system was valid when compared against the 3D kinematic system (ICC=0.99, median absolute error (MAR)=2.95%). For the flying 5m trials the NWHC system possessed excellent validity and precision (ICC=0.97, MAR<3%) when compared with the CTLS. In contrast, the NWHC system and the HS values during standing start trials possessed only modest validity (ICC<0.75) and accuracy (MAR>8%). A NWHC timing light system is inexpensive, portable and valid for assessing running velocity. Errors in the 5m standing start trials may have been due to erroneous event detection by either the commercial or NWHC-based timing light systems. Copyright © 2010 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Validation of verbal autopsy methods using hospital medical records: a case study in Vietnam.
Tran, Hong Thi; Nguyen, Hoa Phuong; Walker, Sue M; Hill, Peter S; Rao, Chalapati
2018-05-18
Information on causes of death (COD) is crucial for measuring the health outcomes of populations and progress towards the Sustainable Development Goals. In many countries such as Vietnam where the civil registration and vital statistics (CRVS) system is dysfunctional, information on vital events will continue to rely on verbal autopsy (VA) methods. This study assesses the validity of VA methods used in Vietnam, and provides recommendations on methods for implementing VA validation studies in Vietnam. This validation study was conducted on a sample of 670 deaths from a recent VA study in Quang Ninh province. The study covered 116 cases from this sample, which met three inclusion criteria: a) the death occurred within 30 days of discharge after last hospitalisation, and b) medical records (MRs) for the deceased were available from respective hospitals, and c) the medical record mentioned that the patient was terminally ill at discharge. For each death, the underlying cause of death (UCOD) identified from MRs was compared to the UCOD from VA. The validity of VA diagnoses for major causes of death was measured using sensitivity, specificity and positive predictive value (PPV). The sensitivity of VA was at least 75% in identifying some leading CODs such as stroke, road traffic accidents and several site-specific cancers. However, sensitivity was less than 50% for other important causes including ischemic heart disease, chronic obstructive pulmonary diseases, and diabetes. Overall, there was 57% agreement between UCOD from VA and MR, which increased to 76% when multiple causes from VA were compared to UCOD from MR. Our findings suggest that VA is a valid method to ascertain UCOD in contexts such as Vietnam. 
Furthermore, within cultural contexts in which patients prefer to die at home instead of a healthcare facility, using the available MRs as the gold standard may be meaningful to the extent that recall bias from the interval between last hospital discharge and death can be minimized. Therefore, future studies should evaluate validity of MRs as a gold standard for VA studies in contexts similar to the Vietnamese context.
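Note on the validity measures used above: sensitivity, specificity and positive predictive value (PPV) all follow from a 2x2 table that cross-classifies the VA diagnosis against the medical-record reference for one cause of death. The sketch below assumes hypothetical counts chosen only for illustration; they are not the study's data, and the function name is an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, PPV) from 2x2 table counts.

    tp: VA and reference both assign the cause
    fp: VA assigns the cause, reference does not
    fn: reference assigns the cause, VA does not
    tn: neither assigns the cause
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    return sensitivity, specificity, ppv

# Hypothetical counts for a single cause of death (not from the study).
sens, spec, ppv = diagnostic_metrics(tp=15, fp=5, fn=5, tn=91)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} PPV={ppv:.2f}")
```

Because PPV depends on how common the cause is in the sample, it is usually reported alongside sensitivity and specificity rather than in place of them.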
Reliability and validity of the Safe Routes to school parent and student surveys
2011-01-01
Background The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Methods Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous day's travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. Results A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n = 112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62 - 0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31 - 0.76). Conclusions The in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school.
Parent survey design should be improved so that responses clearly indicate issues that influence parental decision making in regards to their children's mode of travel to school. PMID:21651794
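Note on the kappa statistics used above: Cohen's kappa corrects the observed proportion of agreement between two sets of categorical responses for the agreement expected by chance, kappa = (p_o − p_e) / (1 − p_e). The following is a minimal sketch of the unweighted statistic, assuming hypothetical parent and student travel-mode reports; the function name and data are illustrative only.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Note: undefined (division by zero) if both raters use a single category.
    return (observed - expected) / (1 - expected)

# Hypothetical travel-mode reports for six parent-student dyads.
parent = ["walk", "car", "car", "bus", "walk", "car"]
student = ["walk", "car", "bus", "bus", "walk", "car"]
print(cohen_kappa(parent, student))
```

Here five of six dyads agree (p_o = 5/6) and chance agreement is 1/3, so kappa is 0.75, which lies at the threshold the abstract reports for convergent validity.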
Roozenbeek, Bob; Lingsma, Hester F.; Lecky, Fiona E.; Lu, Juan; Weir, James; Butcher, Isabella; McHugh, Gillian S.; Murray, Gordon D.; Perel, Pablo; Maas, Andrew I.R.; Steyerberg, Ewout W.
2012-01-01
Objective The International Mission on Prognosis and Analysis of Clinical Trials (IMPACT) and Corticoid Randomisation After Significant Head injury (CRASH) prognostic models predict outcome after traumatic brain injury (TBI) but have not been compared in large datasets. The objective of this study is to validate externally and compare the IMPACT and CRASH prognostic models for prediction of outcome after moderate or severe TBI. Design External validation study. Patients We considered 5 new datasets with a total of 9036 patients, comprising three randomized trials and two observational series, containing prospectively collected individual TBI patient data. Measurements Outcomes were mortality and unfavourable outcome, based on the Glasgow Outcome Score (GOS) at six months after injury. To assess performance, we studied the discrimination of the models (by AUCs), and calibration (by comparison of the mean observed to predicted outcomes and calibration slopes). Main Results The highest discrimination was found in the TARN trauma registry (AUCs between 0.83 and 0.87), and the lowest discrimination in the Pharmos trial (AUCs between 0.65 and 0.71). Although differences in predictor effects between development and validation populations were found (calibration slopes varying between 0.58 and 1.53), the differences in discrimination were largely explained by differences in case-mix in the validation studies. Calibration was good: the fraction of observed outcomes generally agreed well with the mean predicted outcome. No meaningful differences were noted in performance between the IMPACT and CRASH models. More complex models discriminated slightly better than simpler variants. Conclusions Since both the IMPACT and the CRASH prognostic models show good generalizability to more recent data, they are valid instruments to quantify prognosis in TBI. PMID:22511138
Elvan-Taşpinar, Ayten; Uiterkamp, Leonore A; Sikkema, J Marko; Bots, Michiel L; Koomans, Hein A; Bruinse, Hein W; Franx, Arie
2003-11-01
Although a large variety of automated blood pressure devices are available, only some have been validated for use in clinical practice. The British Hypertension Society (BHS) recommends separate validation of automated devices in special subgroups, e.g. the elderly and pregnant women. The aim of this study was to compare the Finometer (FM) and the earlier validated SpaceLabs 90207 (SL) with standard auscultatory blood pressure measurements in normal, pre-eclamptic and hypertensive pregnancy, following the guidelines of the BHS and the Association for the Advancement of Medical Instrumentation (AAMI). The total study group consisted of 123 pregnant women, of whom 54 were normotensive, 31 pre-eclamptic and 38 hypertensive. Automated readings with the FM and SL were compared with auscultatory blood pressure measurements. Bland-Altman plots, BHS grades, mean pressure differences and 95% limits of agreement were used for analysis. Bland-Altman plots showed a wide scatter of the pressure differences between auscultatory and automated measurements. FM achieved BHS grades C/D, C/B, D/D and D/D in the total, normotensive, pre-eclamptic and hypertensive group, respectively. The AAMI criteria were only met for diastolic blood pressure in the normotensive group. For SL almost identical BHS grades and 95% limits of agreement as compared to our earlier study were found. The accuracy and precision of the Finometer are not sufficient for determination of absolute blood pressure levels in individual pregnant women. Our present findings on the SpaceLabs 90207 reconfirm our earlier results.
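Note on the agreement analysis used above: the mean pressure difference (bias) and the 95% limits of agreement in a Bland-Altman analysis are conventionally computed as the mean of the paired differences plus or minus 1.96 standard deviations. The sketch below assumes that convention and uses hypothetical paired readings in mmHg; the function name and the values are illustrative and do not come from the study.

```python
from statistics import mean, stdev

def bland_altman(reference, device):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are taken as device minus reference; the limits of
    agreement are bias +/- 1.96 * sample SD of the differences.
    """
    if len(reference) != len(device) or len(reference) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [d - r for r, d in zip(reference, device)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread

# Hypothetical systolic readings (mmHg): auscultatory vs automated device.
auscultatory = [120, 130, 110, 140, 125]
automated = [118, 135, 112, 138, 130]
bias, lower, upper = bland_altman(auscultatory, automated)
print(f"bias={bias:.1f} mmHg, 95% limits of agreement=({lower:.1f}, {upper:.1f})")
```

A small bias with wide limits, as described for the Finometer, means the device may agree on average while still being unreliable for an individual reading.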
Beutel, Manfred E; Brähler, Elmar; Wiltink, Jörg; Michal, Matthias; Klein, Eva M; Jünger, Claus; Wild, Philipp S; Münzel, Thomas; Blettner, Maria; Lackner, Karl; Nickels, Stefan; Tibubos, Ana N
2017-01-01
The aim of the study was the development and validation of the psychometric properties of a six-item bi-factorial instrument for the assessment of social support (emotional and tangible support) with a population-based sample. A cross-sectional data set of N = 15,010 participants enrolled in the Gutenberg Health Study (GHS) in 2007-2012 was divided into two sub-samples. The GHS is a population-based, prospective, observational single-center cohort study in the Rhein-Main-Region in western Mid-Germany. The first sub-sample was used for scale development by performing an exploratory factor analysis. In order to test construct validity, confirmatory factor analyses were run to compare the extracted bi-factorial model with the one-factor solution. Reliability of the scales was indicated by calculating internal consistency. External validity was tested by investigating demographic characteristics, health behavior, and distress using analysis of variance, Spearman and Pearson correlation analysis, and logistic regression analysis. Based on an exploratory factor analysis, a set of six items was extracted representing two independent factors. The two-factor structure of the Brief Social Support Scale (BS6) was confirmed by the results of the confirmatory factor analyses. Fit indices of the bi-factorial model were good and better compared to the one-factor solution. External validity was demonstrated for the BS6. The BS6 is a reliable and valid short scale that can be applied in social surveys due to its brevity to assess emotional and practical dimensions of social support.
Validation of Nutritional Risk Screening-2002 in a Hospitalized Adult Population.
Bolayir, Başak; Arik, Güneş; Yeşil, Yusuf; Kuyumcu, Mehmet Emin; Varan, Hacer Doğan; Kara, Özgür; Güngör, Anil Evrim; Yavuz, Burcu Balam; Cankurtaran, Mustafa; Halil, Meltem Gülhan
2018-03-30
Malnutrition in hospitalized patients is a serious problem and is associated with a number of adverse outcomes. The Nutritional Risk Screening-2002 (NRS-2002) tool was designed to identify patients at nutrition risk. The validity of NRS-2002 compared with detailed clinical assessment of nutrition status has not previously been studied in hospitalized Turkish adults. The aim of this study is to determine validity, sensitivity, and specificity of the Turkish version of NRS-2002 in a hospitalized adult population. A total of 271 consecutive hospitalized patients aged >18 years admitted to surgical and medical wards of a university hospital in Turkey were included in this single-center noninterventional validity study. Assessment by geriatricians was used as the reference method. Two geriatricians experienced in the field of malnutrition interpreted the patients' nutrition status after the evaluation of several parameters. Patients were divided into "at nutrition risk" and "not at nutrition risk" groups by geriatricians. Concordance between the 2 geriatricians' clinical assessments was analyzed by κ statistics. Excellent concordance was found; therefore, the first geriatrician's decisions were accepted as the gold standard. The correlation of nutrition status of the patients, determined with NRS-2002 and experienced geriatrician's decisions, was evaluated for the validity. NRS-2002 has a sensitivity of 88% and specificity of 92% when compared with professional assessment. The positive and negative predictive values were 87% and 92%, respectively. Test-retest agreement was excellent as represented by a κ coefficient of 0.956. NRS-2002 is a valid tool to assess malnutrition risk in Turkish hospitalized patients. © 2018 American Society for Parenteral and Enteral Nutrition.
Romero-Franco, Natalia; Jiménez-Reyes, Pedro; Montaño-Munuera, Juan A
2017-11-01
Lower limb isometric strength is a key parameter to monitor the training process or recognise muscle weakness and injury risk. However, valid and reliable methods to evaluate it often require high-cost tools. The aim of this study was to analyse the concurrent validity and reliability of a low-cost digital dynamometer for measuring isometric strength in lower limb. Eleven physically active and healthy participants performed maximal isometric strength for: flexion and extension of ankle, flexion and extension of knee, flexion, extension, adduction, abduction, internal and external rotation of hip. Data obtained by the digital dynamometer were compared with the isokinetic dynamometer to examine its concurrent validity. Data obtained by the digital dynamometer from 2 different evaluators and 2 different sessions were compared to examine its inter-rater and intra-rater reliability. Intra-class correlation (ICC) for validity was excellent in every movement (ICC > 0.9). Intra and inter-tester reliability was excellent for all the movements assessed (ICC > 0.75). The low-cost digital dynamometer demonstrated strong concurrent validity and excellent intra and inter-tester reliability for assessing isometric strength in the main lower limb movements.
Faurholt-Jepsen, Maria; Munkholm, Klaus; Frost, Mads; Bardram, Jakob E; Kessing, Lars Vedel
2016-01-15
Various paper-based mood charting instruments are used in the monitoring of symptoms in bipolar disorder. During recent years an increasing number of electronic self-monitoring tools have been developed. The objectives of this systematic review were 1) to evaluate the validity of electronic self-monitoring tools as a method of evaluating mood compared to clinical rating scales for depression and mania and 2) to investigate the effect of electronic self-monitoring tools on clinically relevant outcomes in bipolar disorder. A systematic review of the scientific literature, reported according to the Preferred Reporting items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines was conducted. MEDLINE, Embase, PsycINFO and The Cochrane Library were searched and supplemented by hand search of reference lists. Databases were searched for 1) studies on electronic self-monitoring tools in patients with bipolar disorder reporting on validity of electronically self-reported mood ratings compared to clinical rating scales for depression and mania and 2) randomized controlled trials (RCT) evaluating electronic mood self-monitoring tools in patients with bipolar disorder. A total of 13 published articles were included. Seven articles were RCTs and six were longitudinal studies. Electronic self-monitoring of mood was considered valid compared to clinical rating scales for depression in six out of six studies, and in two out of seven studies compared to clinical rating scales for mania. The included RCTs primarily investigated the effect of heterogeneous electronically delivered interventions; none of the RCTs investigated the sole effect of electronic mood self-monitoring tools. Methodological issues with risk of bias at different levels limited the evidence in the majority of studies. Electronic self-monitoring of mood in depression appears to be a valid measure of mood in contrast to self-monitoring of mood in mania. 
There are yet few studies on the effect of electronic self-monitoring of mood in bipolar disorder. The evidence of electronic self-monitoring is limited by methodological issues and by a lack of RCTs. Although the idea of electronic self-monitoring of mood seems appealing, studies using rigorous methodology investigating the beneficial as well as possible harmful effects of electronic self-monitoring are needed.
Sysko, Robyn; Glasofer, Deborah R.; Hildebrandt, Tom; Klimek, Patrycja; Mitchell, James E.; Berg, Kelly C.; Peterson, Carol B.; Wonderlich, Stephen A.; Walsh, B. Timothy
2016-01-01
Objective Existing measures for DSM-IV eating disorder diagnoses have notable limitations, and there are important differences between DSM-IV and DSM-5 feeding and eating disorders. This study developed and validated a new semi-structured interview, the Eating Disorders Assessment for DSM-5 (EDA-5). Method Two studies evaluated the utility of the EDA-5. Study 1 compared the diagnostic validity of the EDA-5 to the Eating Disorder Examination (EDE) and evaluated the test-retest reliability of the new measure. Study 2 compared the diagnostic validity of an EDA-5 electronic application (“app”) to clinician interview and self-report assessments. Results In Study 1, the kappa for EDE and EDA-5 eating disorder diagnoses was 0.74 across all diagnoses (n= 64), with a range of κ=0.65 for Other Specified Feeding or Eating Disorder (OSFED)/Unspecified Feeding or Eating Disorder (USFED) to κ=0.90 for Binge Eating Disorder (BED). The EDA-5 test-retest kappa coefficient was 0.87 across diagnoses. For Study 2, clinical interview versus “app” conditions revealed a kappa of 0.83 for all eating disorder diagnoses (n=71). Across individual diagnostic categories, kappas ranged from 0.56 for OSFED/USFED to 0.94 for BN. Discussion High rates of agreement were found between diagnoses by EDA-5 and the EDE, and EDA-5 and clinical interviews. As this study supports the validity of the EDA-5 to generate DSM-5 eating disorders and the reliability of these diagnoses, the EDA-5 may be an option for the assessment of Anorexia Nervosa, Bulimia Nervosa, and BED. Additional research is needed to evaluate the utility of the EDA-5 in assessing DSM-5 feeding disorders. PMID:25639562
Białek, Michał; Markiewicz, Łukasz; Sawicki, Przemysław
2015-01-01
Delayed lotteries are much more common in everyday life than are pure lotteries. Usually, we need to wait to find out the outcome of a risky decision (e.g., investing in a stock market, engaging in a relationship). However, most research has studied time discounting and probability discounting in isolation, using methodologies designed specifically to track changes in one parameter. The most commonly used method is adjusting, but its reported validity and time stability in research on discounting are suboptimal. The goal of this study was to introduce a novel method for analyzing delayed lotteries, conjoint analysis, which hypothetically is more suitable for analyzing individual preferences in this area. A set of two studies compared conjoint analysis with adjusting. The results suggest that individual parameters of discounting strength estimated with conjoint analysis have higher predictive value (Study 1 and 2), and they are more stable over time (Study 2) compared to adjusting. We discuss these findings, despite the exploratory character of the reported studies, by suggesting that future research on delayed lotteries should be cross-validated using both methods. PMID:25674069
Enhancement of CFD validation exercise along the roof profile of a low-rise building
NASA Astrophysics Data System (ADS)
Deraman, S. N. C.; Majid, T. A.; Zaini, S. S.; Yahya, W. N. W.; Abdullah, J.; Ismail, M. A.
2018-04-01
The aim of this study is to enhance the validation of CFD exercise along the roof profile of a low-rise building. An isolated gabled-roof house having 26.6° roof pitch was simulated to obtain the pressure coefficient around the house. Validation of CFD analysis with experimental data requires many input parameters. This study performed CFD simulation based on the data from a previous study. Where the input parameters were not clearly stated, new input parameters were established from the open literature. The numerical simulations were performed in FLUENT 14.0 by applying the Computational Fluid Dynamics (CFD) approach based on the steady RANS equations together with the RNG k-ɛ model. Hence, the CFD results were analysed using a quantitative test (statistical analysis) and compared with CFD results from the previous study. The statistical analysis results from the ANOVA test and error measure showed that the CFD results from the current study produced good agreement and exhibited the closest error compared to the previous study. All the input data used in this study can be extended to other types of CFD simulation involving wind flow over an isolated single storey house.
A semi-automatic method for left ventricle volume estimate: an in vivo validation study
NASA Technical Reports Server (NTRS)
Corsi, C.; Lamberti, C.; Sarti, A.; Saracino, G.; Shiota, T.; Thomas, J. D.
2001-01-01
This study aims to validate the left ventricular (LV) volume estimates obtained by processing volumetric data with a segmentation model based on the level set technique. The validation has been performed by comparing real-time volumetric echo data (RT3DE) and magnetic resonance (MRI) data. A validation protocol has been defined. The validation protocol was applied to twenty-four estimates (range 61-467 ml) obtained from normal and pathologic subjects, who underwent both RT3DE and MRI. A statistical analysis was performed on each estimate and on clinical parameters such as stroke volume (SV) and ejection fraction (EF). Assuming MRI estimates (x) as a reference, an excellent correlation was found with volume measured by utilizing the segmentation procedure (y) (y=0.89x + 13.78, r=0.98). The mean error on SV was 8 ml and the mean error on EF was 2%. This study demonstrated that the segmentation technique can be applied reliably to human hearts in clinical practice.
Mills, Sarah D; Kwakkenbos, Linda; Carrier, Marie-Eve; Gholizadeh, Shadi; Fox, Rina S; Jewett, Lisa R; Gottesman, Karen; Roesch, Scott C; Thombs, Brett D; Malcarne, Vanessa L
2018-01-17
Systemic sclerosis (SSc) is an autoimmune disease that can cause disfiguring changes in appearance. This study examined the structural validity, internal consistency reliability, convergent validity, and measurement equivalence of the Social Appearance Anxiety Scale (SAAS) across SSc disease subtypes. Patients enrolled in the Scleroderma Patient-centered Intervention Network Cohort completed the SAAS and measures of appearance-related concerns and psychological distress. Confirmatory factor analysis (CFA) was used to examine the structural validity of the SAAS. Multiple-group CFA was used to determine if SAAS scores can be compared across patients with limited and diffuse disease subtypes. Cronbach's alpha was used to examine internal consistency reliability. Correlations of SAAS scores with measures of body image dissatisfaction, fear of negative evaluation, social anxiety, and depression were used to examine convergent validity. SAAS scores were hypothesized to be positively associated with all convergent validity measures, with correlations significant and moderate to large in size. A total of 938 patients with SSc were included. CFA supported a one-factor structure (CFI: .92; SRMR: .04; RMSEA: .08), and multiple-group CFA indicated that the scalar invariance model best fit the data. Internal consistency reliability was good in the total sample (α = .96) and in disease subgroups. Overall, evidence of convergent validity was found with measures of body image dissatisfaction, fear of negative evaluation, social anxiety, and depression. The SAAS can be reliably and validly used to assess fear of appearance evaluation in patients with SSc, and SAAS scores can be meaningfully compared across disease subtypes. This article is protected by copyright. All rights reserved.
Van Lerbeirghe, J; Van Lerbeirghe, J; Van Schaeybroeck, P; Robijn, H; Rasschaert, R; Sys, J; Parlevliet, T; Hallaert, G; Van Wambeke, P; Depreitere, B
2018-01-01
The core outcome measures index (COMI) is a validated multidimensional instrument for assessing patient-reported outcome in patients with back problems. The aim of the present study is to translate the COMI into Dutch and validate it for use in native Dutch speakers with low back pain. The COMI was translated into Dutch following established guidelines and avoiding region-specific terminology. A total of 89 Dutch-speaking patients with low back pain were recruited from 8 centers, located in the Dutch-speaking part of Belgium. Patients completed a questionnaire booklet including the validated Dutch version of the Roland Morris disability questionnaire, EQ-5D, the WHOQoL-Bref, the Numeric Rating Scale (NRS) for pain, and the Dutch translation of the COMI. Two weeks later, patients completed the Dutch COMI translation again, with a transition scale assessing changes in their condition. The patterns of correlations between the individual COMI items and the validated reference questionnaires were comparable to those reported for other validated language versions of the COMI. The intraclass correlation for the COMI summary score was 0.90 (95% CI 0.84-0.94). It was 0.75 and 0.70 for the back and leg pain score, respectively. The minimum detectable change for the COMI summary score was 1.74. No significant differences were observed between repeated scores of individual COMI items or for the summary score. The reproducibility of the Dutch translation of the COMI is comparable to that of other validated spine outcome measures. The COMI items correlate well with the established item-specific scores. The Dutch translation of the COMI, validated by this work, is a reliable and valuable tool for spine centers treating Dutch-speaking patients and can be used in registries and outcome studies.
Yousuf, Naveed; Violato, Claudio; Zuberi, Rukhsana W
2015-01-01
CONSTRUCT: Authentic standard setting methods will demonstrate high convergent validity evidence of their outcomes, that is, cutoff scores and pass/fail decisions, with most other methods when compared with each other. The objective structured clinical examination (OSCE) was established for valid, reliable, and objective assessment of clinical skills in health professions education. Various standard setting methods have been proposed to identify objective, reliable, and valid cutoff scores on OSCEs. These methods may identify different cutoff scores for the same examinations. Identification of valid and reliable cutoff scores for OSCEs remains an important issue and a challenge. Thirty OSCE stations administered at least twice in the years 2010-2012 to 393 medical students in Years 2 and 3 at Aga Khan University are included. Psychometric properties of the scores are determined. Cutoff scores and pass/fail decisions of Wijnen, Cohen, Mean-1.5SD, Mean-1SD, Angoff, borderline group and borderline regression (BL-R) methods are compared with each other and with three variants of cluster analysis using repeated measures analysis of variance and Cohen's kappa. The mean psychometric indices on the 30 OSCE stations are reliability coefficient = 0.76 (SD = 0.12); standard error of measurement = 5.66 (SD = 1.38); coefficient of determination = 0.47 (SD = 0.19), and intergrade discrimination = 7.19 (SD = 1.89). BL-R and Wijnen methods show the highest convergent validity evidence among other methods on the defined criteria. Angoff and Mean-1.5SD demonstrated least convergent validity evidence. The three cluster variants showed substantial convergent validity with borderline methods. Although there was a high level of convergent validity of Wijnen method, it lacks the theoretical strength to be used for competency-based assessments. 
The BL-R method is found to show the highest convergent validity evidences for OSCEs with other standard setting methods used in the present study. We also found that cluster analysis using mean method can be used for quality assurance of borderline methods. These findings should be further confirmed by studies in other settings.
NASA Astrophysics Data System (ADS)
Sawtelle, Vashti; Brewe, Eric; Kramer, Laird
2009-12-01
The Colorado Learning Attitudes about Science Survey (CLASS) has been widely acknowledged as a useful measure of student cognitive attitudes about science and learning. The initial University of Colorado validation study included only 20% non-Caucasian student populations. In this Brief Report we extend their validation to include a predominately under-represented minority population. We validated the CLASS instrument at Florida International University, a Hispanic-serving institution, by interviewing students in introductory physics classes using a semistructured protocol, examining students’ responses on the CLASS item statements, and comparing them to the items’ intended meaning. We find that in our predominately Hispanic population, 94% of the students’ interview responses indicate that the students interpret the CLASS items correctly, and thus the CLASS is a valid instrument. We also identify one potentially problematic item in the instrument which one third of the students interviewed consistently misinterpreted.
Pournik, Omid; Ghalichi, Leila; TehraniYazdi, Alireza; Tabatabaee, Seyed Mohammad; Ghaffari, Mostafa; Vingard, Eva
2015-01-01
Background: The effect of psychosocial work environment on personal and organizational aspects of employees is well-known; and it is of fundamental importance to have valid tools to evaluate them. This study aims to evaluate the reliability and validity of the Persian version of Copenhagen Psychosocial Questionnaire (COPSOQ). Methods: The questionnaire was translated into Persian and then back translated into English by two translators separately. The wording of the final Persian version was established by comparing the translated versions with the original questionnaire. One hundred three health care workers completed the questionnaire. Cronbach’s alpha was calculated, and factor analysis was performed. Results: Factor analysis revealed acceptable validity for the five contexts of the questionnaire. Cronbach’s alpha ranged from 0.73 to 0.82 in different contexts. Conclusion: This study revealed that the Persian version of COPSOQ is a reliable and valid instrument for measuring psychosocial factors at work. PMID:26478879
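Cronbach's alpha, the internal-consistency statistic reported in this abstract, can be computed from an item-score matrix with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is illustrative only; the function name and the data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical answers of five respondents to a three-item scale.
    answers = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4], [3, 3, 4]]
    print(round(cronbach_alpha(answers), 3))
```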
Development and Initial Validation of the Multicultural Personality Inventory (MPI).
Ponterotto, Joseph G; Fietzer, Alexander W; Fingerhut, Esther C; Woerner, Scott; Stack, Lauren; Magaldi-Dopman, Danielle; Rust, Jonathan; Nakao, Gen; Tsai, Yu-Ting; Black, Natasha; Alba, Renaldo; Desai, Miraj; Frazier, Chantel; LaRue, Alyse; Liao, Pei-Wen
2014-01-01
Two studies summarize the development and initial validation of the Multicultural Personality Inventory (MPI). In Study 1, the 115-item prototype MPI was administered to 415 university students where exploratory factor analysis resulted in a 70-item, 7-factor model. In Study 2, the 70-item MPI and theoretically related companion instruments were administered to a multisite sample of 576 university students. Confirmatory factor analysis found the 7-factor structure to be a relatively good fit to the data (Comparative Fit Index = .954; root mean square error of approximation = .057), and MPI factors predicted variance in criterion variables above and beyond the variance accounted for by broad personality traits (i.e., Big Five). Study limitations and directions for further validation research are specified.
Three DIBELS Tasks vs. Three Informal Reading/Spelling Tasks: A Comparison of Predictive Validity
ERIC Educational Resources Information Center
Morris, Darrell; Trathen, Woodrow; Perney, Jan; Gill, Tom; Schlagal, Robert; Ward, Devery; Frye, Elizabeth M.
2017-01-01
Within a developmental framework, this study compared the predictive validity of three DIBELS tasks (phoneme segmentation fluency [PSF], nonsense word fluency [NWF], and oral reading fluency [ORF]) with that of three alternative tasks drawn from the field of reading (phonemic spelling [phSPEL], word recognition-timed [WR-t], and graded passage…
Developing a Brief Cross-Culturally Validated Screening Tool for Externalizing Disorders in Children
ERIC Educational Resources Information Center
Zwirs, Barbara W. C.; Burger, Huibert; Schulpen, Tom W. J.; Buitelaar, Jan K.
2008-01-01
The study aims at developing and validating a brief, easy-to-use screening instrument for teachers to predict externalizing disorders in children and recommending them for timely referral. The scores are compared between Dutch and non-Dutch immigrant children and a significant amount of cases for externalizing disorders were identified but sex and…
Reliability and Validity of the Spanish Adaptation of EOSS, Comparing Normal and Clinical Samples
ERIC Educational Resources Information Center
Valero-Aguayo, Luis; Ferro-Garcia, Rafael; Lopez-Bermudez, Miguel Angel; de Huralde, Ma. Angeles Selva-Lopez
2012-01-01
The Experiencing of Self Scale (EOSS) was created for the evaluation of Functional Analytic Psychotherapy (Kohlenberg & Tsai, 1991, 2001, 2008) in relation to the concept of the experience of personal self as socially and verbally constructed. This paper presents a reliability and validity study of the EOSS with a Spanish sample (582…
COCOA: A New Validated Instrument to Assess Medical Students' Attitudes towards Older Adults
ERIC Educational Resources Information Center
Hollar, David; Roberts, Ellen; Busby-Whitehead, Jan
2011-01-01
This study tested the reliability and validity of the Carolina Opinions on Care of Older Adults (COCOA) survey compared with the Geriatric Assessment Survey (GAS). Participants were first year medical students (n = 160). A Linear Structural Relations (LISREL) measurement model for COCOA had a moderately strong fit that was significantly better…
The Validity of the Child and Adolescent Needs and Strengths Assessment
ERIC Educational Resources Information Center
Dilley, Joseph B.; Weiner, Dana A.; Lyons, John S.; Martinovich, Zoran
2007-01-01
The Child and Adolescent Needs and Strengths (CANS) is a functional assessment used in approximately 27 states to evaluate youth service outcomes. The CANS purports to measure both the youth's risk and protective factors, but its validity is largely un-researched. This study compares ratings of 304 delinquent youth on the CANS and ratings on a…
The Content Validity of Juvenile Psychopathy: An Empirical Examination
ERIC Educational Resources Information Center
Lynam, Donald R.; Derefinko, Karen J.; Caspi, Avshalom; Loeber, Rolf; Stouthamer-Loeber, Magda
2007-01-01
This study examined the content validity of a juvenile psychopathy measure, the Childhood Psychopathy Scale (CPS; D. R. Lynam, 1997), based on a downward translation of an adult instrument, the Hare Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991). The CPS was compared with two other indices of juvenile psychopathy: (a) an index derived…
ERIC Educational Resources Information Center
George-Ezzelle, Carol E.; Skaggs, Gary
2004-01-01
Current testing standards call for test developers to provide evidence that testing procedures and test scores, and the inferences made based on the test scores, show evidence of validity and are comparable across subpopulations (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on…
ERIC Educational Resources Information Center
Räisänen, Milla; Tuononen, Tarja; Postareff, Liisa; Hailikari, Telle; Virtanen, Viivi
2016-01-01
This case study explores the assessment of students' learning outcomes in a second-year lecture course in biosciences. The aim is to deeply explore the teacher's and the students' experiences of the validity and reliability of assessment and to compare those perspectives. The data were collected through stimulated recall interviews. The results…
Assessing the Validity of a Single-Item HIV Risk Stage-of-Change Measure
ERIC Educational Resources Information Center
Napper, Lucy E.; Branson, Catherine M.; Fisher, Dennis G.; Reynolds, Grace L.; Wood, Michelle M.
2008-01-01
This study examined the validity of a single-item measure of HIV risk stage of change that HIV prevention contractors were required to collect by the California State Office of AIDS. The single-item measure was compared to the more conventional University of Rhode Island Change Assessment (URICA). Participants were members of Los Angeles…
ERIC Educational Resources Information Center
Kramer, Gene A.; Johnston, JoElle
1997-01-01
A study examined the relationship between Optometry Admission Test scores and pre-optometry or undergraduate grade point average (GPA) with first and second year performance in optometry schools. The test's predictive validity was limited but significant, and comparable to those reported for other admission tests. In addition, the scores…
ERIC Educational Resources Information Center
Hager, Erin R.; Treuth, Margarita S.; Gormely, Candice; Epps, LaShawna; Snitker, Soren; Black, Maureen M.
2015-01-01
Purpose: Ankle accelerometry allows for 24-hr data collection and improves data volume/integrity versus hip accelerometry. Using Actical ankle accelerometry, the purpose of this study was to (a) develop sensitive/specific thresholds, (b) examine validity/reliability, (c) compare new thresholds with those of the manufacturer, and (d) examine…
Assessment of beverage intake and hydration status.
Nissensohn, Mariela; López-Ufano, Marisa; Castro-Quezada, Itandehui; Serra-Majem, Lluis
2015-02-26
Water is the main constituent of the human body. It is involved in practically all its functions. It is particularly important for thermoregulation and for physical and cognitive performance. Water balance reflects water intake and loss. Water intake comes mainly from consumption of drinking water and beverages (70 to 80%) plus water-containing foods (20 to 30%). Water loss is mainly due to excretion of water in urine, faeces and sweat. The interest in the type and quantity of beverage consumption is not new, and numerous approaches have been used to assess beverage intake, but the validity of these approaches has not been well established. There is no standardized questionnaire developed as a research tool for the evaluation of water intake in the general population. Sometimes, the information comes from different sources or from studies with different methodological characteristics, which raises problems of comparability. In the European Union, current epidemiological studies that focus exclusively on beverage intake are scarce. Biomarkers of intake are able to objectively assess dietary intake/status without the bias of self-reported dietary intake errors and also overcome the problem of intra-individual diet variability. Furthermore, some methods of measuring dietary intake use biomarkers to validate the data they collect. Biological markers may offer advantages and be able to improve the estimates of dietary intake assessment, which affects the statistical power of the study. There is a surprising paucity of studies that systematically examine the correlation between beverage intake and hydration biomarkers in different populations. A pilot investigation was developed to evaluate the comparative validity and reliability of newly developed interactive multimedia (IMM) versions compared to validated paper-administered (PP) versions of the Hedrick et al. beverage questionnaire.
The study showed that the IMM appears to be a valid and reliable measure to assess habitual beverage intake. A similar study was conducted in China, but in this case smartphone technology was employed for beverage assessment. The methodology for measuring beverage intake in population studies remains controversial. There are few validated and reproducible studies, so an ideal method (i.e., short, easy to administer, inexpensive and accurate) is still lacking. Clearly, this is an area of scientific interest that is still in development and seems to be very promising for improving health research. Copyright AULA MEDICA EDICIONES 2015. Published by AULA MEDICA. All rights reserved.
Shirasaki, Osamu; Asou, Yosuke; Takahashi, Yukio
2007-12-01
Owing to fast or stepwise cuff deflation, or measuring at places other than the upper arm, the clinical accuracy of most recent automated sphygmomanometers (auto-BPMs) cannot be validated by one-arm simultaneous comparison, which would be the only accurate validation method based on auscultation. Two main alternative methods are provided by current standards, that is, two-arm simultaneous comparison (method 1) and one-arm sequential comparison (method 2); however, the accuracy of these validation methods might not be sufficient to compensate for the suspicious accuracy in lateral blood pressure (BP) differences (LD) and/or BP variations (BPV) between the device and reference readings. Thus, the Japan ISO-WG for sphygmomanometer standards has been studying a new method that might improve validation accuracy (method 3). The purpose of this study is to determine the appropriateness of method 3 by comparing immunity to LD and BPV with those of the current validation methods (methods 1 and 2). The validation accuracy of the above three methods was assessed in human participants [N=120, 45±15.3 years (mean±SD)]. An oscillometric automated monitor, Omron HEM-762, was used as the tested device. When compared with the others, methods 1 and 3 showed a smaller intra-individual standard deviation of device error (SD1), suggesting their higher reproducibility of validation. The SD1 by method 2 (P=0.004) significantly correlated with the participant's BP, supporting our hypothesis that the increased SD of device error by method 2 is at least partially caused by essential BPV. Method 3 showed a significantly (P=0.0044) smaller interparticipant SD of device error (SD2), suggesting its higher interparticipant consistency of validation. Among the methods of validation of the clinical accuracy of auto-BPMs, method 3, which showed the highest reproducibility and highest interparticipant consistency, can be proposed as being the most appropriate.
The JaCVAM international validation study on the in vivo comet assay: Selection of test chemicals.
Morita, Takeshi; Uno, Yoshifumi; Honma, Masamitsu; Kojima, Hajime; Hayashi, Makoto; Tice, Raymond R; Corvi, Raffaella; Schechtman, Leonard
2015-07-01
The Japanese Center for the Validation of Alternative Methods (JaCVAM) sponsored an international prevalidation and validation study of the in vivo rat alkaline pH comet assay. The main objective of the study was to assess the sensitivity and specificity of the assay for correctly identifying genotoxic carcinogens, as compared with the traditional rat liver unscheduled DNA synthesis assay. Based on existing carcinogenicity and genotoxicity data and chemical class information, 90 chemicals were identified as primary candidates for use in the validation study. From these 90 chemicals, 46 secondary candidates and then 40 final chemicals were selected based on a sufficiency of carcinogenic and genotoxic data, differences in chemical class or genotoxic or carcinogenic mode of action (MOA), availability, price, and ease of handling. These 40 chemicals included 19 genotoxic carcinogens, 6 genotoxic non-carcinogens, 7 non-genotoxic carcinogens and 8 non-genotoxic non-carcinogens. "Genotoxicity" was defined as positive in the Ames mutagenicity test or in one of the standard in vivo genotoxicity tests (primarily the erythrocyte micronucleus assay). These chemicals covered various chemical classes, MOAs, and genotoxicity profiles and were considered to be suitable for the purpose of the validation study. General principles of chemical selection for validation studies are discussed. Copyright © 2015 Elsevier B.V. All rights reserved.
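Sensitivity and specificity, the two quantities this validation study set out to assess, are simple proportions over a reference classification. The sketch below assumes boolean labels (True meaning "genotoxic carcinogen" in the reference data and "assay positive" in the test result); the function name and the example labels are hypothetical and do not reproduce any result of the study.

```python
def sensitivity_specificity(reference, assay_result):
    """Return (sensitivity, specificity) for paired boolean labels.

    reference[i]    -- True if chemical i is a true positive per reference data
    assay_result[i] -- True if the assay called chemical i positive
    """
    if len(reference) != len(assay_result):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(reference, assay_result))
    true_pos = sum(1 for ref, res in pairs if ref and res)
    false_neg = sum(1 for ref, res in pairs if ref and not res)
    true_neg = sum(1 for ref, res in pairs if not ref and not res)
    false_pos = sum(1 for ref, res in pairs if not ref and res)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one reference positive and one negative")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical labels for five chemicals, for illustration only.
    reference = [True, True, True, False, False]
    assay = [True, True, False, False, True]
    print(sensitivity_specificity(reference, assay))
```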
Statistical Modeling of Natural Backgrounds in Hyperspectral LWIR Data
2016-09-06
extremely important for studying performance trades. First, we study the validity of this model using real hyperspectral data, and compare the relative...difficult to validate any statistical model created for a target of interest. However, since background measurements are plentiful, it is reasonable to...Golden, S., Less, D., Jin, X., and Rynes, P., “ Modeling and analysis of LWIR signature variability associated with 3d and BRDF effects,” 98400P (May 2016
Discriminant validity study of Achilles enthesis ultrasound.
Expósito Molinero, María Rosa; de Miguel Mendieta, Eugenio
2016-01-01
We aimed to determine whether ultrasound findings at the Achilles tendon in spondyloarthritis differ from those in other rheumatic diseases. We studied 97 patients divided into five groups (rheumatoid arthritis, spondyloarthritis, gout, chondrocalcinosis and osteoarthritis), exploring six elementary lesions in the 194 Achilles entheses examined. In our study, the total Achilles ultrasonographic index was significantly higher in spondyloarthritis. The elementary lesion that discriminated worst between spondyloarthritis and the other pathologies was calcification. This study aims to demonstrate the discriminant validity of Achilles enthesitis observed by ultrasound in spondyloarthritis compared with other rheumatic diseases that may also show ultrasound abnormalities at the enthesis level. Copyright © 2015 Elsevier España, S.L.U. and Sociedad Española de Reumatología y Colegio Mexicano de Reumatología. All rights reserved.
Coster, Wendy J; Haley, Stephen M; Ni, Pengsheng; Dumas, Helene M; Fragala-Pinkham, Maria A
2008-04-01
To examine score agreement, validity, precision, and response burden of a prototype computer adaptive testing (CAT) version of the self-care and social function scales of the Pediatric Evaluation of Disability Inventory compared with the full-length version of these scales. Computer simulation analysis of cross-sectional and longitudinal retrospective data; cross-sectional prospective study. Pediatric rehabilitation hospital, including inpatient acute rehabilitation, day school program, outpatient clinics; community-based day care, preschool, and children's homes. Children with disabilities (n=469) and 412 children with no disabilities (analytic sample); 38 children with disabilities and 35 children without disabilities (cross-validation sample). Not applicable. Summary scores from prototype CAT applications of each scale using 15-, 10-, and 5-item stopping rules; scores from the full-length self-care and social function scales; time (in seconds) to complete assessments and respondent ratings of burden. Scores from both computer simulations and field administration of the prototype CATs were highly consistent with scores from full-length administration (r range, .94-.99). Using computer simulation of retrospective data, discriminant validity, and sensitivity to change of the CATs closely approximated that of the full-length scales, especially when the 15- and 10-item stopping rules were applied. In the cross-validation study the time to administer both CATs was 4 minutes, compared with over 16 minutes to complete the full-length scales. Self-care and social function score estimates from CAT administration are highly comparable with those obtained from full-length scale administration, with small losses in validity and precision and substantial decreases in administration time.
Gergana, Kodjebacheva; Coleman, Anne L.; Ensrud, Kristine E.; Cauley, Jane A.; Yu, Fei; Stone, Katie L.; Pedula, Kathryn L.; Hochberg, Marc C.; Mangione, Carol M.
2010-01-01
Purpose To test the reliability and validity of questionnaires shortened from the National Eye Institute 25-item Vision Function Questionnaire (NEI VFQ-9 and NEI VFQ-8). Design A cross-sectional multi-center cohort study. Methods Reliability was assessed by Cronbach alpha coefficients. Validity was evaluated by studying the association of vision-targeted quality-of-life composite scores with objective visual function measurements. Study population: A total of 5,482 women between the ages of 65 and 100 years participated in the Year-10 clinic visit in the Study of Osteoporotic Fractures (SOF). A total of 3,631 women with complete data were included in the visual acuity (VA) and visual field (VF) analysis of the NEI VFQ-9, which is defined for those who care to drive, and 5,311 in the analysis of the NEI VFQ-8. To assess differences in prevalent eye diseases, which were ascertained for a random sample of SOF participants, 853 and 1,237 women were included in the NEI VFQ-9 and the NEI VFQ-8 analyses, respectively. Results Cronbach alpha coefficient for the NEI VFQ-9 scale was 0.83 and that of the NEI VFQ-8 was 0.84. Using both questionnaires, women with VA worse than 20/40 had lower composite scores compared to those with VA 20/40 or better (p<0.001). Participants with mild, moderate, and severe binocular VF loss had lower composite scores compared to those with no binocular VF loss (p<0.001). Compared to women without chronic eye diseases in both eyes, women with at least one chronic eye disease in at least one eye had lower composite scores. Conclusions Both questionnaires showed high reliability across items and validity with respect to clinical markers of eye disease. Future research should compare the properties of these shortened surveys to those of the NEI VFQ-25. PMID:20103058
Bham, Ghulam H; Leu, Ming C; Vallati, Manoj; Mathur, Durga R
2014-06-01
This study is aimed at validating a driving simulator (DS) for the study of driver behavior in work zones. A validation study requires field data collection. For studies conducted in highway work zones, the availability of safe vantage points for data collection at critical locations can be a significant challenge. A validation framework is therefore proposed in this paper, demonstrated using a fixed-based DS that addresses the issue by using a global positioning system (GPS). The validation of the DS was conducted using objective and subjective evaluations. The objective validation was divided into qualitative and quantitative evaluations. The DS was validated by comparing the results of simulation with the field data, which were collected using a GPS along the highway and video recordings at specific locations in a work zone. The constructed work zone scenario in the DS was subjectively evaluated with 46 participants. The objective evaluation established the absolute and relative validity of the DS. The mean speeds from the DS data showed excellent agreement with the field data. The subjective evaluation indicated realistic driving experience by the participants. The use of GPS showed that continuous data collected along the highway can overcome the challenges of unavailability of safe vantage points especially at critical locations. Further, a validated DS can be used for examining driver behavior in complex situations by replicating realistic scenarios. Copyright © 2014 Elsevier Ltd. All rights reserved.
Pulmonary function tests as outcomes for systemic sclerosis interstitial lung disease.
Caron, Melissa; Hoa, Sabrina; Hudson, Marie; Schwartzman, Kevin; Steele, Russell
2018-06-30
Interstitial lung disease (ILD) is the leading cause of morbidity and mortality in systemic sclerosis (SSc). We performed a systematic review to characterise the use and validation of pulmonary function tests (PFTs) as surrogate markers for systemic sclerosis-associated interstitial lung disease (SSc-ILD) progression. Five electronic databases were searched to identify all relevant studies. Included studies either used at least one PFT measure as a longitudinal outcome for SSc-ILD progression (i.e. outcome studies) and/or reported at least one classical measure of validity for the PFTs in SSc-ILD (i.e. validation studies). This systematic review included 169 outcome studies and 50 validation studies. Diffusing capacity of the lung for carbon monoxide (DLCO) was cumulatively the most commonly used outcome until 2010 when it was surpassed by forced vital capacity (FVC). FVC (% predicted) was the primary endpoint in 70.4% of studies, compared to 11.3% for % predicted DLCO. Only five studies specifically aimed to validate the PFTs: two concluded that DLCO was the best measure of SSc-ILD extent, while the others did not favour any PFT. These studies also showed respectable validity measures for total lung capacity (TLC). Despite the current preference for FVC, available evidence suggests that DLCO and TLC should not yet be discounted as potential surrogate markers for SSc-ILD progression. Copyright ©ERS 2018.
Getts, Katherine M; Quinn, Emilee L; Johnson, Donna B; Otten, Jennifer J
2017-11-01
Measuring food waste (ie, plate waste) in school cafeterias is an important tool to evaluate the effectiveness of school nutrition policies and interventions aimed at increasing consumption of healthier meals. Visual assessment methods are frequently applied in plate waste studies because they are more convenient than weighing. The visual quarter-waste method has become a common tool in studies of school meal waste and consumption, but previous studies of its validity and reliability have used correlation coefficients, which measure association but not necessarily agreement. The aims of this study were to determine, using a statistic measuring interrater agreement, whether the visual quarter-waste method is valid and reliable for assessing food waste in a school cafeteria setting when compared with the gold standard of weighed plate waste. To evaluate validity, researchers used the visual quarter-waste method and weighed food waste from 748 trays at four middle schools and five high schools in one school district in Washington State during May 2014. To assess interrater reliability, researcher pairs independently assessed 59 of the same trays using the visual quarter-waste method. Both validity and reliability were assessed using a weighted κ coefficient. For validity, as compared with the measured weight, 45% of foods assessed using the visual quarter-waste method were in almost perfect agreement, 42% of foods were in substantial agreement, 10% were in moderate agreement, and 3% were in slight agreement. For interrater reliability between pairs of visual assessors, 46% of foods were in perfect agreement, 31% were in almost perfect agreement, 15% were in substantial agreement, and 8% were in moderate agreement. These results suggest that the visual quarter-waste method is a valid and reliable tool for measuring plate waste in school cafeteria settings. Copyright © 2017 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
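The distinction this abstract draws between correlation and agreement can be made concrete with a weighted kappa, which penalizes disagreements between ordinal ratings in proportion to their distance. The sketch below is a generic linear or quadratic weighted kappa; the five-category coding (0 = none wasted through 4 = all wasted) is an assumption made for illustration, and the abstract does not state which weighting scheme the authors used.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories, weights="linear"):
    """Weighted kappa for two lists of ordinal ratings coded 0..n_categories-1."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and equally long")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    n = len(ratings_a)
    k = n_categories
    # Observed contingency table of rating pairs.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    observed_penalty = sum(
        disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    # Penalty expected by chance from the marginal totals.
    expected_penalty = sum(
        disagreement(i, j) * row_totals[i] * col_totals[j] / n
        for i in range(k)
        for j in range(k)
    )
    if expected_penalty == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - observed_penalty / expected_penalty


if __name__ == "__main__":
    # Hypothetical visual vs. weighed categories for six trays.
    visual = [0, 1, 2, 3, 4, 2]
    weighed = [0, 1, 2, 3, 4, 3]
    print(round(weighted_kappa(visual, weighed, n_categories=5), 3))
```

Unlike a correlation coefficient, this statistic drops when one rater is systematically one category higher than the other, which is why it is better suited to judging agreement.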
Foreman, K. Bo; Addison, Odessa; Kim, Han S.; Dibble, Leland E.
2010-01-01
Introduction Despite clear deficits in postural control, most clinical examination tools lack accuracy in identifying persons with Parkinson disease (PD) who have fallen or are at risk for falls. We assert that this is in part due to the lack of ecological validity of the testing. Methods To test this assertion, we examined the responsiveness and predictive validity of the Functional Gait Assessment (FGA), the Pull test, and the Timed up and Go (TUG) during clinically defined ON and OFF medication states. To address responsiveness, ON/OFF medication performance was compared. To address predictive validity, areas under the curve (AUC) of receiver operating characteristic (ROC) curves were compared. Comparisons were made using separate non-parametric tests. Results Thirty-six persons (24 male, 12 female) with PD (22 fallers, 14 non-fallers) participated. Only the FGA was able to detect differences between fallers and non-fallers for both ON/OFF medication testing. The predictive validity of the FGA and the TUG for fall identification was higher during OFF medication compared to ON medication testing. The predictive validity of the FGA was higher than the TUG and the Pull test during ON and OFF medication testing. Discussion In order to most accurately identify fallers, clinicians should test persons with PD in ecologically relevant conditions and tasks. In this study, interpretation of the OFF medication performance and use of the FGA provided more accurate prediction of those who would fall. PMID:21215674
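The area under the ROC curve used here to compare predictive validity equals the probability that a randomly chosen positive case (a faller) receives a higher score than a randomly chosen negative case (a non-faller), with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and data are hypothetical. It assumes that higher scores indicate the positive class, so for an instrument on which lower scores indicate greater fall risk the scores would need to be negated first.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Assumes higher scores point toward the positive class; ties count 0.5.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical timed-test results in seconds (slower = higher risk).
    fallers = [14.2, 11.8, 16.5, 12.9]
    non_fallers = [9.7, 12.1, 10.4]
    print(round(roc_auc(fallers, non_fallers), 3))
```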
Alphs, Larry; Morlock, Robert; Coon, Cheryl; Cazorla, Pilar; Szegedi, Armin; Panagides, John
2011-06-01
The 16-item Negative Symptom Assessment (NSA-16) scale is a validated tool for evaluating negative symptoms of schizophrenia. The psychometric properties and predictive power of a four-item version (NSA-4) were compared with the NSA-16. Baseline data from 561 patients with predominant negative symptoms of schizophrenia who participated in two identically designed clinical trials were evaluated. Ordered logistic regression analysis of ratings using NSA-4 and NSA-16 were compared with ratings using several other standard tools to determine predictive validity and construct validity. Internal consistency and test-retest reliability were also analyzed. NSA-16 and NSA-4 scores were both predictive of scores on the NSA global rating (odds ratio = 0.83-0.86) and the Clinical Global Impressions-Severity scale (odds ratio = 0.91-0.93). NSA-16 and NSA-4 showed high correlation with each other (Pearson r = 0.85), similar high correlation with other measures of negative symptoms (demonstrating convergent validity), and lesser correlations with measures of other forms of psychopathology (demonstrating divergent validity). NSA-16 and NSA-4 both showed acceptable internal consistency (Cronbach α, 0.85 and 0.64, respectively) and test-retest reliability (intraclass correlation coefficient, 0.87 and 0.82). This study demonstrates that NSA-4 offers accuracy comparable to the NSA-16 in rating negative symptoms in patients with schizophrenia. Copyright © 2011 John Wiley & Sons, Ltd.
Mitchell, Katy; Gutierrez, Simran Bakshi; Sutton, Stacy; Morton, Stephanie; Morgenthaler, Andrea
2014-10-01
The purpose of this study was to determine the reliability and validity of two smartphone applications: (1) GetMyROM - inclinometry-based and (2) DrGoniometry - photo-based in the measurement of active shoulder external rotation (ER) as compared to standard goniometry (SG). Ninety-four Texas Woman's University Doctor of Physical Therapy students from the School of Physical Therapy - Houston campus, were recruited to participate in this study. Two iPhone applications were compared to SG using both novice and experienced raters. Active shoulder ER range of motion was measured over two time periods in random order by blinded novice and experienced raters. Intra-rater reliability using novice raters for the two applications ranged from an intraclass correlation coefficient (ICC) of 0.79 to 0.81 with SG at 0.82. Inter-rater reliability (novice/expert) for the two applications ranged from an ICC of 0.92 to 0.94 with SG at 0.91. Concurrent validity (when compared to SG) ranged from 0.93 to 0.94. There were no significant differences between the novice and experienced raters. Both applications were found to be reliable and comparable to SG. A photo-based application potentially offers a superior method of measurement as visualizing the landmarks may be simplified in this format and it provides a record of measurement. Further study using patient populations may find the two studied applications are useful as an adjunct for clinical practice.
NASA Astrophysics Data System (ADS)
Nir, A.; Doughty, C.; Tsang, C. F.
Validation methods which developed in the context of deterministic concepts of past generations often cannot be directly applied to environmental problems, which may be characterized by limited reproducibility of results and highly complex models. Instead, validation is interpreted here as a series of activities, including both theoretical and experimental tests, designed to enhance our confidence in the capability of a proposed model to describe some aspect of reality. We examine the validation process applied to a project concerned with heat and fluid transport in porous media, in which mathematical modeling, simulation, and results of field experiments are evaluated in order to determine the feasibility of a system for seasonal thermal energy storage in shallow unsaturated soils. Technical details of the field experiments are not included, but appear in previous publications. Validation activities are divided into three stages. The first stage, carried out prior to the field experiments, is concerned with modeling the relevant physical processes, optimization of the heat-exchanger configuration and the shape of the storage volume, and multi-year simulation. Subjects requiring further theoretical and experimental study are identified at this stage. The second stage encompasses the planning and evaluation of the initial field experiment. Simulations are made to determine the experimental time scale and optimal sensor locations. Soil thermal parameters and temperature boundary conditions are estimated using an inverse method. Then results of the experiment are compared with model predictions using different parameter values and modeling approximations. In the third stage, results of an experiment performed under different boundary conditions are compared to predictions made by the models developed in the second stage. Various aspects of this theoretical and experimental field study are described as examples of the verification and validation procedure. 
There is no attempt to validate a specific model, but several models of increasing complexity are compared with experimental results. The outcome is interpreted as a demonstration of the paradigm proposed by van der Heijde [26] that different constituencies have different objectives for the validation process and therefore their acceptance criteria differ also.
The reliability and validity of ultrasound to quantify muscles in older adults: a systematic review
Scafoglieri, Aldo; Jager‐Wittenaar, Harriët; Hobbelen, Johannes S.M.; van der Schans, Cees P.
2017-01-01
Abstract This review evaluates the reliability and validity of ultrasound to quantify muscles in older adults. The databases PubMed, Cochrane, and Cumulative Index to Nursing and Allied Health Literature were systematically searched for studies. In 17 studies, the reliability (n = 13) and validity (n = 8) of ultrasound to quantify muscles in community‐dwelling older adults (≥60 years) or a clinical population were evaluated. Four out of 13 reliability studies investigated both intra‐rater and inter‐rater reliability. Intraclass correlation coefficient (ICC) scores for reliability ranged from −0.26 to 1.00. The highest ICC scores were found for the vastus lateralis, rectus femoris, upper arm anterior, and the trunk (ICC = 0.72 to 1.000). All included validity studies found ICC scores ranging from 0.92 to 0.999. Two studies describing the validity of ultrasound to predict lean body mass showed good validity as compared with dual‐energy X‐ray absorptiometry (r² = 0.92 to 0.96). This systematic review shows that ultrasound is a reliable and valid tool for the assessment of muscle size in older adults. More high‐quality research is required to confirm these findings in both clinical and healthy populations. Furthermore, ultrasound assessment of small muscles needs further evaluation. Ultrasound to predict lean body mass is feasible; however, future research is required to validate prediction equations in older adults with varying function and health. PMID:28703496
QUADAS and STARD: evaluating the quality of diagnostic accuracy studies.
Oliveira, Maria Regina Fernandes de; Gomes, Almério de Castro; Toscano, Cristiana Maria
2011-04-01
To compare the performance of two approaches, one based on the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) and another on the Standards for Reporting Studies of Diagnostic Accuracy (STARD), in evaluating the quality of studies validating the OptiMal® rapid malaria diagnostic test. Articles validating the rapid test published until 2007 were searched in the Medline/PubMed database. This search retrieved 13 articles. A combination of 12 QUADAS criteria and three STARD criteria were compared with the 12 QUADAS criteria alone. Articles that fulfilled at least 50% of QUADAS criteria were considered as regular to good quality. Of the 13 articles retrieved, 12 fulfilled at least 50% of QUADAS criteria, and only two fulfilled the STARD/QUADAS criteria combined. Considering the two criteria combination (> 6 QUADAS and > 3 STARD), two studies (15.4%) showed good methodological quality. The articles selection using the proposed combination resulted in two to eight articles, depending on the number of items assumed as cutoff point. The STARD/QUADAS combination has the potential to provide greater rigor when evaluating the quality of studies validating malaria diagnostic tests, given that it incorporates relevant information not contemplated in the QUADAS criteria alone.
Schiffman, Eric L.; Truelove, Edmond L.; Ohrbach, Richard; Anderson, Gary C.; John, Mike T.; List, Thomas; Look, John O.
2011-01-01
AIMS The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. An overview is presented, including Axis I and II methodology and descriptive statistics for the study participant sample. This paper details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. Validity testing for the Axis II biobehavioral instruments was based on previously validated reference standards. METHODS The Axis I reference standards were based on the consensus of 2 criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion exam reliability was also assessed within study sites. RESULTS Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion exam agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). CONCLUSION The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods. PMID:20213028
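The agreement figures in the abstract above are reported as kappa statistics. As a minimal sketch of how a chance-corrected agreement of this kind can be computed, the following implements Cohen's kappa for two raters. The abstract does not say which kappa variant was used, so the two-rater form and the function name `cohens_kappa` are assumptions made for illustration only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases (illustrative helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For example, `cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])` has an observed agreement of 0.75 and a chance agreement of 0.5, which gives a kappa of 0.5.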
Boezeman, Edwin J; Hofhuis, José G M; Hovingh, Aly; Cox, Christopher E; de Vries, Reinout E; Spronk, Peter E
2016-09-01
Adaptive coping strategies are associated with less psychological distress. However, there is no brief, specific, and validated instrument for assessing adaptive coping among seriously ill patients. Our objective was to examine the validity and patient-proxy agreement of a novel instrument, the Sickness Insight in Coping Questionnaire. A cross-sectional design which included two related studies. A single university-affiliated Dutch hospital. Hospitalized patients (study 1) and ICU-patients and proxies (study 2). None. Study 1 (n = 103 hospitalized patients) addressed the Sickness Insight in Coping Questionnaire's performance relative to questionnaires addressing similar content areas. Coping subscales of the BRIEF COPE, Illness Cognition Questionnaire, and Utrecht Coping List were used as comparator measures in testing the construct validity of the Sickness Insight in Coping Questionnaire-subscales (fighting spirit, toughness, redefinition, positivism, and non-acceptance). The Sickness Insight in Coping Questionnaire had good internal consistency (0.64 ≤ α ≤ 0.79), a clear initial factor structure, and fair convergent (0.24 ≤ r ≤ 0.50) and divergent (r, ≤ 0.12) construct validity. Study 2 examined the performance of the Sickness Insight in Coping Questionnaire among 100 ICU patients and their close family members. This study showed that the Sickness Insight in Coping Questionnaire has good structural validity (confirmatory factor analyses with Comparative Fit Index > 0.90 and Root Mean Square Error of Approximation < 0.08) and moderate (r, 0.37; non-acceptance) to strong (r, > 0.50; fighting spirit, toughness, redefinition, and positivism) patient-close proxy agreement. Overall, the Sickness Insight in Coping Questionnaire has good psychometric properties. ICU clinicians can use the Sickness Insight in Coping Questionnaire to gain insight in adaptive coping style of patients through ratings of patients or their close family members.
Lindemann, Ulrich; Zijlstra, Wiebren; Aminian, Kamiar; Chastin, Sebastien F M; de Bruin, Eling D; Helbostad, Jorunn L; Bussmann, Johannes B J
2014-01-10
Physical activity is an important determinant of health and well-being in older persons and contributes to their social participation and quality of life. Hence, assessment tools are needed to study this physical activity in free-living conditions. Wearable motion sensing technology is used to assess physical activity. However, there is a lack of harmonisation of validation protocols and applied statistics, which make it hard to compare available and future studies. Therefore, the aim of this paper is to formulate recommendations for assessing the validity of sensor-based activity monitoring in older persons with focus on the measurement of body postures and movements. Validation studies of body-worn devices providing parameters on body postures and movements were identified and summarized and an extensive inter-active process between authors resulted in recommendations about: information on the assessed persons, the technical system, and the analysis of relevant parameters of physical activity, based on a standardized and semi-structured protocol. The recommended protocols can be regarded as a first attempt to standardize validity studies in the area of monitoring physical activity.
TAMDAR Sensor Validation in 2003 AIRS II
NASA Technical Reports Server (NTRS)
Daniels, Taumi S.; Murray, John J.; Anderson, Mark V.; Mulally, Daniel J.; Jensen, Kristopher R.; Grainger, Cedric A.; Delene, David J.
2005-01-01
This study entails an assessment of TAMDAR in situ temperature, relative humidity and winds sensor data from seven flights of the UND Citation II. These data are undergoing rigorous assessment to determine their viability to significantly augment domestic Meteorological Data Communications Reporting System (MDCRS) and the international Aircraft Meteorological Data Reporting (AMDAR) system observational databases to improve the performance of regional and global numerical weather prediction models. NASA Langley Research Center participated in the Second Alliance Icing Research Study from November 17 to December 17, 2003. TAMDAR data taken during this period is compared with validation data from the UND Citation. The data indicate acceptable performance of the TAMDAR sensor when compared to measurements from the UND Citation research instruments.
Validity and Reliability of Accelerometers in Patients With COPD: A SYSTEMATIC REVIEW.
Gore, Shweta; Blackwood, Jennifer; Guyette, Mary; Alsalaheen, Bara
2018-05-01
Reduced physical activity is associated with poor prognosis in chronic obstructive pulmonary disease (COPD). Accelerometers have greatly improved quantification of physical activity by providing information on step counts, body positions, energy expenditure, and magnitude of force. The purpose of this systematic review was to compare the validity and reliability of accelerometers used in patients with COPD. An electronic database search of MEDLINE and CINAHL was performed. Study quality was assessed with the Strengthening the Reporting of Observational Studies in Epidemiology checklist while methodological quality was assessed using the modified Quality Appraisal Tool for Reliability Studies. The search yielded 5392 studies; 25 met inclusion criteria. The SenseWear Pro armband reported high criterion validity under controlled conditions (r = 0.75-0.93) and high reliability (ICC = 0.84-0.86) for step counts. The DynaPort MiniMod demonstrated highest concurrent validity for step count using both video and manual methods. Validity of the SenseWear Pro armband varied between studies especially in free-living conditions, slower walking speeds, and with addition of weights during gait. A high degree of variability was found in the outcomes used and statistical analyses performed between studies, indicating a need for further studies to measure reliability and validity of accelerometers in COPD. The SenseWear Pro armband is the most commonly used accelerometer in COPD, but measurement properties are limited by gait speed variability and assistive device use. DynaPort MiniMod and Stepwatch accelerometers demonstrated high validity in patients with COPD but lack reliability data.
Juneja, Prabhjot; Evans, Philp M; Harris, Emma J
2013-08-01
Validation is required to ensure automated segmentation algorithms are suitable for radiotherapy target definition. In the absence of true segmentation, algorithmic segmentation is validated against expert outlining of the region of interest. Multiple experts are used to overcome inter-expert variability. Several approaches have been studied in the literature, but the most appropriate approach to combine the information from multiple expert outlines, to give a single metric for validation, is unclear. None consider a metric that can be tailored to case-specific requirements in radiotherapy planning. Validation index (VI), a new validation metric which uses experts' level of agreement, was developed. A control parameter was introduced for the validation of segmentations required for different radiotherapy scenarios: for targets close to organs-at-risk and for difficult to discern targets, where large variation between experts is expected. VI was evaluated using two simulated idealized cases and data from two clinical studies. VI was compared with the commonly used Dice similarity coefficient (DSCpair-wise) and found to be more sensitive than the DSCpair-wise to the changes in agreement between experts. VI was shown to be adaptable to specific radiotherapy planning scenarios.
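The abstract above uses the pair-wise Dice similarity coefficient as its comparison metric. As a rough illustration, the sketch below computes the Dice coefficient between two outlines represented as sets of voxel indices, and averages it over all pairs of outlines. Both the set representation and the averaging over pairs are assumptions, since the abstract does not define how the pair-wise value was formed. The validation index (VI) itself is not reproduced here.

```python
from itertools import combinations


def dice(a, b):
    """Dice similarity coefficient between two collections of voxel indices."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty outlines are treated as identical (a convention, not from the source).
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_pairwise_dice(outlines):
    """Average Dice coefficient over every pair of outlines (assumed definition)."""
    pairs = list(combinations(outlines, 2))
    if not pairs:
        raise ValueError("need at least two outlines")
    return sum(dice(a, b) for a, b in pairs) / len(pairs)
```

With three hypothetical outlines `{1, 2, 3, 4}`, `{3, 4, 5, 6}` and `{1, 2, 3, 4}`, the pair-wise values are 0.5, 1.0 and 0.5, so the mean is 2/3.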
Note on concurrent validation of the personality assessment inventory in law enforcement.
Hays, J R
1997-08-01
This study compared the Personality Assessment Inventory and MMPI-168 profiles of 9 law enforcement applicants with published MMPI profiles to provide concurrent validation for the use of the Personality Assessment Inventory to assess personality pathology of peace officer applicants. The sample showed subclinical elevations of the Positive Impression and Treatment Rejection scales on the Personality Assessment Inventory and subclinical elevations on the MMPI validity scales of Lie and Correction and the clinical scales of Psychopathic Deviate and Hypomania. The applicants' mean MMPI profile provided concurrent validation for the use of the Personality Assessment Inventory in this decision on fitness to serve.
A standardised protocol for the validation of banking methodologies for arterial allografts.
Lomas, R J; Dodd, P D F; Rooney, P; Pegg, D E; Hogg, P A; Eagle, M E; Bennett, K E; Clarkson, A; Kearney, J N
2013-09-01
The objective of this study was to design and test a protocol for the validation of banking methodologies for arterial allografts. A series of in vitro biomechanical and biological assessments were derived, and applied to paired fresh and banked femoral arteries. The ultimate tensile stress and strain, suture pullout stress and strain, expansion/rupture under hydrostatic pressure, histological structure and biocompatibility properties of disinfected and cryopreserved femoral arteries were compared to those of fresh controls. No significant differences were detected in any of the test criteria. This validation protocol provides an effective means of testing and validating banking protocols for arterial allografts.
Armistead-Jehle, Patrick; Cole, Wesley R; Stegman, Robert L
2018-02-01
The study was designed to replicate and extend previous findings demonstrating the high rates of invalid neuropsychological testing in military service members (SMs) with a history of mild traumatic brain injury (mTBI) assessed in the context of a medical evaluation board (MEB). Two hundred thirty-one active duty SMs (61 of whom were undergoing an MEB) underwent neuropsychological assessment. Performance validity (Word Memory Test) and symptom validity (MMPI-2-RF) test data were compared across those evaluated within disability (MEB) and clinical contexts. As with previous studies, there were significantly more individuals in an MEB context that failed performance (MEB = 57%, non-MEB = 31%) and symptom validity testing (MEB = 57%, non-MEB = 22%) and performance validity testing had a notable effect on cognitive test scores. Performance and symptom validity test failure rates did not vary as a function of the reason for disability evaluation when divided into behavioral versus physical health conditions. These data are consistent with past studies, and extend those studies by including symptom validity testing and investigating the effect of reason for MEB. This and previous studies demonstrate that more than 50% of SMs seen in the context of an MEB will fail performance validity tests and over-report on symptom validity measures. These results emphasize the importance of using both performance and symptom validity testing when evaluating SMs with a history of mTBI, especially if they are being seen for disability evaluations, in order to ensure the accuracy of cognitive and psychological test data. Published by Oxford University Press 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Validity of Highlighting on Text Comprehension
NASA Astrophysics Data System (ADS)
So, Joey C. Y.; Chan, Alan H. S.
2009-10-01
In this study, 38 university students were tested with a Chinese reading task on an LED display under different task conditions for determining the effects of the highlighting and its validity on comprehension performance on light-emitting diodes (LED) display for Chinese reading. Four levels of validity (0%, 33%, 67% and 100%) and a control condition with no highlighting were tested. Each subject was required to perform the five experimental conditions in which different passages were read and comprehended. The results showed that the condition with 100% validity of highlighting was found to have better comprehension performance than other validity levels and conditions with no highlighting. The comprehension score of the condition without highlighting effect was comparatively lower than those highlighting conditions with distracters, though not significant.
Gemoll, Timo; Kollbeck, Sophie L; Karstens, Karl F; Hò, Gia G; Hartwig, Sonja; Strohkamp, Sarah; Schillo, Katharina; Thorns, Christoph; Oberländer, Martina; Kalies, Kathrin; Lehr, Stefan; Habermann, Jens K
2017-08-15
While carcinogenesis in Sporadic Colorectal Cancer (SCC) has been thoroughly studied, less is known about Ulcerative Colitis associated Colorectal Cancer (UCC). This study aimed to identify and validate differentially expressed proteins between clinical samples of SCC and UCC to gain new insights into UCC/SCC carcinogenesis and progression. Multiplex-fluorescence two-dimensional gel electrophoresis (2-D DIGE) and mass spectrometry identified 67 proteoforms representing 43 distinct proteins. After analysis by Ingenuity Pathway Analysis® (IPA), subsequent Western blot validation confirmed the differential expression of Heat shock 27 kDa protein 1 (HSPB1) and Microtubule-associated protein R/EB family, member 1 (EB1), while the latter also showed expression differences by immunohistochemistry. Fresh frozen tissue of UCC (n = 10) matched with SCC (n = 10) was investigated. Proteins of cancerous intestinal mucosal cells were obtained by Laser Capture Microdissection (LCM) and compared by 2-D DIGE. Significant spots were identified by mass spectrometry. After IPA, three proteins [EB1, HSPB1, and Annexin 5 (ANXA5)] were chosen for further validation by Western blotting and tissue microarray-based immunohistochemistry. This study identified significant differences in protein expression of colorectal carcinoma cells from UCC patients compared to patients with SCC. Particularly, EB1 was validated in an independent clinical cohort.
NASA Astrophysics Data System (ADS)
Yepes, Pablo P.; Eley, John G.; Liu, Amy; Mirkovic, Dragan; Randeniya, Sharmalee; Titt, Uwe; Mohan, Radhe
2016-04-01
Monte Carlo (MC) methods are acknowledged as the most accurate technique to calculate dose distributions. However, due to their lengthy calculation times, they are difficult to utilize in the clinic or for large retrospective studies. Track-repeating algorithms, based on MC-generated particle track data in water, accelerate dose calculations substantially, while essentially preserving the accuracy of MC. In this study, we present the validation of an efficient dose calculation algorithm for intensity modulated proton therapy, the fast dose calculator (FDC), based on a track-repeating technique. We validated the FDC algorithm for 23 patients, which included 7 brain, 6 head-and-neck, 5 lung, 1 spine, 1 pelvis and 3 prostate cases. For validation, we compared FDC-generated dose distributions with those from a full-fledged Monte Carlo based on GEANT4 (G4). We compared dose-volume-histograms, 3D-gamma-indices and analyzed a series of dosimetric indices. More than 99% of the voxels in the voxelized phantoms describing the patients have a gamma-index smaller than unity for the 2%/2 mm criteria. In addition, the difference relative to the prescribed dose between the dosimetric indices calculated with FDC and G4 is less than 1%. FDC reduces the calculation times from 5 ms per proton to around 5 μs.
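The abstract above reports a gamma-index pass rate under 2%/2 mm criteria. The sketch below shows the idea on a one-dimensional dose profile: for each reference point, the gamma value is the minimum combined dose and distance discrepancy over all evaluated points, and a point passes when gamma is at most one. The study itself used 3D gamma indices. The 1D simplification, the global normalization to the maximum reference dose, and all function and parameter names are assumptions for illustration.

```python
import math


def gamma_1d(positions_mm, ref_dose, eval_dose, dose_tol=0.02, dist_tol_mm=2.0):
    """Gamma index per reference point on a shared 1D grid (illustrative only).

    dose_tol is a fraction of the maximum reference dose (global normalization,
    an assumption; the abstract does not state the normalization used).
    """
    dose_crit = dose_tol * max(ref_dose)
    gammas = []
    for x_ref, d_ref in zip(positions_mm, ref_dose):
        # Search every evaluated point for the smallest combined discrepancy.
        best = min(
            math.hypot((x_ev - x_ref) / dist_tol_mm, (d_ev - d_ref) / dose_crit)
            for x_ev, d_ev in zip(positions_mm, eval_dose)
        )
        gammas.append(best)
    return gammas


def pass_rate(gammas):
    """Fraction of points with a gamma index of at most one."""
    return sum(g <= 1.0 for g in gammas) / len(gammas)
```

On a flat reference profile of 100 with an evaluated profile of 101, every point differs by half of the 2% dose criterion, so every gamma value is 0.5 and the pass rate is 1.0.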
The influence of work personality on job satisfaction: incremental validity and mediation effects.
Heller, Daniel; Ferris, D Lance; Brown, Douglas; Watson, David
2009-08-01
Drawing from recent developments regarding the contextual nature of personality (e.g., D. Wood & B. W. Roberts, 2006), we conducted 2 studies (1 cross-sectional and 1 longitudinal over 1 year) to examine the validity of work personality in predicting job satisfaction and its mediation of the effect of global personality on job satisfaction. Study 1 showed that (a) individuals vary systematically in their personality between roles: they were significantly more conscientious and open to experience and less extraverted at work compared to at home; (b) work personality was a better predictor of job satisfaction than both global personality and home personality; and (c) work personality demonstrated incremental validity above and beyond the other two personality measures. Study 2 further showed that each of the work personality dimensions fully mediated the association between its corresponding global personality trait and job satisfaction. Evidence for the discriminant validity of the findings is also presented.
Identifying Careless Responding With the Psychopathic Personality Inventory-Revised Validity Scales.
Marcus, David K; Church, Abere Sawaqdeh; O'Connell, Debra; Lilienfeld, Scott O
2018-01-01
The Psychopathic Personality Inventory-Revised (PPI-R) includes validity scales that assess Deviant Responding (DR), Virtuous Responding, and Inconsistent Responding. We examined the utility of these scales for identifying careless responding using data from two online studies that examined correlates of psychopathy in college students (Sample 1: N = 583; Sample 2: N = 454). Compared with those below the cut scores, those above the cut on the DR scale yielded consistently lower validity coefficients when PPI-R scores were correlated with corresponding scales from the Triarchic Psychopathy Measure. The other three PPI-R validity scales yielded weaker and less consistent results. Participants who completed the studies in an inordinately brief amount of time scored significantly higher on the DR and Virtuous Responding scales than other participants. Based on the findings from the current studies, researchers collecting PPI-R data online should consider identifying and perhaps screening out respondents with elevated scores on the DR scale.
Baars, Maria A E; Olde Rikkert, Marcel G M; Kessels, Roy P C
2013-01-01
Background Online interventions are aiming increasingly at cognitive outcome measures but so far no easy and fast self-monitors for cognition have been validated or proven reliable and feasible. Objective This study examines a new instrument called the Brain Aging Monitor–Cognitive Assessment Battery (BAM-COG) for its alternate forms reliability, face and content validity, and convergent and divergent validity. Also, reference values are provided. Methods The BAM-COG consists of four easily accessible, short, yet challenging puzzle games that have been developed to measure working memory (“Conveyer Belt”), visuospatial short-term memory (“Sunshine”), episodic recognition memory (“Viewpoint”), and planning (“Papyrinth”). A total of 641 participants were recruited for this study. Of these, 397 adults, 40 years and older (mean 54.9, SD 9.6), were eligible for analysis. Study participants played all games three times with 14 days in between sets. Face and content validity were based on expert opinion. Alternate forms reliability (AFR) was measured by comparing scores on different versions of the BAM-COG and expressed with an intraclass correlation (ICC: two-way mixed; consistency at 95%). Convergent validity (CV) was provided by comparing BAM-COG scores to gold-standard paper-and-pencil and computer-assisted cognitive assessment. Divergent validity (DV) was measured by comparing BAM-COG scores to the National Adult Reading Test IQ (NART-IQ) estimate. Both CV and DV are expressed as Spearman rho correlation coefficients. Results Three out of four games showed adequate results on AFR, CV, and DV measures. The games Conveyer Belt, Sunshine, and Papyrinth have AFR ICCs of .420, .426, and .645 respectively. Also, these games had good to very good CV correlations: rho=.577 (P=.001), rho=.669 (P<.001), and rho=.400 (P=.04), respectively. Last, as expected, DV correlations were low: rho=−.029 (P=.44), rho=−.029 (P=.45), and rho=−.134 (P=.28) respectively. 
The game Viewpoint provided less desirable results with an AFR ICC of .167, CV rho=.202 (P=.15), and DV rho=−.162 (P=.21). Conclusions This study provides evidence for the use of the BAM-COG test battery as a feasible, reliable, and valid tool to monitor cognitive performance in healthy adults in an online setting. Three out of four games have good psychometric characteristics to measure working memory, visuospatial short-term memory, and planning capacity. PMID:24300212
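Convergent and divergent validity in the abstract above are expressed as Spearman rho correlations. A minimal sketch of that statistic follows: each variable is converted to ranks (tied values share their average rank) and the Pearson correlation of the ranks is taken. The helper names are hypothetical, and the code is not the authors' analysis.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks matter, any strictly increasing relationship, such as scores paired with their squares, yields a rho of 1.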
Two- and three-dimensional CT measurements of urinary calculi length and width: a comparative study.
Lidén, Mats; Thunberg, Per; Broxvall, Mathias; Geijer, Håkan
2015-04-01
The standard imaging procedure for a patient presenting with renal colic is unenhanced computed tomography (CT). The CT measured size has a close correlation to the estimated prognosis for spontaneous passage of a ureteral calculus. Size estimations of urinary calculi in CT images are still based on two-dimensional (2D) reformats. To develop and validate a calculus oriented three-dimensional (3D) method for measuring the length and width of urinary calculi and to compare the calculus oriented measurements of the length and width with corresponding 2D measurements obtained in axial and coronal reformats. Fifty unenhanced CT examinations demonstrating urinary calculi were included. A 3D symmetric segmentation algorithm was validated against reader size estimations. The calculus oriented size from the segmentation was then compared to the estimated size in axial and coronal 2D reformats. The validation showed 0.1 ± 0.7 mm agreement against reference measure. There was a 0.4 mm median bias for 3D estimated calculus length compared to 2D (P < 0.001), but no significant bias for 3D width compared to 2D. The length of a calculus in axial and coronal reformats becomes underestimated compared to 3D if its orientation is not aligned to the image planes. Future studies aiming to correlate calculus size with patient outcome should use a calculus oriented size estimation. © The Foundation Acta Radiologica 2014.
Lohrer, H; Nauck, T
2010-06-01
The VISA-A questionnaire is currently the only valid, reliable, and disease specific patient administered questionnaire for research in Achilles tendinopathy. To perform multinational and multilingual investigations this instrument was already adapted to several languages. According to the "guidelines for the process of cross-cultural adaptation of self-report measures" we already translated and validated the VISA-A questionnaire for patients with Achilles tendinopathy. To cross-culturally adapt and validate the VISA-A Questionnaire for German-speaking patients suffering from Haglund's disease. The VISA-A-G questionnaire was tested for reliability, validity, and internal consistency in 39 Haglund's disease patients and 79 asymptomatic persons. For concurrent validity the VISA-A-G was compared with the Curwin and Stanish tendon grading system and with the Percy and Conochie classification system for the effect of pain on athletic performance. VISA-A-G results in Haglund's disease were additionally compared with VISA-A-G results obtained from Achilles tendinopathy patients and with VISA-A results presented in the international literature. ICC for the VISA-A-G questionnaire in conservatively treated Haglund's disease patients was 0.96. In asymptomatic students and joggers ICC was 0.97 and 0.60. When correlated with the grading system of Curwin and Stanish and with the Percy and Conochie classification rho was -0.95 and 0.94, respectively. Internal consistency (Cronbach's alpha) for the total VISA-A-G scores of the patients was calculated to be 0.87. Compared with VISA-A-G results obtained from Achilles tendinopathy patients there was no relevant difference discernible. Compared with VISA-A results presented in the original publication no difference was found statistically for students, healthy people, conservative, and preoperative patients, respectively. 
This study confirms that the VISA-A-G is a valid and reliable measure for German-speaking patients suffering from Haglund's disease. Georg Thieme Verlag KG Stuttgart, New York.
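The abstract above reports internal consistency as Cronbach's alpha. As a sketch of the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), the following works on a small respondents-by-items table. The data layout and the function name are assumptions, and the example is not a reconstruction of the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of respondents, each a list of k item scores (k >= 2).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

When two items move in lockstep, as in `[[1, 1], [2, 2], [3, 3]]`, the item variances sum to 2, the total-score variance is 4, and alpha is 1.0.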
Development and validation of a two-dimensional fast-response flood estimation model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Judi, David R; Mcpherson, Timothy N; Burian, Steven J
2009-01-01
A finite difference formulation of the shallow water equations using an upwind differencing method was developed maintaining computational efficiency and accuracy such that it can be used as a fast-response flood estimation tool. The model was validated using both laboratory controlled experiments and an actual dam breach. Through the laboratory experiments, the model was shown to give good estimations of depth and velocity when compared to the measured data, as well as when compared to a more complex two-dimensional model. Additionally, the model was compared to high water mark data obtained from the failure of the Taum Sauk dam. The simulated inundation extent agreed well with the observed extent, with the most notable differences resulting from the inability to model sediment transport. The results of these validation studies show that a relatively simple numerical scheme used to solve the complete shallow water equations can be used to accurately estimate flood inundation. Future work will focus on further reducing the computation time needed to provide flood inundation estimates for fast-response analyses. This will be accomplished through the efficient use of multi-core, multi-processor computers coupled with an efficient domain-tracking algorithm, as well as an understanding of the impacts of grid resolution on model results.
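The abstract above describes an upwind finite difference formulation. The toy sketch below shows only the core idea of upwind differencing, applied to one-dimensional linear advection (du/dt + c du/dx = 0 with c > 0) rather than to the two-dimensional shallow water equations the model solves. The periodic boundary, the function name and the parameters are all assumptions chosen to keep the example short.

```python
def upwind_advect(u, speed, dx, dt, steps):
    """First-order upwind scheme for du/dt + speed * du/dx = 0 (speed >= 0).

    Uses a periodic domain: cell 0 takes its upwind value from the last cell.
    """
    cfl = speed * dt / dx
    if not 0.0 <= cfl <= 1.0:
        raise ValueError("scheme is only stable for 0 <= speed*dt/dx <= 1")
    u = list(u)
    for _ in range(steps):
        # Each cell is updated from its upwind (left) neighbour; u[-1] wraps around.
        u = [u[i] - cfl * (u[i] - u[i - 1]) for i in range(len(u))]
    return u
```

With a Courant number of exactly 1 the profile shifts one cell per step, so `upwind_advect([0.0, 1.0, 0.0, 0.0], 1.0, 1.0, 1.0, 1)` gives `[0.0, 0.0, 1.0, 0.0]`. Smaller Courant numbers smear the profile but conserve its sum on the periodic grid.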
Stefanidis, Dimitrios; Hope, William W; Scott, Daniel J
2011-07-01
The value of robotic assistance for intracorporeal suturing is not well defined. We compared robotic suturing with laparoscopic suturing on the FLS model with a large cohort of surgeons. Attendees (n=117) at the SAGES 2006 Learning Center robotic station placed intracorporeal sutures on the FLS box-trainer model using conventional laparoscopic instruments and the da Vinci® robot. Participant performance was recorded using a validated objective scoring system, and a questionnaire regarding demographics, task workload, and suturing modality preference was completed. Construct validity for both tasks was assessed by comparing the performance scores of subjects with various levels of experience. A validated questionnaire was used for workload measurement. Of the participants, 84% had prior laparoscopic and 10% prior robotic suturing experience. Within the allotted time, 83% of participants completed the suturing task laparoscopically and 72% with the robot. Construct validity was demonstrated for both simulated tasks according to the participants' advanced laparoscopic experience, laparoscopic suturing experience, and self-reported laparoscopic suturing ability (p<0.001 for all) and according to prior robotic experience, robotic suturing experience, and self-reported robotic suturing ability (p<0.001 for all), respectively. While participants achieved higher suturing scores with standard laparoscopy compared with the robot (84±75 vs. 56±63, respectively; p<0.001), they found the laparoscopic task more physically demanding (NASA score 13±5 vs. 10±5, respectively; p<0.001) and favored the robot as their method of choice for intracorporeal suturing (62 vs. 38%, respectively; p<0.01). Construct validity was demonstrated for robotic suturing on the FLS model. Suturing scores were higher using standard laparoscopy likely as a result of the participants' greater experience with laparoscopic suturing versus robotic suturing. 
Robotic assistance decreases the physical demand of intracorporeal suturing compared with conventional laparoscopy and, in this study, was the preferred suturing method by most surgeons. Curricula for robotic suturing training need to be developed.
Leveraging biospecimen resources for discovery or validation of markers for early cancer detection.
Schully, Sheri D; Carrick, Danielle M; Mechanic, Leah E; Srivastava, Sudhir; Anderson, Garnet L; Baron, John A; Berg, Christine D; Cullen, Jennifer; Diamandis, Eleftherios P; Doria-Rose, V Paul; Goddard, Katrina A B; Hankinson, Susan E; Kushi, Lawrence H; Larson, Eric B; McShane, Lisa M; Schilsky, Richard L; Shak, Steven; Skates, Steven J; Urban, Nicole; Kramer, Barnett S; Khoury, Muin J; Ransohoff, David F
2015-04-01
Validation of early detection cancer biomarkers has proven to be disappointing when initial promising claims have often not been reproducible in diagnostic samples or did not extend to prediagnostic samples. The previously reported lack of rigorous internal validity (systematic differences between compared groups) and external validity (lack of generalizability beyond compared groups) may be effectively addressed by utilizing blood specimens and data collected within well-conducted cohort studies. Cohort studies with prediagnostic specimens (eg, blood specimens collected prior to development of clinical symptoms) and clinical data have recently been used to assess the validity of some early detection biomarkers. With this background, the Division of Cancer Control and Population Sciences (DCCPS) and the Division of Cancer Prevention (DCP) of the National Cancer Institute (NCI) held a joint workshop in August 2013. The goal was to advance early detection cancer research by considering how the infrastructure of cohort studies that already exist or are being developed might be leveraged to include appropriate blood specimens, including prediagnostic specimens, ideally collected at periodic intervals, along with clinical data about symptom status and cancer diagnosis. Three overarching recommendations emerged from the discussions: 1) facilitate sharing of existing specimens and data, 2) encourage collaboration among scientists developing biomarkers and those conducting observational cohort studies or managing healthcare systems with cohorts followed over time, and 3) conduct pilot projects that identify and address key logistic and feasibility issues regarding how appropriate specimens and clinical data might be collected at reasonable effort and cost within existing or future cohorts. © Published by Oxford University Press 2015.
Leveraging Biospecimen Resources for Discovery or Validation of Markers for Early Cancer Detection
Carrick, Danielle M.; Mechanic, Leah E.; Srivastava, Sudhir; Anderson, Garnet L.; Baron, John A.; Berg, Christine D.; Cullen, Jennifer; Diamandis, Eleftherios P.; Doria-Rose, V. Paul; Goddard, Katrina A. B.; Hankinson, Susan E.; Kushi, Lawrence H.; Larson, Eric B.; McShane, Lisa M.; Schilsky, Richard L.; Shak, Steven; Skates, Steven J.; Urban, Nicole; Kramer, Barnett S.; Khoury, Muin J.; Ransohoff, David F.
2015-01-01
Validation of early detection cancer biomarkers has proven to be disappointing when initial promising claims have often not been reproducible in diagnostic samples or did not extend to prediagnostic samples. The previously reported lack of rigorous internal validity (systematic differences between compared groups) and external validity (lack of generalizability beyond compared groups) may be effectively addressed by utilizing blood specimens and data collected within well-conducted cohort studies. Cohort studies with prediagnostic specimens (eg, blood specimens collected prior to development of clinical symptoms) and clinical data have recently been used to assess the validity of some early detection biomarkers. With this background, the Division of Cancer Control and Population Sciences (DCCPS) and the Division of Cancer Prevention (DCP) of the National Cancer Institute (NCI) held a joint workshop in August 2013. The goal was to advance early detection cancer research by considering how the infrastructure of cohort studies that already exist or are being developed might be leveraged to include appropriate blood specimens, including prediagnostic specimens, ideally collected at periodic intervals, along with clinical data about symptom status and cancer diagnosis. Three overarching recommendations emerged from the discussions: 1) facilitate sharing of existing specimens and data, 2) encourage collaboration among scientists developing biomarkers and those conducting observational cohort studies or managing healthcare systems with cohorts followed over time, and 3) conduct pilot projects that identify and address key logistic and feasibility issues regarding how appropriate specimens and clinical data might be collected at reasonable effort and cost within existing or future cohorts. PMID:25688116
Jones, Anne; Sealey, Rebecca; Crowe, Michael; Gordon, Susan
2014-10-01
The aim of this study was to assess the concurrent validity and reliability of the Simple Goniometer (SG) iPhone® app compared to the Universal Goniometer (UG). Within subject comparison design comparing the UG with the SG app. James Cook University, Townsville, Queensland, Australia. Thirty-six volunteer participants, with a mean age of 60.6 years (SD 6.2). Not applicable. Thirty-six participants performed three standing lunges during which the knee joint angle was measured with the SG app and the UG. There were no significant differences in the measures of individual knee joint angles between the UG and the SG app. Pearson correlations of 0.96-0.98 and intraclass correlation coefficients of 0.97-0.99 (95% confidence interval: 0.95-1.00) were recorded for all measures. Using the Bland-Altman method, the standard error of the mean of the differences and the standard deviation of the mean of the differences were low. The measurements from the SG iPhone® app were reliable and possessed concurrent validity for this sample and protocol when compared to the UG.
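The Bland-Altman comparison mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration only: the angle values are invented, the function name `bland_altman` is hypothetical, and the conventional 1.96 × SD limits of agreement are assumed rather than taken from the study.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement
    are bias +/- 1.96 * SD of the differences (a common convention,
    assumed here, not confirmed by the abstract).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical knee joint angles in degrees (not data from the study).
ug_angles = [90.0, 95.0, 100.0, 105.0, 110.0]   # Universal Goniometer
sg_angles = [91.0, 94.0, 101.0, 104.0, 111.0]   # Simple Goniometer app

bias, lower, upper = bland_altman(sg_angles, ug_angles)
print(f"bias={bias:.2f} deg, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A small bias with narrow limits of agreement is what would be read as good agreement between the two devices.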
ERIC Educational Resources Information Center
Moore, Delilah S.; Ellis, Rebecca; Allen, Priscilla D.; Cherry, Katie E.; Monroe, Pamela A.; O'Neil, Carol E.; Wood, Robert H.
2008-01-01
The purpose of this study was to establish validity evidence of four physical activity (PA) questionnaires in culturally diverse older adults by comparing self-report PA with performance-based physical function. Participants were 54 older adults who completed the Continuous Scale Physical Functional Performance 10-item Test (CS-PFP10), Physical…
Validity of a Sun Safety Diary Using UV Monitors in Middle School Children
ERIC Educational Resources Information Center
Yaroch, Amy L.; Reynolds, Kim D.; Buller, David B.; Maloy, Julie A.; Geno, Cristy R.
2006-01-01
This article describes a validity study conducted among middle school students comparing self-reported sun safety behaviors from a diary with readings from ultraviolet (UV) monitors worn on different body sites. The UV monitors are stickers with panels that turn increasingly darker shades of blue in the presence of increasing amounts of UV light.…
ERIC Educational Resources Information Center
Farnsworth, Timothy L.
2013-01-01
This study examined the construct validity of the TOEFL iBT Speaking subsection for the purposes of international teaching assistant (ITA) certification, a purpose for which it was not specifically designed. The factor structure of the new TOEFL was compared with that of another language performance test in use at a major American research…
ERIC Educational Resources Information Center
Kettler, Ryan J.; Elliott, Stephen N.; Davies, Michael; Griffin, Patrick
2012-01-01
This study addresses the predictive validity of results from a screening system of academic enablers, with a sample of Australian elementary school students, when the criterion variable is end-of-year achievement. The investigation included (a) comparing the predictive validity of a brief criterion-referenced nomination system with more…
Project Evaluation: Validation of a Scale and Analysis of Its Predictive Capacity
ERIC Educational Resources Information Center
Fernandes Malaquias, Rodrigo; de Oliveira Malaquias, Fernanda Francielle
2014-01-01
The objective of this study was to validate a scale for assessment of academic projects. As a complement, we examined its predictive ability by comparing the scores of advised/corrected projects based on the model and the final scores awarded to the work by an examining panel (approximately 10 months after the project design). Results of…
Evaluation of the Spiritual Well-Being Scale in a Sample of Korean Adults.
You, Sukkyung; Yoo, Ji Eun
2016-08-01
This study explored the psychometric qualities and construct validity of the Spiritual Well-Being Scale (SWBS; Ellison in J Psychol Theol 11:330-340, 1983) using a sample of 470 Korean adults. Two factor analyses, exploratory factor analysis and confirmatory factor analysis, were conducted in order to test the validity of the SWBS. The results of the factor analyses supported the original two-dimensional structure of the SWBS-religious well-being (RWB) and existential well-being (EWB) with method effects associated with negatively worded items. By controlling for method effects, the evaluation of the two-factor structure of SWBS is confirmed with clarity. Further, the differential pattern and magnitude of correlations between the SWB subscales and the religious and psychological variables suggested that two factors of the SWBS were valid for Protestant, Catholic, and religiously unaffiliated groups except Buddhists. The Protestant group scored higher in RWB compared to the Buddhist, Catholic, and unaffiliated groups. The Protestant group scored higher in EWB compared to the unaffiliated groups. Future studies may need to include more Buddhist samples to gain solid evidence for validity of the SWBS on a non-Western religious tradition.
Validating the Assessment for Measuring Indonesian Secondary School Students Performance in Ecology
NASA Astrophysics Data System (ADS)
Rachmatullah, A.; Roshayanti, F.; Ha, M.
2017-09-01
The aims of this study are to validate the American Association for the Advancement of Science (AAAS) Ecology assessment and to examine the performance of Indonesian secondary school students on the assessment. A total of 611 Indonesian secondary school students (218 middle school students and 393 high school students) participated in the study. Forty-five items of the AAAS assessment on the topic of Interdependence in Ecosystems were divided into two versions, each of which has 21 similar items. A linking-item method was used to combine the two versions of the assessment, and Rasch analyses were then used to validate the instrument. An independent-samples t-test was also run to compare the performance of Indonesian and American students based on mean item difficulty. Of the 45 items, three were identified as misfitting. We also found that both Indonesian middle and high school students performed significantly lower than American students, with very large and medium effect sizes. We discuss our findings with regard to validation issues and the connection to Indonesian students' science literacy.
Validation of the Practice Environment Scale to the Brazilian culture.
Gasparino, Renata C; Guirardello, Edinêis de B
2017-07-01
To validate the Brazilian version of the Practice Environment Scale. The Practice Environment Scale is a tool that evaluates the presence of characteristics that are favourable for professional nursing practice because a better work environment contributes to positive results for patients, professionals and institutions. Methodological study including 209 nurses. Validity was assessed via a confirmatory factor analysis using structural equation modelling, in which the correlations between the instrument and the following variables were tested: burnout, job satisfaction, safety climate, perception of quality of care and intention to leave the job. Subgroups were compared and the reliability was assessed using Cronbach's alpha and the composite reliability. Factor analysis resulted in exclusion of seven items. Significant correlations were obtained between the subscales and all variables in the study. The reliability was considered acceptable. The Brazilian version of the Practice Environment Scale is a valid and reliable tool used to assess the characteristics that promote professional nursing practice. Use of this tool in Brazilian culture should allow managers to implement changes that contribute to the achievement of better results, in addition to identifying and comparing the environments of health institutions. © 2017 John Wiley & Sons Ltd.
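For readers unfamiliar with the internal-consistency statistic named in this abstract, Cronbach's alpha can be sketched as below. The function name `cronbach_alpha` and the scores are illustrative assumptions; the study's own data and software are not described in the abstract.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists.

    items[i][j] is the score of respondent j on item i; every item
    list must have the same length. Uses sample variances throughout.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent total
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses of four nurses to three items (invented values).
scores = [
    [2, 3, 4, 5],
    [1, 3, 4, 4],
    [2, 2, 5, 5],
]
print(round(cronbach_alpha(scores), 3))
```

Alpha approaches 1 when the items move together across respondents, which is why it is commonly reported as evidence of reliability.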
Najimi, Arash; Mostafavi, Firoozeh; Sharifirad, Gholamreza; Golshiri, Parastoo
2017-01-01
BACKGROUND: This study aimed to develop and evaluate a scale of self-efficacy in adherence to treatment in Iranian patients with hypertension. METHODS: A mixed-methods study was conducted in two stages. In the first phase, a qualitative study was done using content analysis of in-depth, semi-structured interviews. After data analysis, a draft of the tool was prepared. Items in the draft were selected based on the extracted concepts. In the second phase, the validity and reliability of the instrument were assessed in a quantitative study. The instrument prepared in the first phase was studied among 612 participants. To test construct validity and internal consistency, exploratory factor analysis and Cronbach's alpha were used, respectively. To study the validity of the final scale, the average self-efficacy score of patients with controlled hypertension was compared with that of patients with uncontrolled hypertension. RESULTS: Overall, 16 patients were interviewed. Twenty-six items were developed to assess different concepts of self-efficacy. Concept-related items were extracted from interviews to study the face validity of the tool from the patients' point of view. Four items were deleted because they scored 0.79 in content validity. The mean content validity of the questionnaire was 0.85. Items were grouped into two factors with an eigenvalue >1. Four items with a factor loading <0.4 were deleted. Reliability was 0.84 for the entire instrument. CONCLUSION: The self-efficacy scale for patients with hypertension is a valid and reliable instrument that can effectively evaluate self-efficacy in medication adherence in the management of hypertension. PMID:29114551
Convergent validity of alternative MMPI-2 personality disorder scales.
Hicklin, J; Widiger, T A
2000-12-01
The Morey, Waugh, and Blashfield (1985) MMPI (Hathaway et al., 1989) personality disorder scales provided a significant contribution to personality disorder research and assessment. However, the subsequent revisions to the MMPI and the multiple revisions to the diagnostic criteria sets that have since occurred may have justified comparable revisions to these scales. Somwaru and Ben-Porath (1995) selected a substantially different set of items from the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) to assess Diagnostic and Statistical Manual of Mental Disorders (4th ed.; American Psychiatric Association, 1994) personality disorder diagnostic criteria. In our study, we compared the convergent validity of these alternative MMPI-2 personality disorder scales with respect to 3 self-report measures of personality disorder symptomatology in a sample of 82 psychiatric outpatients. The results suggested that Somwaru and Ben-Porath's scales are as valid as the original Morey et al. scales and might be even more valid for the assessment of borderline, antisocial, and schizoid personality disorder symptomatology.
Castillo-Tandazo, Wilson; Flores-Fortty, Adolfo; Feraud, Lourdes; Tettamanti, Daniel
2013-01-01
Purpose To translate, cross-culturally adapt, and validate the Questionnaire for Diabetes-Related Foot Disease (Q-DFD), originally created and validated in Australia, for its use in Spanish-speaking patients with diabetes mellitus. Patients and methods The translation and cross-cultural adaptation were based on international guidelines. The Spanish version of the survey was applied to a community-based (sample A) and a hospital clinic-based sample (samples B and C). Samples A and B were used to determine criterion and construct validity comparing the survey findings with clinical evaluation and medical records, respectively; while sample C was used to determine intra- and inter-rater reliability. Results After completing the rigorous translation process, only four items were considered problematic and required a new translation. In total, 127 patients were included in the validation study: 76 to determine criterion and construct validity and 41 to establish intra- and inter-rater reliability. For an overall diagnosis of diabetes-related foot disease, a substantial level of agreement was obtained when we compared the Q-DFD with the clinical assessment (kappa 0.77, sensitivity 80.4%, specificity 91.5%, positive likelihood ratio [LR+] 9.46, negative likelihood ratio [LR−] 0.21); while an almost perfect level of agreement was obtained when it was compared with medical records (kappa 0.88, sensitivity 87%, specificity 97%, LR+ 29.0, LR− 0.13). Survey reliability showed substantial levels of agreement, with kappa scores of 0.63 and 0.73 for intra- and inter-rater reliability, respectively. Conclusion The translated and cross-culturally adapted Q-DFD showed good psychometric properties (validity, reproducibility, and reliability) that allow its use in Spanish-speaking diabetic populations. PMID:24039434
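The likelihood ratios reported above follow directly from the stated sensitivity and specificity. The short check below uses the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity; the function name `likelihood_ratios` is only illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

# Values quoted in the abstract, converted from percentages.
vs_clinical = likelihood_ratios(0.804, 0.915)   # Q-DFD vs. clinical assessment
vs_records = likelihood_ratios(0.87, 0.97)      # Q-DFD vs. medical records

print([round(x, 2) for x in vs_clinical])  # [9.46, 0.21]
print([round(x, 2) for x in vs_records])   # [29.0, 0.13]
```

Both pairs reproduce the figures given in the abstract, so the reported ratios are internally consistent with the reported sensitivity and specificity.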
Vanwolleghem, Griet; Van Dyck, Delfien; Ducheyne, Fabian; De Bourdeaudhuij, Ilse; Cardon, Greet
2014-06-10
Google Street View provides a valuable and efficient alternative to observe the physical environment compared to on-site fieldwork. However, studies on the use, reliability and validity of Google Street View in a cycling-to-school context are lacking. We aimed to study the intra-, inter-rater reliability and criterion validity of EGA-Cycling (Environmental Google Street View Based Audit - Cycling to school), a newly developed audit using Google Street View to assess the physical environment along cycling routes to school. Parents (n = 52) of 11-to-12-year old Flemish children, who mostly cycled to school, completed a questionnaire and identified their child's cycling route to school on a street map. Fifty cycling routes of 11-to-12-year olds were identified and physical environmental characteristics along the identified routes were rated with EGA-Cycling (5 subscales; 37 items), based on Google Street View. To assess reliability, two researchers performed the audit. Criterion validity of the audit was examined by comparing the ratings based on Google Street View with ratings through on-site assessments. Intra-rater reliability was high (kappa range 0.47-1.00). Large variations in the inter-rater reliability (kappa range -0.03-1.00) and criterion validity scores (kappa range -0.06-1.00) were reported, with acceptable inter-rater reliability values for 43% of all items and acceptable criterion validity for 54% of all items. EGA-Cycling can be used to assess physical environmental characteristics along cycling routes to school. However, to assess the micro-environment specifically related to cycling, on-site assessments have to be added.
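The kappa values quoted in this abstract measure agreement beyond chance. A minimal sketch of unweighted Cohen's kappa for two raters is given below; the ratings are invented, the function name `cohen_kappa` is hypothetical, and the abstract does not say whether the study used weighted or unweighted kappa.

```python
from collections import Counter

def cohen_kappa(rater_1, rater_2):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Undefined (division by zero) when chance agreement equals 1,
    i.e. when both raters use a single identical category throughout.
    """
    n = len(rater_1)
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(rater_1) | set(rater_2)
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical presence/absence ratings of one audit item on ten routes.
street_view = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
on_site     = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]

print(round(cohen_kappa(street_view, on_site), 2))  # 0.6
```

Kappa is 1 for perfect agreement, 0 for chance-level agreement, and negative when raters agree less often than chance would predict, which is how values such as -0.03 in the abstract arise.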
Validity and reliability of Optojump photoelectric cells for estimating vertical jump height.
Glatthorn, Julia F; Gouge, Sylvain; Nussbaumer, Silvio; Stauffacher, Simone; Impellizzeri, Franco M; Maffiuletti, Nicola A
2011-02-01
Vertical jump is one of the most prevalent acts performed in several sport activities. It is therefore important to ensure that the measurements of vertical jump height made as a part of research or athlete support work have adequate validity and reliability. The aim of this study was to evaluate concurrent validity and reliability of the Optojump photocell system (Microgate, Bolzano, Italy) with force plate measurements for estimating vertical jump height. Twenty subjects were asked to perform maximal squat jumps and countermovement jumps, and flight time-derived jump heights obtained by the force plate were compared with those provided by Optojump, to examine its concurrent (criterion-related) validity (study 1). Twenty other subjects completed the same jump series on 2 different occasions (separated by 1 week), and jump heights of session 1 were compared with session 2, to investigate test-retest reliability of the Optojump system (study 2). Intraclass correlation coefficients (ICCs) for validity were very high (0.997-0.998), even if a systematic difference was consistently observed between force plate and Optojump (-1.06 cm; p < 0.001). Test-retest reliability of the Optojump system was excellent, with ICCs ranging from 0.982 to 0.989, low coefficients of variation (2.7%), and low random errors (±2.81 cm). The Optojump photocell system demonstrated strong concurrent validity and excellent test-retest reliability for the estimation of vertical jump height. We propose the following equation that allows force plate and Optojump results to be used interchangeably: force plate jump height (cm) = 1.02 × Optojump jump height + 0.29. In conclusion, the use of Optojump photoelectric cells is legitimate for field-based assessments of vertical jump height.
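The conversion equation proposed in this abstract can be applied as in the sketch below. The flight-time helper uses the standard kinematic relation h = g·t²/8, which is an assumption here: the abstract says heights were flight-time derived but does not give the formula. Function names are illustrative.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def jump_height_from_flight_time(flight_time_s):
    """Jump height in cm from flight time in seconds, using h = g * t^2 / 8."""
    return G * flight_time_s ** 2 / 8 * 100

def optojump_to_force_plate(optojump_height_cm):
    """Equation from the abstract: force plate height = 1.02 * Optojump height + 0.29 (cm)."""
    return 1.02 * optojump_height_cm + 0.29

# Hypothetical Optojump reading of 30.0 cm.
print(round(optojump_to_force_plate(30.0), 2))        # 30.89
print(round(jump_height_from_flight_time(0.5), 2))    # 30.66
```

Applying the linear correction to an Optojump reading gives the height a force plate would be expected to report, which is what allows the two devices to be used interchangeably.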
Validity of Dietary Assessment in Athletes: A Systematic Review
Beck, Kathryn L.; Gifford, Janelle A.; Slater, Gary; Flood, Victoria M.; O’Connor, Helen
2017-01-01
Dietary assessment methods that are recognized as appropriate for the general population are usually applied in a similar manner to athletes, despite the knowledge that sport-specific factors can complicate assessment and impact accuracy in unique ways. As dietary assessment methods are used extensively within the field of sports nutrition, there is concern that the validity of these methodologies has not undergone more rigorous evaluation in this unique population sub-group. The purpose of this systematic review was to compare two or more methods of dietary assessment, including dietary intake measured against biomarkers or reference measures of energy expenditure, in athletes. Six electronic databases were searched for English-language, full-text articles published from January 1980 until June 2016. The search strategy combined the following keywords: diet, nutrition assessment, athlete, and validity; with reported outcomes including, but not limited to: energy intake, macro and/or micronutrient intake, food intake, nutritional adequacy, diet quality, or nutritional status. Meta-analysis was performed on studies with sufficient methodological similarity, with between-group standardized mean differences (or effect size) and 95% confidence intervals (CI) being calculated. Of the 1624 studies identified, 18 were eligible for inclusion. Studies comparing self-reported energy intake (EI) to energy expenditure assessed via doubly labelled water were grouped for comparison (n = 11) and demonstrated mean EI was under-estimated by 19% (−2793 ± 1134 kJ/day). Meta-analysis revealed a large pooled effect size of −1.006 (95% CI: −1.3 to −0.7; p < 0.001). The remaining studies (n = 7) compared a new dietary tool or instrument to a reference method(s) (e.g., food record, 24-h dietary recall, biomarker) as part of a validation study. This systematic review revealed that there are few robust studies evaluating dietary assessment methods in athletes. 
Existing literature demonstrates the substantial variability between methods, with under- and misreporting of intake being frequently observed. There is a clear need for careful validation of dietary assessment methods, including emerging technical innovations, among athlete populations. PMID:29207495
Knudsen, Vibeke K; Hatch, Elizabeth E; Cueto, Heidi; Tucker, Katherine L; Wise, Lauren; Christensen, Tue; Mikkelsen, Ellen M
2016-04-01
To assess the relative validity of a semi-quantitative, web-based FFQ completed by female pregnancy planners in the Danish 'Snart Forældre' study. We validated a web-based FFQ based on the FFQ used in the Danish National Birth Cohort against a 4 d food diary (FD) and assessed the relative validity of intakes of foods and nutrients. We compared means and medians of intakes, and calculated Pearson correlation coefficients and de-attenuated coefficients to assess agreement between the two methods. We also calculated the proportion correctly classified based on the same or adjacent quintile of intake and the proportion of grossly misclassified (extreme quintiles). Participants (n 128) in the 'Snart Forældre' study who had completed the web-based FFQ were invited to participate in the validation study. Participants in the 'Snart Forældre' study, in total ninety-seven women aged 20-42 years. Reported intakes of dairy products, vegetables and potatoes were higher in the FFQ compared with the FD, whereas reported intakes of fruit, meat, sugar and beverages were lower in the FFQ than in the FD. Overall the de-attenuated correlation coefficients were acceptable, ranging from 0·33 for energy to 0·93 for vitamin D. The majority of the women were classified in the same or adjacent quintile and few women were misclassified (extreme quintiles). The web-based FFQ performs well for ranking women of reproductive age according to high or low intake of foods and nutrients and, thus, provides a solid basis for investigating associations between diet and fertility.
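The quintile cross-classification described in this abstract can be sketched as follows. Several details are assumptions made for illustration: quintiles are assigned by rank with ties broken by input order, "grossly misclassified" is read as falling in opposite extreme quintiles, and the intake values and function names are invented.

```python
def quintile_ranks(values):
    """Assign each value a quintile index 0..4 by rank (ties broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    quintiles = [0] * n
    for rank, index in enumerate(order):
        quintiles[index] = rank * 5 // n
    return quintiles

def cross_classification(ffq_intakes, diary_intakes):
    """Return (proportion in same or adjacent quintile, proportion grossly misclassified)."""
    q_ffq = quintile_ranks(ffq_intakes)
    q_diary = quintile_ranks(diary_intakes)
    n = len(ffq_intakes)
    same_or_adjacent = sum(abs(a - b) <= 1 for a, b in zip(q_ffq, q_diary)) / n
    gross = sum(abs(a - b) == 4 for a, b in zip(q_ffq, q_diary)) / n
    return same_or_adjacent, gross

# Hypothetical daily intakes (g/day) for ten participants, not study data.
ffq   = [120, 150, 90, 200, 175, 60, 130, 220, 80, 160]
diary = [110, 160, 95, 180, 190, 70, 100, 210, 85, 140]
print(cross_classification(ffq, diary))
```

A high same-or-adjacent proportion with few gross misclassifications is the pattern the abstract reports as evidence that the FFQ ranks participants adequately.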
Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J
2014-05-01
Recent guidelines advise sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). The validity of the WBB center of pressure path length and of the BESS scores was determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
Park, Myonghwa; Kyung Kim, Sun; Jeong, Miri; Lee, Song Ja; Kim, Seon Hwa; Kim, Jinha; Lee, Dong Young
2018-04-10
The prevalence of dementia has increased rapidly with an aging Korean population. Compared to those without dementia, individuals with dementia have more, and more complex, needs. In this study, the Korean version of the Camberwell Assessment of Need for the Elderly (CANE-K) was evaluated to determine its suitability for individuals with dementia in Korea. The CANE-K was developed following linguistic validation. The reliability of the measurement was examined with Cronbach's alpha coefficient. The factor structure and construct validity were evaluated by performing exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Pearson's correlation coefficients with related measures were used to assess concurrent validity. Four factors were extracted with EFA, and CFA validated the model structure (χ² = 367.25, p = .000, goodness of fit index = .84, adjusted goodness of fit index = .80, root mean square error of approximation = .07, and comparative fit index = .83). Items on the CANE-K loaded on the four factors in a range between .40 and .80. Pearson's correlation coefficients with cognitive impairment, behavioral problems, activities of daily living and caregiver burden showed acceptable concurrent validity. The CANE-K showed a reasonable degree of reliability and validity. Therefore, it has good potential to appropriately measure the needs and unmet needs of those with dementia. Copyright © 2018. Published by Elsevier B.V.
Betz, C; Mannsdörfer, K; Bischoff, S C
2013-10-01
Irritable bowel syndrome (IBS) is a functional gastrointestinal disorder characterised by abdominal pain, associated with stool abnormalities and changes in stool consistency. Diagnosis of IBS is based on characteristic symptoms and exclusion of other gastrointestinal diseases. A number of questionnaires exist to assist diagnosis and assessment of severity of the disease. One of these is the irritable bowel syndrome - severity scoring system (IBS-SSS). The IBS-SSS was validated in 1997 in its English version. In the present study, the IBS-SSS was validated in the German language. To do this, a cohort of 60 patients with IBS according to the Rome III criteria was compared with a control group of healthy individuals (n = 38). We studied sensitivity and reproducibility of the score, as well as the sensitivity to detect changes of symptom severity. The results of the German validation largely reflect the results of the English validation. The German version of the IBS-SSS is also a valid, meaningful and reproducible questionnaire with a high sensitivity to assess changes in symptom severity, especially in IBS patients with moderate symptoms. It is unclear if the IBS-SSS is also a valid questionnaire in IBS patients with severe symptoms because this group of patients was not studied. © Georg Thieme Verlag KG Stuttgart · New York.
Failure mode and effects analysis outputs: are they valid?
2012-01-01
Background Failure Mode and Effects Analysis (FMEA) is a prospective risk assessment tool that has been widely used within the aerospace and automotive industries and has been utilised within healthcare since the early 1990s. The aim of this study was to explore the validity of FMEA outputs within a hospital setting in the United Kingdom. Methods Two multidisciplinary teams each conducted an FMEA for the use of vancomycin and gentamicin. Four different validity tests were conducted: · Face validity: by comparing the FMEA participants’ mapped processes with observational work. · Content validity: by presenting the FMEA findings to other healthcare professionals. · Criterion validity: by comparing the FMEA findings with data reported on the trust’s incident report database. · Construct validity: by exploring the relevant mathematical theories involved in calculating the FMEA risk priority number. Results Face validity was positive as the researcher documented the same processes of care as mapped by the FMEA participants. However, other healthcare professionals identified potential failures missed by the FMEA teams. Furthermore, the FMEA groups failed to include failures related to omitted doses; yet these were the failures most commonly reported in the trust’s incident database. Calculating the RPN by multiplying severity, probability and detectability scores was deemed invalid because it is based on calculations that breach the mathematical properties of the scales used. Conclusion There are significant methodological challenges in validating FMEA. It is a useful tool to aid multidisciplinary groups in mapping and understanding a process of care; however, the results of our study cast doubt on its validity. FMEA teams are likely to need different sources of information, besides their personal experience and knowledge, to identify potential failures. 
As for FMEA’s methodology for scoring failures, there were discrepancies between the teams’ estimates and similar incidents reported on the trust’s incident database. Furthermore, the concept of multiplying ordinal scales to prioritise failures is mathematically flawed. Until FMEA’s validity is further explored, healthcare organisations should not solely depend on their FMEA results to prioritise patient safety issues. PMID:22682433
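The objection to multiplying ordinal scores can be illustrated with a small sketch. The failure modes, their scores, and the relabelling below are hypothetical and are not taken from the study; the point is only that a relabelling which preserves the order of the severity categories (and is therefore equally legitimate for an ordinal scale) can reverse the ranking produced by the risk priority number.

```python
def rpn(severity, probability, detectability):
    """Risk priority number: the product of three ordinal scores."""
    return severity * probability * detectability

# Two hypothetical failure modes as (severity, probability, detectability).
failure_a = (4, 3, 1)
failure_b = (2, 5, 1)

# A hypothetical relabelling of the severity scale. It is strictly
# increasing, so the order of the severity categories is unchanged.
relabel = {1: 1, 2: 4, 3: 5, 4: 6}

def rpn_relabelled(failure):
    severity, probability, detectability = failure
    return rpn(relabel[severity], probability, detectability)

# With the original labels, A outranks B (12 versus 10).
original_a_first = rpn(*failure_a) > rpn(*failure_b)

# With the relabelled severity scale, B outranks A (18 versus 20).
relabelled_a_first = rpn_relabelled(failure_a) > rpn_relabelled(failure_b)
```

Because only the order of ordinal categories carries meaning, a prioritisation that changes under such a relabelling does not rest on the information the scale actually provides.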
Green, Dido; Meroz, Anat; Margalit, Adi Edit; Ratzon, Navah Z
2012-11-01
This study examines a potential instrument for measurement of typing postures of children. This paper describes inter-rater, test-retest reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS), an observational measurement of postures and movements during keyboarding, for use with children. Two trained raters independently rated videos of 24 children (aged 7-10 years). Six children returned one week later to assess test-retest reliability. Concurrent validity was assessed by comparing ratings obtained using the K-PeCS to scores from a 3D motion analysis system. Inter-rater reliability was moderate to high for 12 out of 16 items (Kappa: 0.46 to 1.00; correlation coefficients: 0.77-0.95) and test-retest reliability varied across items (Kappa: 0.25 to 0.67; correlation coefficients: r = 0.20 to r = 0.95). Concurrent validity compared favourably across arm pathlength, wrist extension and ulnar deviation. In light of the limitations of other tools, the K-PeCS offers a fairly affordable, reliable and valid instrument to address the gap for measurement of typing styles of children, despite the shortcomings of some items. However, further research is required to refine the instrument for use in evaluating typing among children. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Developing and validating the Youth Conduct Problems Scale-Rwanda: a mixed methods approach.
Ng, Lauren C; Kanyanganzi, Frederick; Munyanah, Morris; Mushashi, Christine; Betancourt, Theresa S
2014-01-01
This study developed and validated the Youth Conduct Problems Scale-Rwanda (YCPS-R). Qualitative free listing (n = 74) and key informant interviews (n = 47) identified local conduct problems, which were compared to existing standardized conduct problem scales and used to develop the YCPS-R. The YCPS-R was cognitive tested by 12 youth and caregiver participants, and assessed for test-retest and inter-rater reliability in a sample of 64 youth. Finally, a purposive sample of 389 youth and their caregivers were enrolled in a validity study. Validity was assessed by comparing YCPS-R scores to conduct disorder, which was diagnosed with the Mini International Neuropsychiatric Interview for Children, and functional impairment scores on the World Health Organization Disability Assessment Schedule Child Version. ROC analyses assessed the YCPS-R's ability to discriminate between youth with and without conduct disorder. Qualitative data identified a local presentation of youth conduct problems that did not match previously standardized measures. Therefore, the YCPS-R was developed solely from local conduct problems. Cognitive testing indicated that the YCPS-R was understandable and required little modification. The YCPS-R demonstrated good reliability, construct, criterion, and discriminant validity, and fair classification accuracy. The YCPS-R is a locally-derived measure of Rwandan youth conduct problems that demonstrated good psychometric properties and could be used for further research.
Yi, Ming; Zhao, Yongmei; Jia, Li; He, Mei; Kebebew, Electron; Stephens, Robert M.
2014-01-01
To apply exome-seq-derived variants in the clinical setting, there is an urgent need to identify the best variant caller(s) from a large collection of available options. We have used an Illumina exome-seq dataset as a benchmark, with two validation scenarios—family pedigree information and SNP array data for the same samples, permitting global high-throughput cross-validation, to evaluate the quality of SNP calls derived from several popular variant discovery tools from both the open-source and commercial communities using a set of designated quality metrics. To the best of our knowledge, this is the first large-scale performance comparison of exome-seq variant discovery tools using high-throughput validation with both Mendelian inheritance checking and SNP array data, which allows us to gain insights into the accuracy of SNP calling through such high-throughput validation in an unprecedented way, whereas the previously reported comparison studies have only assessed concordance of these tools without directly assessing the quality of the derived SNPs. More importantly, the main purpose of our study was to establish a reusable procedure that applies high-throughput validation to compare the quality of SNP discovery tools with a focus on exome-seq, which can be used to compare any forthcoming tool(s) of interest. PMID:24831545
Validation of the 'Test of the Adherence to Inhalers' (TAI) for Asthma and COPD Patients.
Plaza, Vicente; Fernández-Rodríguez, Concepción; Melero, Carlos; Cosío, Borja G; Entrenas, Luís Manuel; de Llano, Luis Pérez; Gutiérrez-Pereyra, Fernando; Tarragona, Eduard; Palomino, Rosa; López-Viña, Antolín
2016-04-01
To validate the 'Test of Adherence to Inhalers' (TAI), a 12-item questionnaire designed to assess the adherence to inhalers in patients with COPD or asthma. A total of 1009 patients with asthma or COPD participated in a cross-sectional multicenter study. Patients with electronic adherence ≥80% were defined as adherents. Construct validity, internal validity, and criterion validity were evaluated. Self-reported adherence was compared with the Morisky-Green questionnaire. Factor analysis demonstrated two factors: factor 1 coincided with the TAI patient domain (items 1 to 10) and factor 2 with the TAI health-care professional domain (items 11 and 12). The Cronbach's alpha was 0.860 and the test-retest reliability 0.883. TAI scores correlated with electronic adherence (ρ=0.293, p=0.01). According to the best cut-off for 10 items (score 50, area under the ROC curve 0.7), 569 (62.5%) patients were classified as non-adherents. The non-adherence behavior pattern was: erratic 527 (57.9%), deliberate 375 (41.2%), and unwitting 242 (26.6%) patients. Compared with the Morisky-Green test, the TAI showed better psychometric properties. The TAI is a reliable and homogeneous questionnaire to easily identify non-adherence and to classify, from a clinical perspective, the barriers related to the use of inhalers in asthma and COPD.
Statistical considerations on prognostic models for glioma
Molinaro, Annette M.; Wrensch, Margaret R.; Jenkins, Robert B.; Eckel-Passow, Jeanette E.
2016-01-01
Given the lack of beneficial treatments in glioma, there is a need for prognostic models for therapeutic decision making and life planning. Recently several studies defining subtypes of glioma have been published. Here, we review the statistical considerations of how to build and validate prognostic models, explain the models presented in the current glioma literature, and discuss advantages and disadvantages of each model. The 3 statistical considerations to establishing clinically useful prognostic models are: study design, model building, and validation. Careful study design helps to ensure that the model is unbiased and generalizable to the population of interest. During model building, a discovery cohort of patients can be used to choose variables, construct models, and estimate prediction performance via internal validation. Via external validation, an independent dataset can assess how well the model performs. It is imperative that published models properly detail the study design and methods for both model building and validation. This provides readers the information necessary to assess the bias in a study, compare other published models, and determine the model's clinical usefulness. As editors, reviewers, and readers of the relevant literature, we should be cognizant of the needed statistical considerations and insist on their use. PMID:26657835
Sperandio, Naiara; Morais, Dayane de Castro; Priore, Silvia Eloiza
2018-02-01
The scope of this systematic review was to compare the food insecurity scales validated and used in the countries in Latin America and the Caribbean, and analyze the methods used in validation studies. A search was conducted in the Lilacs, SciELO and Medline electronic databases. The publications were pre-selected by titles and abstracts, and subsequently by a full reading. Of the 16,325 studies reviewed, 14 were selected. Twelve validated scales were identified for the following countries: Venezuela, Brazil, Colombia, Bolivia, Ecuador, Costa Rica, Mexico, Haiti, the Dominican Republic, Argentina and Guatemala. Besides these, there is the Latin American and Caribbean scale, the scope of which is regional. The scales varied in the standard reference used, the number of questions and the diagnosis of insecurity. The methods used by the studies for internal validation were calculation of Cronbach's alpha and the Rasch model; for external validation the authors calculated association and/or correlation with socioeconomic and food consumption variables. The successful experience of Latin America and the Caribbean in the development of national and regional scales can be an example for other countries that do not have this important indicator capable of measuring the phenomenon of food insecurity.
Predictive validity of the Braden Scale, Norton Scale, and Waterlow Scale in the Czech Republic.
Šateková, Lenka; Žiaková, Katarína; Zeleníková, Renáta
2017-02-01
The aim of this study was to determine the predictive validity of the Braden, Norton, and Waterlow scales in 2 long-term care departments in the Czech Republic. Assessing the risk for developing pressure ulcers is the first step in their prevention. At present, many scales are used in clinical practice, but most of them have not been properly validated yet (for example, the Modified Norton Scale in the Czech Republic). In the Czech Republic, only the Braden Scale has been validated so far. This is a prospective comparative instrument testing study. A random sample of 123 patients was recruited. The predictive validity of the pressure ulcer risk assessment scales was evaluated based on sensitivity, specificity, positive and negative predictive values, and the area under the receiver operating characteristic curve. The data were collected from April to August 2014. In the present study, the best predictive validity values were observed for the Norton Scale, followed by the Braden Scale and the Waterlow Scale, in that order. We recommended that the above 3 pressure ulcer risk assessment scales continue to be evaluated in the Czech clinical setting. © 2016 John Wiley & Sons Australia, Ltd.
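The four proportions named in the abstract above can be derived from a 2x2 table of scale result against observed outcome. The following is a minimal sketch with hypothetical counts (they are not the study's data); the function and variable names are illustrative.

```python
def predictive_validity(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 table.

    tp: scored at risk and developed a pressure ulcer
    fp: scored at risk and did not develop one
    fn: scored not at risk and developed one
    tn: scored not at risk and did not develop one
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts for one scale at one cut-off score.
metrics = predictive_validity(tp=40, fp=20, fn=10, tn=80)
```

Sensitivity and specificity describe the scale itself at a given cut-off, whereas the predictive values also depend on how common pressure ulcers are in the sample, which is one reason the same scale can perform differently across settings.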
[Validity and Reliability of Korean Version of the Spiritual Care Competence Scale].
Chung, Mi Ja; Park, Youngrye; Eun, Young
2016-12-01
The aim of this study was to examine the validity and reliability of the Korean Version of the Spiritual Care Competence Scale (K-SCCS). A cross-sectional study design was used. The K-SCCS consisted of 26 questions to measure spiritual care competence of nurses. Participants, 228 nurses who had more than 3 years' experience as a nurse, completed the survey. Confirmatory factor analysis was used to examine the construct validity, and correlations between the K-SCCS and spiritual well-being (SWB) were used to examine the criterion validity of the K-SCCS. Cronbach's alpha was used to test internal consistency. The construct and the criterion-related validity of K-SCCS were supported as measures of spiritual care competence. Cronbach's alpha was .95. Factor loadings of the 26 questions ranged from .60 to .96. Construct validity of K-SCCS was verified by confirmatory factor analysis (RMSEA=.08, CFI=.90, NFI=.85). Criterion validity compared to the SWB showed significant correlation (r=.44, p<.001). The findings suggest that K-SCCS serves as an appropriate measure of spiritual care competence with validity and reliability. However, further study is needed to retest the verification of the factor analysis related to factor 2 (professionalisation and improving the quality of spiritual care) and factor 3 (personal support and patient counseling). Therefore, we recommend using the total score without distinguishing subscales.
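Cronbach's alpha, the internal consistency statistic reported above, can be computed from item-level responses with the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The sketch below uses only the Python standard library; the response matrix is hypothetical and the function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents' item scores.

    scores: one row per respondent, one column per questionnaire item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose to per-item columns
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical responses: four respondents, three items that agree perfectly.
responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
alpha = cronbach_alpha(responses)
```

Perfectly consistent items give an alpha of 1; values fall as items share less of their variance with the total score.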
Moreno Berggren, Daniel; Folkvaljon, Yasin; Engvall, Marie; Sundberg, Johan; Lambe, Mats; Antunovic, Petar; Garelius, Hege; Lorenz, Fryderyk; Nilsson, Lars; Rasmussen, Bengt; Lehmann, Sören; Hellström-Lindberg, Eva; Jädersten, Martin; Ejerblad, Elisabeth
2018-06-01
The myelodysplastic syndromes (MDS) have highly variable outcomes and prognostic scoring systems are important tools for risk assessment and to guide therapeutic decisions. However, few population-based studies have compared the value of the different scoring systems. With data from the nationwide Swedish population-based MDS register we validated the International Prognostic Scoring System (IPSS), revised IPSS (IPSS-R) and the World Health Organization (WHO) Classification-based Prognostic Scoring System (WPSS). We also present population-based data on incidence, clinical characteristics including detailed cytogenetics and outcome from the register. The study encompassed 1329 patients reported to the register between 2009 and 2013, 14% of these had therapy-related MDS (t-MDS). Based on the MDS register, the yearly crude incidence of MDS in Sweden was 2·9 per 100 000 inhabitants. IPSS-R had a significantly better prognostic power than IPSS (P < 0·001). There was a trend for better prognostic power of IPSS-R compared to WPSS (P = 0·05) and for WPSS compared to IPSS (P = 0·07). IPSS-R was superior to both IPSS and WPSS for patients aged ≤70 years. Patients with t-MDS had a worse outcome compared to de novo MDS (d-MDS), however, the validity of the prognostic scoring systems was comparable for d-MDS and t-MDS. In conclusion, population-based studies are important to validate prognostic scores in a 'real-world' setting. In our nationwide cohort, the IPSS-R showed the best predictive power. © 2018 John Wiley & Sons Ltd.
Baum, C M; Wolf, T J; Wong, A W K; Chen, C H; Walker, K; Young, A C; Carlozzi, N E; Tulsky, D S; Heaton, R K; Heinemann, A W
2017-07-01
This study examined the relationships between the Executive Function Performance Test (EFPT), the NIH Toolbox Cognitive Function tests, and neuropsychological executive function measures in 182 persons with traumatic brain injury (TBI) and 46 controls to evaluate construct, discriminant, and predictive validity. Construct validity: There were moderate correlations between the EFPT and the NIH Toolbox Crystallized (r = -.479), Fluid Tests (r = -.420), and Total Composite Scores (r = -.496). Discriminant validity: Significant differences were found in the EFPT total and sequence scores across control, complicated mild/moderate, and severe TBI groups. We found differences in the organisation score between control and severe, and between mild and severe TBI groups. Both TBI groups had significantly lower scores in safety and judgement than controls. Compared to the controls, the severe TBI group demonstrated significantly lower performance on all instrumental activities of daily living (IADL) tasks. Compared to the mild TBI group, the controls performed better on the medication task, the severe TBI group performed worse in the cooking and telephone tasks. Predictive validity: The EFPT predicted the self-perception of independence measured by the TBI-QOL (beta = -0.49, p < .001) for the severe TBI group. Overall, these data support the validity of the EFPT for use in individuals with TBI.
Measuring the emotional climate of an organization.
Yurtsever, Gülçimen; De Rivera, Joseph
2010-04-01
The importance of emotional climate in the organizational climate literature has gained interest. However, few studies have concentrated on adequately measuring the emotional climate of organizations. In this study, a reliable and valid scale was developed to measure the most important aspects of emotional climate in different organizations. This study presents evidence of reliability and validity for 28 items constructed to measure emotional climate in an organization in four separate studies. The data were obtained from working people from four different organizations by self-administered questionnaires. The findings indicate that three factors--Trust, Hope, and Security--were factors of the 28-item scale. Validation data also included correlations with duration of employment. The other method of assessing criterion validity was by comparing mean scores in organizations with differing productivity; results indicated that the organization with more productive members had a significantly higher mean score on emotional climate and its subscales. The generalizability of the results to private businesses also was assessed.
Isometric hand grip strength measured by the Nintendo Wii Balance Board - a reliable new method.
Blomkvist, A W; Andersen, S; de Bruin, E D; Jorgensen, M G
2016-02-03
Low hand grip strength is a strong predictor for both long-term and short-term disability and mortality. The Nintendo Wii Balance Board (WBB) is an inexpensive, portable, wide-spread instrument with the potential for multiple purposes in assessing clinically relevant measures including muscle strength. The purpose of the study was to explore intrarater reliability and concurrent validity of the WBB by comparing it to the Jamar hand dynamometer. The study used an intra-rater test-retest cohort design with randomized validity testing in the first session. Using custom WBB software, thirty older adults (69.0 ± 4.2 years of age) were studied for reproducibility and concurrent validity compared to the Jamar hand dynamometer. Reproducibility was tested for dominant and non-dominant hands during the same time-of-day, one week apart. Intraclass correlation coefficient (ICC) and standard error of measurement (SEM) and limits of agreement (LOA) were calculated to describe relative and absolute reproducibility respectively. To describe concurrent validity, Pearson's product-moment correlation and ICC were calculated. Reproducibility was high with ICC values of >0.948 across all measures. Both SEM and LOA were low (0.2-0.5 kg and 2.7-4.2 kg, respectively) in both the dominant and non-dominant hand. For validity, Pearson correlations were high (0.80-0.88) and ICC values were fair to good (0.763-0.803). Reproducibility for WBB was high for relative measures and acceptable for absolute measures. In addition, concurrent validity between the Jamar hand dynamometer and the WBB was acceptable. Thus, the WBB may be a valid instrument to assess hand grip strength in older adults.
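The absolute reproducibility measures named above can be sketched as follows. The abstract does not state which formulas were used, so this assumes two common conventions: SEM = SD · √(1 − ICC), and Bland-Altman limits of agreement as the mean difference ± 1.96 standard deviations of the differences. All values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * sqrt(1 - icc)

def limits_of_agreement(session_1, session_2):
    """Bland-Altman 95% limits of agreement for paired measurements."""
    differences = [a - b for a, b in zip(session_1, session_2)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias - half_width, bias + half_width

# Hypothetical grip strength values (kg) from two sessions a week apart.
week_1 = [30.0, 32.0, 28.0, 35.0]
week_2 = [29.0, 33.0, 27.0, 36.0]

sem = sem_from_icc(sd=4.0, icc=0.96)
lower, upper = limits_of_agreement(week_1, week_2)
```

The ICC expresses reproducibility relative to between-person spread, while SEM and LOA are in the measurement's own units (here kg), which is why the abstract reports them separately as relative and absolute reproducibility.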
Summers, Rebekah L S; Chen, Mo; Kimberley, Teresa J
2017-01-01
Muscular targets that are deep or inaccessible to surface electromyography (sEMG) require intrinsic recording using fine-wire electromyography (fEMG). It is unknown if fEMG validly records cortically evoked muscle responses compared to sEMG. The purpose of this investigation was to establish the validity and agreement of fEMG compared to sEMG to quantify typical transcranial magnetic stimulation (TMS) measures pre and post repetitive TMS (rTMS). The hypothesis was that fEMG would demonstrate excellent validity and agreement compared with sEMG. In ten healthy volunteers, paired pulse and cortical silent period (CSP) TMS measures were collected before and after 1200 pulses of 1Hz rTMS to the motor cortex. Data were simultaneously recorded with sEMG and fEMG in the first dorsal interosseous. Concurrent validity (r and rho) and agreement (Tukey mean-difference) were calculated. fEMG quantified corticospinal excitability with good to excellent validity compared to sEMG data at both pretest (r = 0.77-0.97) and posttest (r = 0.83-0.92). Pairwise comparisons indicated no difference between sEMG and fEMG for all outcomes; however, Tukey mean-difference plots display increased variance and questionable agreement for paired pulse outcomes. CSP displayed the highest estimates of validity and agreement. Paired pulse MEP responses recorded with fEMG displayed reduced validity, agreement and less sensitivity to changes in MEP amplitude compared to sEMG. Change scores following rTMS were not significantly different between sEMG and fEMG. fEMG electrodes are a valid means to measure CSP and paired pulse MEP responses. CSP displays the highest validity estimates, while caution is warranted when assessing paired pulse responses with fEMG. Corticospinal excitability and neuromodulatory aftereffects from rTMS may be assessed using fEMG.
ERIC Educational Resources Information Center
Weeden, Marc; Ehrhardt, Kristal; Poling, Alan
2009-01-01
Both risperidone, an atypical antipsychotic drug, and function-based behavior-analytic interventions are popular and empirically validated treatments for reducing challenging behavior in children with autism. The kind of research that supports their effectiveness differs, however, and no published study has directly compared their effects or…
Evaluating the Evaluators: Comparative Study of High School Newspaper Critique Services.
ERIC Educational Resources Information Center
Davis, Nancy
High school publication staffs depend on national critique services as a major means of evaluation and recognition, but most have no measure of how one critique service compares to the others, because they can afford the entry fee for only one evaluation. Thus, a study was conducted to test the validity of three major national critique…
Validity and reliability of an occupational exposure questionnaire for parkinsonism in welders.
Hobson, Angela J; Sterling, David A; Emo, Brett; Evanoff, Bradley A; Sterling, Callen S; Good, Laura; Seixas, Noah; Checkoway, Harvey; Racette, Brad A
2009-06-01
This study assessed the validity and test-retest reliability of a medical and occupational history questionnaire for workers performing welding in the shipyard industry. This self-report questionnaire was developed for an epidemiologic study of the risk of parkinsonism in welders. Validity participants recruited from three similar shipyards were asked to give consent for access to personnel files and complete the questionnaire. Responses on the questionnaire were compared with information extracted from personnel records. Reliability participants were recruited from the same shipyards and were asked to complete the questionnaire at two different times approximately 4 weeks apart. Percent agreement, kappa, intraclass correlation coefficient (ICC), and sensitivity and specificity were used as measures of validity and/or reliability. Personnel files were obtained for 101 of 143 participants (70%) in the validity study, and 56 of the 95 (58.9%) participants in the reliability study completed the retest of the questionnaire. Validity scores for items extracted from personnel files were high. Percent agreement for employment dates and job titles ranged from 83-100%, while ICC for start and stop dates ranged from 0.93-0.99. Sensitivity and specificity for current job title ranged from 0.5-1.0. Reliability scores for demographic, medical and health behavior items were mainly moderate or high, but ranged from 0.19 to 1.0. Most recent job/title items such as title, types of welding performed, and material used showed substantial to perfect agreement. Certain determinants of exposure such as days and hours per week exposed to welding fumes demonstrated mainly moderate agreement (kappa= 0.42-0.47, percent agreement 63-77%); however, mean days and hours reported did not differ between test and retest. The results of this study suggest that participants' self-report for job title and dates employed are valid compared with employer records. 
While kappa scores were low for some medical conditions and for caffeine consumption, high kappa scores for job title, dates worked, types of welding, and materials welded suggest participants generated reproducible answers important for occupational exposure assessment.
Hoppe, Matthias W; Baumgart, Christian; Polglaze, Ted; Freiwald, Jürgen
2018-01-01
This study aimed to investigate the validity and reliability of global (GPS) and local (LPS) positioning systems for measuring distances covered and sprint mechanical properties in team sports. Here, we evaluated two recently released 18 Hz GPS and 20 Hz LPS technologies together with one established 10 Hz GPS technology. Six male athletes (age: 27±2 years; VO2max: 48.8±4.7 ml/min/kg) performed outdoors on 10 trials of a team sport-specific circuit that was equipped with double-light timing gates. The circuit included various walking, jogging, and sprinting sections that were performed either in straight-lines or with changes of direction. During the circuit, athletes wore two devices of each positioning system. From the reported and filtered velocity data, the distances covered and sprint mechanical properties (i.e., the theoretical maximal horizontal velocity, force, and power output) were computed. The sprint mechanical properties were modeled via an inverse dynamic approach applied to the center of mass. The validity was determined by comparing the measured and criterion data via the typical error of estimate (TEE), whereas the reliability was examined by comparing the two devices of each technology (i.e., the between-device reliability) via the coefficient of variation (CV). Outliers due to measurement errors were statistically identified and excluded from validity and reliability analyses. The 18 Hz GPS showed better validity and reliability for determining the distances covered (TEE: 1.6-8.0%; CV: 1.1-5.1%) and sprint mechanical properties (TEE: 4.5-14.3%; CV: 3.1-7.5%) than the 10 Hz GPS (TEE: 3.0-12.9%; CV: 2.5-13.0% and TEE: 4.1-23.1%; CV: 3.3-20.0%). However, the 20 Hz LPS demonstrated superior validity and reliability overall (TEE: 1.0-6.0%; CV: 0.7-5.0% and TEE: 2.1-9.2%; CV: 1.6-7.3%). For the 10 Hz GPS, 18 Hz GPS, and 20 Hz LPS, the relative loss of data sets due to measurement errors was 10.0%, 20.0%, and 15.8%, respectively. 
This study shows that 18 Hz GPS has enhanced validity and reliability for determining movement patterns in team sports compared to 10 Hz GPS, whereas 20 Hz LPS had superior validity and reliability overall. However, compared to 10 Hz GPS, 18 Hz GPS and 20 Hz LPS technologies had more outliers due to measurement errors, which limits their practical applications at this time.
Coster, Wendy J.; Haley, Stephen M.; Ni, Pengsheng; Dumas, Helene M.; Fragala-Pinkham, Maria A.
2009-01-01
Objective To examine score agreement, validity, precision, and response burden of a prototype computer adaptive testing (CAT) version of the Self-Care and Social Function scales of the Pediatric Evaluation of Disability Inventory (PEDI) compared to the full-length version of these scales. Design Computer simulation analysis of cross-sectional and longitudinal retrospective data; cross-sectional prospective study. Settings Pediatric rehabilitation hospital, including inpatient acute rehabilitation, day school program, outpatient clinics; community-based day care, preschool, and children’s homes. Participants Four hundred sixty-nine children with disabilities and 412 children with no disabilities (analytic sample); 38 children with disabilities and 35 children without disabilities (cross-validation sample). Interventions Not applicable. Main Outcome Measures Summary scores from prototype CAT applications of each scale using 15-, 10-, and 5-item stopping rules; scores from the full-length Self-Care and Social Function scales; time (in seconds) to complete assessments and respondent ratings of burden. Results Scores from both computer simulations and field administration of the prototype CATs were highly consistent with scores from full-length administration (all r’s between .94 and .99). Using computer simulation of retrospective data, discriminant validity and sensitivity to change of the CATs closely approximated that of the full-length scales, especially when the 15- and 10-item stopping rules were applied. In the cross-validation study the time to administer both CATs was 4 minutes, compared to over 16 minutes to complete the full-length scales. Conclusions Self-care and Social Function score estimates from CAT administration are highly comparable to those obtained from full-length scale administration, with small losses in validity and precision and substantial decreases in administration time. PMID:18373991
Virtual temporal bone dissection system: OSU virtual temporal bone system: development and testing.
Wiet, Gregory J; Stredney, Don; Kerwin, Thomas; Hittle, Bradley; Fernandez, Soledad A; Abdel-Rasoul, Mahmoud; Welling, D Bradley
2012-03-01
The objective of this project was to develop a virtual temporal bone dissection system that would provide an enhanced educational experience for the training of otologic surgeons. A randomized, controlled, multi-institutional, single-blinded validation study. The project encompassed four areas of emphasis: structural data acquisition, integration of the system, dissemination of the system, and validation. Structural acquisition was performed on multiple imaging platforms. Integration achieved a cost-effective system. Dissemination was achieved on different levels including casual interest, downloading of software, and full involvement in development and validation studies. A validation study was performed at eight different training institutions across the country using a two-arm randomized trial where study subjects were randomized to a 2-week practice session using either the virtual temporal bone or standard cadaveric temporal bones. Eighty subjects were enrolled and randomized to one of the two treatment arms; 65 completed the study. There was no difference between the two groups using a blinded rating tool to assess performance after training. A virtual temporal bone dissection system has been developed and compared to cadaveric temporal bones for practice using a multicenter trial. There was no statistical difference between practice on the current simulator compared to practice on human cadaveric temporal bones. Further refinements in structural acquisition and interface design have been identified, which can be implemented prior to full incorporation into training programs and used for objective skills assessment. Copyright © 2012 The American Laryngological, Rhinological, and Otological Society, Inc.
Toledano, Mireille B; Auvinen, Anssi; Tettamanti, Giorgio; Cao, Yang; Feychting, Maria; Ahlbom, Anders; Fremling, Karin; Heinävaara, Sirpa; Kojo, Katja; Knowles, Gemma; Smith, Rachel B; Schüz, Joachim; Johansen, Christoffer; Poulsen, Aslak Harbo; Deltour, Isabelle; Vermeulen, Roel; Kromhout, Hans; Elliott, Paul; Hillert, Lena
2018-01-01
This study investigates validity of self-reported mobile phone use in a subset of 75 993 adults from the COSMOS cohort study. Agreement between self-reported and operator-derived mobile call frequency and duration for a 3-month period was assessed using Cohen's weighted Kappa (κ). Sensitivity and specificity of both self-reported high (≥10 calls/day or ≥4h/week) and low (≤6 calls/week or <30min/week) mobile phone use were calculated, as compared to operator data. For users of one mobile phone, agreement was fair for call frequency (κ=0.35, 95% CI: 0.35, 0.36) and moderate for call duration (κ=0.50, 95% CI: 0.49, 0.50). Self-reported low call frequency and duration demonstrated high sensitivity (87% and 76% respectively), but for high call frequency and duration sensitivity was lower (38% and 56% respectively), reflecting a tendency for greater underestimation than overestimation. Validity of self-reported mobile phone use was lower in women, younger age groups and those reporting symptoms during/shortly after using a mobile phone. This study highlights the ongoing value of using self-report data to measure mobile phone use. Furthermore, compared to continuous scale estimates used by previous studies, categorical response options used in COSMOS appear to improve validity considerably, most likely by preventing unrealistically high estimates from being reported. Copyright © 2017 Elsevier GmbH. All rights reserved.
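Agreement between two ordered categorical ratings, such as self-reported and operator-derived call frequency bands, can be summarised with Cohen's weighted kappa. The abstract does not state the weighting scheme, so the sketch below assumes linear weights; the category codes and example data are hypothetical.

```python
from collections import Counter

def weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    Ratings are integer category codes 0 .. n_categories - 1.
    Undefined (division by zero) if both raters use a single category.
    """
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))   # joint counts
    margin_a = Counter(rater_a)                 # marginal counts, rater A
    margin_b = Counter(rater_b)                 # marginal counts, rater B
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j) / (n_categories - 1)
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (margin_a[i] / n) * (margin_b[j] / n)
    return 1.0 - observed_disagreement / expected_disagreement

# Hypothetical usage bands (0 = low, 1 = medium, 2 = high).
self_report = [0, 1, 2, 0, 1, 2]
operator = [0, 1, 2, 0, 1, 2]
kappa_perfect = weighted_kappa(self_report, operator, n_categories=3)

# Agreement no better than chance in a two-category example.
kappa_chance = weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], n_categories=2)
```

Weighting penalises a low-versus-high mismatch more than a mismatch between adjacent bands, which suits ordered categories better than unweighted kappa.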
Validity of the SenseWear armband step count measure during controlled and free-living conditions.
Lee, Joey Allen; Laurson, Kelly Rian
2015-06-01
Advances in technology continue to provide numerous options for physical activity assessment. These advances necessitate evaluation of the validity of newly developed activity monitors being used in clinical and research settings. The purpose of this study was to validate the SenseWear Pro3 Armband (SWA) step counts during treadmill walking and free-living conditions. Study 1 observed 39 individuals (17 males, 22 females) wearing an SWA and a Yamax Digiwalker SW-701 pedometer (DIGI) during treadmill walking, utilizing manually counted steps as the criterion. Study 2 compared free-living step count data from 35 participants (17 males, 18 females) wearing the SWA and DIGI (comparison) for 3 consecutive days. During Study 1, the SWA underestimated steps by 16.0%, 10.7%, 5.6%, 6.1%, and 6.5% at speeds of 54 m/min, 67 m/min, 80 m/min, 94 m/min, and 107 m/min, respectively, compared to manually counted steps. During Study 2, the intraclass correlation (ICC) coefficient of mean steps/d between the SWA and DIGI was strong (r = 0.98, p < 0.001). Unlike Study 1, the SWA overestimated step counts during the 3-day wear period by an average of 1028 steps/d (or +11.3%) compared to the DIGI. When analyzed individually, the SWA consistently overestimated step counts for each day (p < 0.05). The SWA underestimates steps during treadmill walking and appears to overestimate steps during free-living compared to the DIGI pedometer. Caution is warranted when using the SWA to count steps. Modifications are needed to enhance step counting accuracy.
Feldsine, Philip; Kaur, Mandeep; Shah, Khyati; Immerman, Amy; Jucker, Markus; Lienau, Andrew
2015-01-01
Assurance GDS™ for Salmonella Tq has been validated according to the AOAC INTERNATIONAL Methods Committee Guidelines for Validation of Microbiological Methods for Food and Environmental Surfaces for the detection of Salmonella in selected foods and on environmental surfaces (Official Method of Analysis℠ 2009.03, Performance Tested Method℠ No. 050602). The method also completed AFNOR validation (following the ISO 16140 standard) compared to the reference method EN ISO 6579. For AFNOR, GDS was given a scope covering all human food, animal feedstuffs, and environmental surfaces (Certificate No. TRA02/12-01/09). Results showed that Assurance GDS for Salmonella (GDS) has high sensitivity and is equivalent to the reference culture methods for the detection of motile and non-motile Salmonella. As part of the aforementioned validations, inclusivity and exclusivity studies, stability, and ruggedness studies were also conducted. Assurance GDS has 100% inclusivity and exclusivity among the 100 Salmonella serovars and 35 non-Salmonella organisms analyzed. To add to the scope of the Assurance GDS for Salmonella method, a matrix extension study was conducted, following the AOAC guidelines, to validate the application of the method for selected spices, specifically curry powder, cumin powder, and chili powder, for the detection of Salmonella.
Lin, Ju-Han; Nien, Chiao-Lin; Hsu, Ya-Wen; Liu, Hong-Yu
2016-01-01
Background Although the Perceived Stress Scale (PSS, Cohen, Kamarck & Mermelstein, 1983) has been validated and widely used in many domains, it has not yet been validated in sports by comparing athletes and non-athletes and examining related psychometric indices. Purpose The purpose of this study was to examine the measurement invariance of the PSS between athletes and non-athletes, and to examine its construct validity and reliability in sports contexts. Methods Study 1 sampled 359 college student-athletes (males = 233; females = 126) and 242 non-athletes (males = 124; females = 118) and examined factorial structure, measurement invariance and internal consistency. Study 2 sampled 196 student-athletes (males = 139, females = 57, Mage = 19.88 yrs, SD = 1.35) and examined discriminant validity and convergent validity of the PSS. Study 3 sampled 37 student-athletes to assess test-retest reliability of the PSS. Results The 2-factor PSS-10 model fitted the data best and had appropriate reliability. Measurement invariance held between athletes and non-athletes, and the PSS correlated positively with athletic burnout and life stress and negatively with coping efficacy, providing evidence of discriminant and convergent validity. Further, the test-retest reliability for the PSS subscales was significant (r = .66 and r = .50). Discussion It is suggested that the 2-factor PSS-10 can be a useful tool for assessing perceived stress in both sports and non-sports settings. We suggest that future studies use the 2-factor PSS-10 to examine the effects of stress on athletic injury, burnout, and psychiatric disorders. PMID:27994983
Leporace, Gustavo; Batista, Luiz Alberto; Serra Cruz, Raphael; Zeitoune, Gabriel; Cavalin, Gabriel Armondi; Metsavaht, Leonardo
2018-03-01
The purpose of this study was to test the validity of dynamic leg length discrepancy (DLLD) during gait as a radiation-free screening method for measuring anatomic leg length discrepancy (ALLD). Thirty-three subjects with mild leg length discrepancy walked along a walkway and the DLLD was calculated using a motion analysis system. Pearson correlation and paired Student t-tests were applied to calculate the correlation and compare the differences between DLLD and ALLD (α = 0.05). The results of our study showed DLLD is not a valid method to predict ALLD in subjects with mild limb discrepancy.
Validation of a dynamic linked segment model to calculate joint moments in lifting.
de Looze, M P; Kingma, I; Bussmann, J B; Toussaint, H M
1992-08-01
A two-dimensional dynamic linked segment model was constructed and applied to a lifting activity. Reactive forces and moments were calculated by an instantaneous approach involving the application of Newtonian mechanics to individual adjacent rigid segments in succession. The analysis started once at the feet and once at a hands/load segment. The model was validated by comparing predicted external forces and moments at the feet or at a hands/load segment to actual values, which were simultaneously measured (ground reaction force at the feet) or assumed to be zero (external moments at feet and hands/load and external forces, beside gravitation, at hands/load). In addition, results of both procedures, in terms of joint moments, including the moment at the intervertebral disc between the fifth lumbar and first sacral vertebra (L5-S1), were compared. A correlation of r = 0.88 between calculated and measured vertical ground reaction forces was found. The calculated external forces and moments at the hands showed only minor deviations from the expected zero level. The moments at L5-S1, calculated starting from feet compared to starting from hands/load, yielded a coefficient of correlation of r = 0.99. However, moments calculated from hands/load were 3.6% (averaged values) and 10.9% (peak values) higher. This difference is assumed to be due mainly to erroneous estimations of the positions of centres of gravity and joint rotation centres. The estimation of the location of L5-S1 rotation axis can affect the results significantly. Despite the numerous studies estimating the load on the low back during lifting on the basis of linked segment models, only a few attempts to validate these models have been made. This study is concerned with the validity of the presented linked segment model. The results support the model's validity. Effects of several sources of error threatening the validity are discussed. Copyright © 1992. Published by Elsevier Ltd.
Macedo-Ojeda, Gabriela; Márquez-Sandoval, Fabiola; Fernández-Ballart, Joan; Vizmanos, Barbara
2016-01-01
The study of diet quality in a population provides information for the development of programs to improve nutritional status through better directed actions. The aim of this study was to assess the reproducibility and relative validity of a Mexican Diet Quality Index (ICDMx) for the assessment of the habitual diet of adults. The ICDMx was designed to assess the characteristics of a healthy diet using a validated semi-quantitative food frequency questionnaire (FFQ-Mx). Reproducibility was determined by comparing 2 ICDMx based on FFQs (one-year interval). Relative validity was assessed by comparing the ICDMx (2nd FFQ) with that estimated based on the intake averages from dietary records (nine days). The questionnaires were answered by 97 adults (mean age in years = 27.5, SD = 12.6). Pearson (r) and intraclass correlations (ICC) were calculated; Bland-Altman plots, Cohen’s κ coefficients and blood lipid determinations complemented the analysis. Additional analysis compared ICDMx scores with nutrients derived from dietary records, using a Pearson correlation. These nutrient intakes were transformed logarithmically to improve normality (log10) and adjusted according to energy, prior to analyses. The ICDMx obtained ICC reproducibility values ranged from 0.33 to 0.87 (23/24 items with significant correlations; mean = 0.63), while relative validity ranged from 0.26 to 0.79 (mean = 0.45). Bland-Altman plots showed a high level of agreement between methods. ICDMx scores were inversely correlated (p < 0.05) with total blood cholesterol (r = −0.33) and triglycerides (r = −0.22). ICDMx (as calculated from FFQs and DRs) obtained positive correlations with fiber, magnesium, potassium, retinol, thiamin, riboflavin, pyridoxine, and folate. The ICDMx obtained acceptable levels of reproducibility and relative validity in this population. It can be useful for population nutritional surveillance and to assess the changes resulting from the implementation of nutritional interventions. 
PMID:27563921
Innovative use of self-organising maps (SOMs) in model validation.
NASA Astrophysics Data System (ADS)
Jolly, Ben; McDonald, Adrian; Coggins, Jack
2016-04-01
We present an innovative combination of techniques for validation of numerical weather prediction (NWP) output against both observations and reanalyses using two classification schemes, demonstrated by a validation of the operational NWP 'AMPS' (the Antarctic Mesoscale Prediction System). Historically, model validation techniques have centred on case studies or statistics at various time scales (yearly/seasonal/monthly). Within the past decade the latter technique has been expanded by the addition of classification schemes in place of time scales, allowing more precise analysis. Classifications are typically generated for either the model or the observations, then used to create composites for both which are compared. Our method creates and trains a single self-organising map (SOM) on both the model output and observations, which is then used to classify both datasets using the same class definitions. In addition to the standard statistics on class composites, we compare the classifications themselves between the model and the observations. To add further context to the area studied, we use the same techniques to compare the SOM classifications with regimes developed for another study to great effect. The AMPS validation study compares model output against surface observations from SNOWWEB and existing University of Wisconsin-Madison Antarctic Automatic Weather Stations (AWS) during two months over the austral summer of 2014-15. Twelve SOM classes were defined in a '4 x 3' pattern, trained on both model output and observations of 2 m wind components, then used to classify both training datasets. Simple statistics (correlation, bias and normalised root-mean-square-difference) computed for SOM class composites showed that AMPS performed well during extreme weather events, but less well during lighter winds and poorly during the more changeable conditions between either extreme. 
Comparison of the classification time-series showed that, while correlations were lower during lighter wind periods, AMPS actually forecast the existence of those periods well suggesting that the correlations may be unfairly low. Further investigation showed poor temporal alignment during more changeable conditions, highlighting problems AMPS has around the exact timing of events. There was also a tendency for AMPS to over-predict certain wind flow patterns at the expense of others. In order to gain a larger scale perspective, we compared our mesoscale SOM classification time-series with synoptic scale regimes developed by another study using ERA-Interim reanalysis output and k-means clustering. There was good alignment between the regimes and the observations classifications (observations/regimes), highlighting the effect of synoptic scale forcing on the area. However, comparing the alignment between observations/regimes and AMPS/regimes showed that AMPS may have problems accurately resolving the strength and location of cyclones in the Ross Sea to the north of the target area.
Renteria, Laura; Li, Susan Tinsley; Pliskin, Neil H
2008-05-01
The utility of the Spanish WAIS-III was investigated by examining its reliability and validity among 100 Spanish-speaking participants. Results indicated that the internal consistency of the subtests was satisfactory, except for Letter Number Sequencing, where it was inadequate. Criterion validity was adequate. Convergent and discriminant validity results were generally similar to the North American normative sample. Paired sample t-tests suggested that the WAIS-III may underestimate ability when compared to the criterion measures that were utilized to assess validity. This study provides support for the use of the Spanish WAIS-III in urban Hispanic populations, but also suggests that caution be used when administering specific subtests, due to the nature of the Latin American alphabet and potential test bias.
Zarit, Steven H.; Liu, Yin; Bangerter, Lauren R.; Rovine, Michael J.
2017-01-01
Objectives There is growing emphasis on empirical validation of the efficacy of community-based services for older people and their families, but research on services such as respite care faces methodological challenges that have limited the growth of outcome studies. We identify problems associated with the usual research approaches for studying respite care, with the goal of stimulating use of novel and more appropriate research designs that can lead to improved studies of community-based services. Method Using the concept of research validity, we evaluate the methodological approaches in the current literature on respite services, including adult day services, in-home respite and overnight respite. Results Although randomized control trials (RCTs) are possible in community settings, validity is compromised by practical limitations of randomization and other problems. Quasi-experimental and interrupted time series designs offer comparable validity to RCTs and can be implemented effectively in community settings. Conclusion An emphasis on RCTs by funders and researchers is not supported by scientific evidence. Alternative designs can lead to development of a valid body of research on community services such as respite. PMID:26729467
Kubiak, Sheryl Pimlott; Beeble, Marisa; Bybee, Deborah
2012-12-01
A lack of a consistent and valid approach to screening within the jail often hinders identification and treatment. Furthermore, screening instruments developed for jail populations are often inadequate in detecting serious depression and anxiety disorders in women. While the remedy thus far has been the use of separate screening instruments for men and women, others have suggested that the K6, a six-item measure validated in large epidemiologic studies, may hold promise. Building on prior research, this study assesses the validity of the K6 in detecting depression, posttraumatic stress disorder, and anxiety disorders among 494 male and 515 female jail detainees. The authors found that 15% of males and 36% of females meet criteria for serious mental illness on the K6, with receiver operating characteristics--area under the curve scores of .84 and .93, respectively. This study not only establishes the validity and efficiency of using the K6 for screening within jails but also suggests a need for adjusting scale cut points.
Zarit, Steven H; Bangerter, Lauren R; Liu, Yin; Rovine, Michael J
2017-03-01
There is growing emphasis on empirical validation of the efficacy of community-based services for older people and their families, but research on services such as respite care faces methodological challenges that have limited the growth of outcome studies. We identify problems associated with the usual research approaches for studying respite care, with the goal of stimulating use of novel and more appropriate research designs that can lead to improved studies of community-based services. Using the concept of research validity, we evaluate the methodological approaches in the current literature on respite services, including adult day services, in-home respite and overnight respite. Although randomized control trials (RCTs) are possible in community settings, validity is compromised by practical limitations of randomization and other problems. Quasi-experimental and interrupted time series designs offer comparable validity to RCTs and can be implemented effectively in community settings. An emphasis on RCTs by funders and researchers is not supported by scientific evidence. Alternative designs can lead to development of a valid body of research on community services such as respite.
Rönspies, Jelena; Schmidt, Alexander F; Melnikova, Anna; Krumova, Rosina; Zolfagari, Asadeh; Banse, Rainer
2015-07-01
The present study was conducted to validate an adaptation of the Implicit Relational Assessment Procedure (IRAP) as an indirect latency-based measure of sexual orientation. Furthermore, reliability and criterion validity of the IRAP were compared to two established indirect measures of sexual orientation: a Choice Reaction Time task (CRT) and a Viewing Time (VT) task. A sample of 87 heterosexual and 35 gay men completed all three indirect measures in an online study. The IRAP and the VT predicted sexual orientation nearly perfectly. Both measures also showed a considerable amount of convergent validity. Reliabilities (internal consistencies) reached satisfactory levels. In contrast, the CRT did not tap into sexual orientation in the present study. In sum, the VT measure performed best, with the IRAP showing only slightly lower reliability and criterion validity, whereas the CRT did not yield any evidence of reliability or criterion validity in the present research. The results were discussed in the light of specific task properties of the indirect latency-based measures (task-relevance vs. task-irrelevance).
Validation of the Behavioral Risk Factor Surveillance System Sleep Questions
Jungquist, Carla R.; Mund, Jaime; Aquilina, Alan T.; Klingman, Karen; Pender, John; Ochs-Balcom, Heather; van Wijngaarden, Edwin; Dickerson, Suzanne S.
2016-01-01
Study Objective: Sleep problems may constitute a risk for health problems, including cardiovascular disease, depression, diabetes, poor work performance, and motor vehicle accidents. The primary purpose of this study was to assess the validity of the current Behavioral Risk Factor Surveillance System (BRFSS) sleep questions by establishing the sensitivity and specificity for detection of sleep/wake disturbance. Methods: Repeated cross-sectional assessment of 300 community dwelling adults over the age of 18 who did not wear CPAP or oxygen during sleep. Reliability and validity testing of the BRFSS sleep questions was performed by comparing BRFSS responses to data from a home sleep study, actigraphy for 14 days, Insomnia Severity Index, Epworth Sleepiness Scale, and PROMIS-57. Results: Only two of the five BRFSS sleep questions were found valid and reliable in determining total sleep time and excessive daytime sleepiness. Conclusions: Refinement of the BRFSS questions is recommended. Citation: Jungquist CR, Mund J, Aquilina AT, Klingman K, Pender J, Ochs-Balcom H, van Wijngaarden E, Dickerson SS. Validation of the behavioral risk factor surveillance system sleep questions. J Clin Sleep Med 2016;12(3):301–310. PMID:26446246
Landscape scale estimation of soil carbon stock using 3D modelling.
Veronesi, F; Corstanje, R; Mayr, T
2014-07-15
Soil C is the largest pool of carbon in the terrestrial biosphere, and yet the processes of C accumulation, transformation and loss are poorly accounted for. This, in part, is due to the fact that soil C is not uniformly distributed through the soil depth profile and most current landscape level predictions of C do not adequately account for the vertical distribution of soil C. In this study, we apply a method based on simple soil specific depth functions to map the soil C stock in three-dimensions at landscape scale. We used soil C and bulk density data from the Soil Survey for England and Wales to map an area in the West Midlands region of approximately 13,948 km². We applied a method which describes the variation through the soil profile and interpolates this across the landscape using well established soil drivers such as relief, land cover and geology. The results indicate that this mapping method can effectively reproduce the observed variation in the soil profile samples. The mapping results were validated using cross validation and an independent validation. The cross-validation resulted in an R² of 36% for soil C and 44% for BULKD. These results are generally in line with previous validated studies. In addition, an independent validation was undertaken, comparing the predictions against the National Soil Inventory (NSI) dataset. The majority of the residuals of this validation are between ± 5% of soil C. This indicates a high level of accuracy in replicating topsoil values. In addition, the results were compared to a previous study estimating the carbon stock of the UK. We discuss the implications of our results within the context of soil C loss factors such as erosion and the impact on regional C process models. Copyright © 2014 Elsevier B.V. All rights reserved.
Chirico, Nicola; Gramatica, Paola
2011-09-26
The main utility of QSAR models is their ability to predict activities/properties for new chemicals, and this external prediction ability is evaluated by means of various validation criteria. As a measure for such evaluation the OECD guidelines have proposed the predictive squared correlation coefficient Q²F1 (Shi et al.). However, other validation criteria have been proposed by other authors: the Golbraikh-Tropsha method, r²m (Roy), Q²F2 (Schüürmann et al.), Q²F3 (Consonni et al.). In QSAR studies these measures are usually in accordance, though this is not always the case, thus doubts can arise when contradictory results are obtained. It is likely that none of the aforementioned criteria is the best in every situation, so a comparative study using simulated data sets is proposed here, using threshold values suggested by the proponents or those widely used in QSAR modeling. In addition, a different and simple external validation measure, the concordance correlation coefficient (CCC), is proposed and compared with other criteria. Huge data sets were used to study the general behavior of validation measures, and the concordance correlation coefficient was shown to be the most restrictive. On using simulated data sets of a more realistic size, it was found that CCC was broadly in agreement, about 96% of the time, with other validation measures in accepting models as predictive, and in almost all the examples it was the most precautionary. The proposed concordance correlation coefficient also works well on real data sets, where it seems to be more stable, and helps in making decisions when the validation measures are in conflict. Since it is conceptually simple, and given its stability and restrictiveness, we propose the concordance correlation coefficient as a complementary, or alternative, more prudent measure of a QSAR model to be externally predictive.
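To make the concordance correlation coefficient named in the abstract above concrete, here is a minimal sketch of Lin's standard definition, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (1/n) moments. The function name and the toy values are hypothetical. The abstract gives no formula, so the authors' exact computation may differ in detail.

```python
def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient with 1/n moments."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    denom = var_o + var_p + (mean_o - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for identical constant inputs")
    return 2 * cov / denom


# Toy values, invented for illustration only.
experimental = [1.0, 2.0, 3.0]
print(concordance_correlation(experimental, [1.0, 2.0, 3.0]))  # perfect agreement
print(concordance_correlation(experimental, [2.0, 3.0, 4.0]))  # constant offset of +1
```

The second call shows why the measure is restrictive. Predictions that are perfectly correlated with the observations but shifted by a constant offset still lose concordance, because the squared difference of the means enters the denominator.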
Validity and reliability of the iPhone to measure rib hump in scoliosis.
Balg, Frederic; Juteau, Mathieu; Theoret, Chantal; Svotelis, Amy; Grenier, Guillaume
2014-12-01
This was a prospective blinded validity and reliability analysis. The aim of this study was validation and reliability evaluation of the Scoligauge iPhone app. The scoliometer is used to clinically measure the rib hump in scoliosis as a means to evaluate the axial trunk rotation. The increasing availability of smartphones with built-in accelerometers led to the development of a vast number of applications to measure angles. Of these, the Scoligauge mimics a scoliometer. The aim of this study was to compare the validity of the Scoligauge iPhone application without an associated adapter with the traditional scoliometer and to test the reliability of the application in a clinical setting. Two observers measured the rib hump deformity on 34 consecutive patients with idiopathic scoliosis with an average Cobb angle of 24.2 ± 13.5 degrees (range, 4 to 65 degrees). Measurements were made with an iPhone without the adapter and with a scoliometer. The validity as well as the interobserver and intraobserver reliability were calculated using the intraclass correlation coefficient (ICC) and the Bland-Altman test. The mean difference between the scoliometer and the Scoligauge application was 0.4 degrees [95% confidence interval (CI) of ± 3.1 degrees] with an ICC of 0.947 (P < 0.001). The intraobserver and interobserver ICC were 0.961 (P < 0.001) and 0.901 (P < 0.001), respectively. The mean intraobserver difference was 0.0 degrees (95% CI of ± 2.7 degrees) and the mean interobserver difference was 0.1 degrees (95% CI of ± 4.4 degrees). The intraobserver and interobserver reliability of the Scoligauge iPhone app, as well as its validity compared with the scoliometer, are excellent. The mean differences between measurements are small and clinically not significant. Thus, the Scoligauge application is valid for clinical evaluation even without a special adapter. Level I (Diagnostic Study).
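The Bland-Altman comparison mentioned in the abstract above can be sketched as follows. The sketch assumes the conventional form of the method (mean paired difference with limits of agreement at ± 1.96 standard deviations); the abstract does not state how its ± intervals were derived. The function name and the angle values are hypothetical.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (mean difference, lower limit, upper limit) of agreement.

    Limits are mean_diff +/- 1.96 * SD of the paired differences,
    using the sample (n - 1) standard deviation.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return mean_diff, mean_diff - half_width, mean_diff + half_width


# Toy rib hump angles in degrees, invented for illustration only.
scoliometer = [5, 7, 10, 12, 8]
phone_app = [6, 7, 9, 13, 8]
bias, lower, upper = bland_altman(scoliometer, phone_app)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits indicates that the two instruments can be used interchangeably, within whatever tolerance is considered clinically acceptable.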
Clinical Validation of Heart Rate Apps: Mixed-Methods Evaluation Study.
Vandenberk, Thijs; Stans, Jelle; Mortelmans, Christophe; Van Haelst, Ruth; Van Schelvergem, Gertjan; Pelckmans, Caroline; Smeets, Christophe Jp; Lanssens, Dorien; De Cannière, Hélène; Storms, Valerie; Thijs, Inge M; Vaes, Bert; Vandervoort, Pieter M
2017-08-25
Photoplethysmography (PPG) is a proven way to measure heart rate (HR). This technology is already available in smartphones, which allows measuring HR only by using the smartphone. Given the widespread availability of smartphones, this creates a scalable way to enable mobile HR monitoring. An essential precondition is that these technologies are as reliable and accurate as the current clinical (gold) standards. At this moment, there is no consensus on a gold standard method for the validation of HR apps. This results in different validation processes that do not always reflect the veracious outcome of comparison. The aim of this paper was to investigate and describe the necessary elements in validating and comparing HR apps versus standard technology. The FibriCheck (Qompium) app was used in two separate prospective nonrandomized studies. In the first study, the HR of the FibriCheck app was consecutively compared with 2 different Food and Drug Administration (FDA)-cleared HR devices: the Nonin oximeter and the AliveCor Mobile ECG. In the second study, a next step in validation was performed by comparing the beat-to-beat intervals of the FibriCheck app to a synchronized ECG recording. In the first study, the HR (BPM, beats per minute) of 88 random subjects consecutively measured with the 3 devices showed a correlation coefficient of .834 between FibriCheck and Nonin, .88 between FibriCheck and AliveCor, and .897 between Nonin and AliveCor. A one-way analysis of variance (ANOVA; P=.61) was executed to test the hypothesis that there were no significant differences between the HRs as measured by the 3 devices. In the second study, 20,298 (ms) R-R intervals (RRI)-peak-to-peak intervals (PPI) from 229 subjects were analyzed. This resulted in a positive correlation (rs=.993, root mean square deviation [RMSE]=23.04 ms, and normalized root mean square error [NRMSE]=0.012) between the PPI from FibriCheck and the RRI from the wearable ECG. 
There was no significant difference (P=.92) between these intervals. Our findings suggest that the most suitable method for the validation of an HR app is a simultaneous measurement of the HR by the smartphone app and an ECG system, compared on the basis of beat-to-beat analysis. This approach could lead to more correct assessments of the accuracy of HR apps. ©Thijs Vandenberk, Jelle Stans, Gertjan Van Schelvergem, Caroline Pelckmans, Christophe JP Smeets, Dorien Lanssens, Hélène De Cannière, Valerie Storms, Inge M Thijs, Pieter M Vandervoort. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 25.08.2017.
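The beat-to-beat error measures reported in the abstract above (RMSE and NRMSE) can be sketched as follows. The abstract does not say how the NRMSE was normalized; this sketch assumes division by the range of the reference intervals, which is one common convention. The function names and interval values are hypothetical.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between paired interval series (same units)."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("need two non-empty, equal-length sequences")
    squared = [(e - r) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse(reference, estimate):
    """RMSE divided by the range of the reference (assumed normalization)."""
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference has zero range")
    return rmse(reference, estimate) / span


# Toy intervals in milliseconds, invented for illustration only.
ecg_rri = [800, 810, 790, 820]
app_ppi = [805, 805, 795, 815]
print(f"RMSE={rmse(ecg_rri, app_ppi):.2f} ms, NRMSE={nrmse(ecg_rri, app_ppi):.3f}")
```

Both measures require the two series to be aligned beat by beat, which is why the abstract emphasizes simultaneous, synchronized recording.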
Roles of Naturalistic Observation in Comparative Psychology
ERIC Educational Resources Information Center
Miller, David B.
1977-01-01
"Five roles are considered by which systematic, quantified field research can augment controlled laboratory experimentation in terms of increasing the validity of laboratory studies." Advocates that comparative psychologists should "take more initiative in designing, executing, and interpreting our experiments with regard to the natural history of…
Schiffman, Eric L; Truelove, Edmond L; Ohrbach, Richard; Anderson, Gary C; John, Mike T; List, Thomas; Look, John O
2010-01-01
The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. The aim of this article is to provide an overview of the project's methodology, descriptive statistics, and data for the study participant sample. This article also details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. The Axis I reference standards were based on the consensus of two criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion examination reliability was also assessed within study sites. Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion examiner agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods.
Disc displacement without reduction: a retrospective study of a clinical diagnostic sign.
Giraudeau, Anne; Jeany, Marion; Ehrmann, Elodie; Déjou, Jacques; Ouni, Imed; Orthlieb, Jean-Daniel
2017-03-01
The purpose of this retrospective study is to evaluate a clinical diagnostic sign for disc displacement without reduction (DDWR), the absence of additional condylar translation during opening compared with protrusion. Thirty-eight electronic axiographic and magnetic resonance imaging (MRI) examinations of the TMJ were analyzed in order to compare the opening/protrusion ratio of condylar translation between non-painful DDWR and non-DDWR. According to the Mann-Whitney U test, the opening/protrusion ratio in non-painful DDWR differs significantly from non-DDWR (p < 0.0001). Among non-painful DDWR, there is no additional condylar translation during opening in comparison with protrusion, and this is probably also the case for DDWR without limited opening, which is a subtype that has not been validated by the Diagnostic Criteria for Temporomandibular Disorders (DC/TMD). Comparative condylar palpation can analyze this sign, and therefore, further comparative investigations between MRI and clinical examination are needed to validate the corresponding clinical test.
Display format, highlight validity, and highlight method: Their effects on search performance
NASA Technical Reports Server (NTRS)
Donner, Kimberly A.; Mckay, Tim D.; Obrien, Kevin M.; Rudisill, Marianne
1991-01-01
Display format and highlight validity were shown to affect visual display search performance; however, these studies were conducted on small, artificial displays of alphanumeric stimuli. A study manipulating these variables was conducted using realistic, complex Space Shuttle information displays. A 2x2x3 within-subjects analysis of variance found that search times were faster for items in reformatted displays than for current displays. Responses to valid applications of highlight were significantly faster than responses to non or invalidly highlighted applications. The significant format by highlight validity interaction showed that there was little difference in response time to both current and reformatted displays when the highlight validity was applied; however, under the non or invalid highlight conditions, search times were faster with reformatted displays. A separate within-subject analysis of variance of display format, highlight validity, and several highlight methods did not reveal a main effect of highlight method. In addition, observed display search times were compared to search time predicted by Tullis' Display Analysis Program. Benefits of highlighting and reformatting displays to enhance search and the necessity to consider highlight validity and format characteristics in tandem for predicting search performance are discussed.
Boerboom, T B B; Dolmans, D H J M; Jaarsma, A D C; Muijtjens, A M M; Van Beukelen, P; Scherpbier, A J J A
2011-01-01
Feedback to aid teachers in improving their teaching requires validated evaluation instruments. When implementing an evaluation instrument in a different context, it is important to collect validity evidence from multiple sources. We examined the validity and reliability of the Maastricht Clinical Teaching Questionnaire (MCTQ) as an instrument to evaluate individual clinical teachers during short clinical rotations in veterinary education. We examined four sources of validity evidence: (1) Content was examined based on theory of effective learning. (2) Response process was explored in a pilot study. (3) Internal structure was assessed by confirmatory factor analysis using 1086 student evaluations and reliability was examined utilizing generalizability analysis. (4) Relations with other relevant variables were examined by comparing factor scores with other outcomes. Content validity was supported by theory underlying the cognitive apprenticeship model on which the instrument is based. The pilot study resulted in an additional question about supervision time. A five-factor model showed a good fit with the data. Acceptable reliability was achievable with 10-12 questionnaires per teacher. Correlations between the factors and overall teacher judgement were strong. The MCTQ appears to be a valid and reliable instrument to evaluate clinical teachers' performance during short rotations.
ERIC Educational Resources Information Center
Surapiboonchai, Kampol
2010-01-01
There is a lack of valid and reliable low cost observational instruments to measure moderate to vigorous physical activity (MVPA) in school physical education (PE). The participants in this study were third to tenth grade boys and girls from a south Texas school district. The SAM (Simple Activity Measurement) activity levels were compared with…
ERIC Educational Resources Information Center
Krach, S. Kathleen; Loe, Scott A.; Jones, W. Paul; Farrally, Autumn
2009-01-01
Validity studies with the Reynolds Intellectual Ability scales (RIAS) indicated that RIAS composite intelligence index (CIX) and verbal intelligence index (VIX) scores have moderate-to-high correlation with comparable scores on other instruments. The authors of the RIAS described the VIX scale as a measure of crystallized ability and the nonverbal…
ERIC Educational Resources Information Center
O'Donnell, Shannon; Tavares, Francisco; McMaster, Daniel; Chambers, Samuel; Driller, Matthew
2018-01-01
The current study aimed to assess the validity and test-retest reliability of a linear position transducer when compared to a force plate through a counter-movement jump in female participants. Twenty-seven female recreational athletes (19 ± 2 years) performed three counter-movement jumps simultaneously using the linear position transducer and…
Quasiglobal reaction model for ethylene combustion
NASA Technical Reports Server (NTRS)
Singh, D. J.; Jachimowski, Casimir J.
1994-01-01
The objective of this study is to develop a reduced mechanism for ethylene oxidation. The authors are interested in a model with a minimum number of species and reactions that still models the chemistry with reasonable accuracy for the expected combustor conditions. The model will be validated by comparing the results to those calculated with a detailed kinetic model that has been validated against the experimental data.
ERIC Educational Resources Information Center
Johnston, Claire S.; Broonen, Jean-Paul; Stauffer, Sarah D.; Hamtiaux, Armanda; Pouyaud, Jacques; Zecca, Gregory; Houssemand, Claude; Rossier, Jerome
2013-01-01
This study presents the validation of a French version of the Career Adapt-Abilities Scale in four Francophone countries. The aim was to re-analyze the item selection and then compare this newly developed French-language form with the international form 2.0. Exploratory factor analysis was used as a tool for item selection, and confirmatory factor…
Vable, Anusha M; Gilsanz, Paola; Nguyen, Thu T; Kawachi, Ichiro; Glymour, M Maria
2017-01-01
Childhood socioeconomic status (cSES) is a powerful predictor of adult health, but its operationalization and measurement varies across studies. Using Health and Retirement Study data (HRS, which is nationally representative of community-residing United States adults aged 50+ years), we specified theoretically-motivated cSES measures, evaluated their reliability and validity, and compared their performance to other cSES indices. HRS respondent data (N = 31,169, interviewed 1992-2010) were used to construct a cSES index reflecting childhood social capital (cSC), childhood financial capital (cFC), and childhood human capital (cHC), using retrospective reports from when the respondent was <16 years (at least 34 years prior). We assessed internal consistency reliability (Cronbach's alpha) for the scales (cSC and cFC), and construct validity, and predictive validity for all measures. Validity was assessed with hypothesized correlates of cSES (educational attainment, measured adult height, self-reported childhood health, childhood learning problems, childhood drug and alcohol problems). We then compared the performance of our validated measures with other indices used in HRS in predicting self-rated health and number of depressive symptoms, measured in 2010. Internal consistency reliability was acceptable (cSC = 0.63, cFC = 0.61). Most measures were associated with hypothesized correlates (for example, the association between educational attainment and cSC was 0.01, p < 0.0001), with the exception that measured height was not associated with cFC (p = 0.19) and childhood drug and alcohol problems (p = 0.41), and childhood learning problems (p = 0.12) were not associated with cHC. Our measures explained slightly more variability in self-rated health (adjusted R2 = 0.07 vs. <0.06) and number of depressive symptoms (adjusted R2 > 0.05 vs. < 0.04) than alternative indices. 
Our cSES measures use latent variable models to handle item-missingness, thereby increasing the sample size available for analysis compared to complete case approaches (N = 15,345 vs. 8,248). Adopting this type of theoretically motivated operationalization of cSES may strengthen the quality of research on the effects of cSES on health outcomes.
42 CFR 493.575 - Removal of deeming authority or CLIA exemption and final determination review.
Code of Federal Regulations, 2014 CFR
2014-10-01
... an accreditation organization's program if the comparability or validation review produces findings...) An exemption review of a State's licensure program if the comparability or validation review produces... review of an accreditation organization or State licensure program, at CMS's discretion, if validation...
42 CFR 493.575 - Removal of deeming authority or CLIA exemption and final determination review.
Code of Federal Regulations, 2012 CFR
2012-10-01
... an accreditation organization's program if the comparability or validation review produces findings...) An exemption review of a State's licensure program if the comparability or validation review produces... review of an accreditation organization or State licensure program, at CMS's discretion, if validation...
42 CFR 493.575 - Removal of deeming authority or CLIA exemption and final determination review.
Code of Federal Regulations, 2013 CFR
2013-10-01
... an accreditation organization's program if the comparability or validation review produces findings...) An exemption review of a State's licensure program if the comparability or validation review produces... review of an accreditation organization or State licensure program, at CMS's discretion, if validation...
Cost Minimization Using an Artificial Neural Network Sleep Apnea Prediction Tool for Sleep Studies
Teferra, Rahel A.; Grant, Brydon J. B.; Mindel, Jesse W.; Siddiqi, Tauseef A.; Iftikhar, Imran H.; Ajaz, Fatima; Aliling, Jose P.; Khan, Meena S.; Hoffmann, Stephen P.
2014-01-01
Rationale: More than a million polysomnograms (PSGs) are performed annually in the United States to diagnose obstructive sleep apnea (OSA). Third-party payers now advocate a home sleep test (HST), rather than an in-laboratory PSG, as the diagnostic study for OSA regardless of clinical probability, but the economic benefit of this approach is not known. Objectives: We determined the diagnostic performance of OSA prediction tools including the newly developed OSUNet, based on an artificial neural network, and performed a cost-minimization analysis when the prediction tools are used to identify patients who should undergo HST. Methods: The OSUNet was trained to predict the presence of OSA in a derivation group of patients who underwent an in-laboratory PSG (n = 383). Validation group 1 consisted of in-laboratory PSG patients (n = 149). The network was trained further in 33 patients who underwent HST and then was validated in a separate group of 100 HST patients (validation group 2). Likelihood ratios (LRs) were compared with two previously published prediction tools. The total costs from the use of the three prediction tools and the third-party approach within a clinical algorithm were compared. Measurements and Main Results: The OSUNet had a higher +LR in all groups compared with the STOP-BANG and the modified neck circumference (MNC) prediction tools. The +LRs for STOP-BANG, MNC, and OSUNet in validation group 1 were 1.1 (1.0–1.2), 1.3 (1.1–1.5), and 2.1 (1.4–3.1); and in validation group 2 they were 1.4 (1.1–1.7), 1.7 (1.3–2.2), and 3.4 (1.8–6.1), respectively. With an OSA prevalence less than 52%, the use of all three clinical prediction tools resulted in cost savings compared with the third-party approach. 
Conclusions: The routine requirement of an HST to diagnose OSA regardless of clinical probability is more costly compared with the use of OSA clinical prediction tools that identify patients who should undergo this procedure when OSA is expected to be present in less than half of the population. With OSA prevalence less than 40%, the OSUNet offers the greatest savings, which are substantial when the number of sleep studies done annually is considered. PMID:25068704
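The positive likelihood ratios (+LR) reported above follow the standard definition, sensitivity divided by (1 - specificity). A minimal sketch of that calculation is given below; the function name and the 2x2 counts are hypothetical illustrations and are not data from the study.

```python
def positive_likelihood_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """Return +LR = sensitivity / (1 - specificity) from 2x2 table counts.

    tp/fn: patients with the condition who test positive/negative.
    fp/tn: patients without the condition who test positive/negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if specificity == 1.0:
        # No false positives: the ratio is unbounded.
        return float("inf")
    return sensitivity / (1.0 - specificity)


# Hypothetical counts (not from the abstract): sensitivity 0.80, specificity 0.70.
lr_plus = positive_likelihood_ratio(tp=80, fp=30, fn=20, tn=70)
print(f"+LR = {lr_plus:.2f}")
```

A +LR close to 1 means a positive result barely changes the probability of disease, which is why the higher values reported for the OSUNet indicate a more informative positive prediction.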
Blankers, M; Barendregt, M; Dekker, J J M
2016-01-01
In mental health care centres in the Netherlands, outcome data are collected using a variety of outcome instruments. This may have implications for the comparability of outcome results between different centres. The aim of this study was to discuss recent findings regarding the extent to which the eight instruments currently used in clinical practice report comparable results. Our study is based on a combination of literature review and empirical research. The results obtained with the eight instruments are not equivalent. Patients' symptom reductions appear larger with some instruments than with others. The current practice of benchmarking in the Dutch mental health system would have greater validity if the number of different instruments were reduced. State-of-the-art calibration studies are necessary to validate the comparability of the remaining instruments. Ideally, all mental health centres will soon use one instrument per care domain to measure treatment outcome.
Rahafar, Arash; Randler, Christoph; Díaz-Morales, Juan F; Kasaeian, Ali; Heidari, Zeinab
2017-01-01
Morningness-Eveningness Stability Scale improved (MESSi) is a newly constructed measure to assess circadian types and amplitude. In this study, we applied this measure to participants from three different countries: Germany, Spain and Iran. Confirmatory factorial analysis (CFA) of MESSi displayed mediocre fit in the three countries. Comparing increasingly stringent models using multigroup confirmatory factor analyses indicated at least partial measurement invariance (metric invariance) by country for Morning Affect and Distinctness subscales. Age was positively related to Morning Affect (MA), and negatively related to Eveningness (EV) and Distinctness (DI). Men reported higher MA than women, whereas women reported higher DI than men. Regarding country effect, Iranian participants reported highest MA compared to Spaniards and Germans, whereas Germans reported higher DI compared to Iranians and Spaniards. As a conclusion, our study corroborated the validity and reliability of MESSi across three different countries with different geographical and cultural characteristics.
Validation of the Sensewear Armband during recreational in-line skating.
Soric, Maroje; Mikulic, Pavle; Misigoj-Durakovic, Marjeta; Ruzic, Lana; Markovic, Goran
2012-03-01
Multi-sensor body monitors that combine accelerometry with other physiological data are designed to overcome drawbacks of accelerometers in assessing activities with little or no vertical movement. One such device is the Sensewear Armband (SWA), which has been extensively validated during various activities. However, very few of the validation studies included activities other than walking and running. The aim of this investigation was to assess the validity of the SWA during recreational in-line skating. Nineteen participants (11 females and 8 males), 28 (±6) years of age, performed in-line skating exercise on a circular track at a self-selected pace. Energy expenditure was measured with the SWA and the Cosmed K4b² breath-by-breath portable metabolic unit. The mean (SD) energy expenditure during in-line skating estimated by the SWA [25.5 (5.8) kJ/min] was significantly lower compared with indirect calorimetry (IC) [44.2 (9.7) kJ/min, P < 0.001]. Similarly, the mean (SD) MET values recorded by the SWA were also lower compared with IC [5.3 (1.0) METs vs. 9.1 (1.6) METs, P < 0.001]. The ratio limits of agreement suggest that in 95% of cases the SWA will underestimate the energy expenditure and MET values during in-line skating by as much as 24-56% compared with indirect calorimetry. In conclusion, the results of the present study indicate that the SWA is not able to overcome the drawbacks of accelerometry in assessing activities with limited vertical movement.
Rettig, Trisha A; Ward, Claire; Pecaut, Michael J; Chapes, Stephen K
2017-07-01
Spaceflight is known to affect immune cell populations. In particular, splenic B cell numbers decrease during spaceflight and in ground-based physiological models. Although antibody isotype changes have been assessed during and after spaceflight, an extensive characterization of the impact of spaceflight on antibody composition has not been conducted in mice. Next Generation Sequencing and bioinformatic tools are now available to assess antibody repertoires. We can now identify immunoglobulin gene-segment usage, junctional regions, and modifications that contribute to specificity and diversity. Due to limitations on the International Space Station, alternate sample collection and storage methods must be employed. Our group compared Illumina MiSeq sequencing data from multiple sample preparation methods in normal C57Bl/6J mice to validate that sample preparation and storage would not bias the outcome of antibody repertoire characterization. In this report, we also compared the effects of sequencing techniques and a bioinformatic workflow on the data output when we assessed the IgH and Igκ variable gene usage. This included assessments of our bioinformatic workflow on Illumina HiSeq and MiSeq datasets; the workflow is specifically designed to reduce bias, capture the most information from Ig sequences, and produce a data set that provides other data mining options. We validated our workflow by comparing our normal mouse MiSeq data to existing murine antibody repertoire studies, validating it for future antibody repertoire studies.
Omotosho, Tola B; Hardart, Anne; Rogers, Rebecca G; Schaffer, Joseph I; Kobak, William H; Romero, Audrey A
2009-06-01
The purpose of this study is to validate Spanish versions of the Pelvic Floor Distress Inventory (PFDI) and Pelvic Floor Impact Questionnaire (PFIQ). Spanish versions were developed using back translation, and validation was performed by randomizing bilingual women to complete the Spanish or English versions of the questionnaires first. Weighted kappa statistics assessed agreement for individual questions; interclass correlation coefficients (ICC) compared primary and subscale scores. Cronbach's alpha assessed internal consistency of Spanish versions. To detect a 2.7 point difference in scores with 80% power and alpha of 0.05, 44 bilingual subjects were required. Individual questions showed good to excellent agreement (kappa > 0.6) for all but eight questions on the PFIQ. ICCs of primary and subscale scores for both questionnaires showed excellent agreement (all ICC > 0.79). All Cronbach's alpha values were excellent (>0.84) for the primary scales of both questionnaires. Valid and reliable Spanish versions of the PFIQ and PFDI have been developed.
Ashur, S T; Shamsuddin, K; Shah, S A; Bosseri, S; Morisky, D E
2015-12-13
No validation study has previously been made for the Arabic version of the 8-item Morisky Medication Adherence Scale (MMAS-8©) as a measure for medication adherence in diabetes. This study in 2013 tested the reliability and validity of the Arabic MMAS-8 for type 2 diabetes mellitus patients attending a referral centre in Tripoli, Libya. A convenience sample of 103 patients self-completed the questionnaire. Reliability was tested using Cronbach alpha, average inter-item correlation and Spearman-Brown coefficient. Known-group validity was tested by comparing MMAS-8 scores of patients grouped by glycaemic control. The Arabic version showed adequate internal consistency (α = 0.70) and moderate split-half reliability (r = 0.65). Known-group validity was supported as a significant association was found between medication adherence and glycaemic control, with a moderate effect size (ϕc = 0.34). The Arabic version displayed good psychometric properties and could support diabetes research and practice in Arab countries.
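Two of the reliability statistics named above, Cronbach's alpha and the Spearman-Brown coefficient, have simple closed forms. The sketch below uses their standard textbook definitions; the function names and the small response matrix are made up for illustration and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


# Hypothetical responses: 4 respondents answering 3 items (not study data).
responses = [
    [1, 2, 1],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 4],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
print(f"Spearman-Brown from r = 0.50: {spearman_brown(0.50):.3f}")
```

Alpha rises toward 1 as items covary more strongly relative to their individual variances, so it is read as a measure of internal consistency rather than of validity.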
Morasco, Benjamin J; Gfeller, Jeffrey D; Elder, Katherine A
2007-06-01
In this psychometric study, we compared the recently developed Validity Scales from the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992b) with the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) Validity Scales. We collected data from clients (n = 74) who completed comprehensive psychological evaluations at a university-based outpatient mental health clinic. Correlations between the Validity Scales of the NEO-PI-R and MMPI-2 were significant and in the expected directions. The relationships provide support for convergent and discriminant validity of the NEO-PI-R Validity Scales. The percent agreement of invalid responding on the two measures was high, although the diagnostic agreement was modest (kappa = .22-.33). Finally, clients who responded in an invalid manner on the NEO-PI-R Validity Scales produced significantly different clinical profiles on the NEO-PI-R and MMPI-2 than clients with valid protocols. These results provide additional support for the clinical utility of the NEO-PI-R Validity Scales as indicators of response bias.
Center of pressure based segment inertial parameters validation
Rezzoug, Nasser; Gorce, Philippe; Isableu, Brice; Venture, Gentiane
2017-01-01
By proposing efficient methods for Body Segment Inertial Parameters (BSIP) estimation and validating them with a force plate, it is possible to improve the inverse dynamic computations that are necessary in multiple research areas. To date, a variety of studies have been conducted to improve BSIP estimation, but to our knowledge a real validation has never been completely successful. In this paper, we propose a validation method using both kinematic and kinetic parameters (contact forces) gathered from an optical motion capture system and a force plate, respectively. To compare BSIPs, we used the measured contact forces (force plate) as the ground truth, and reconstructed the displacements of the Center of Pressure (COP) using inverse dynamics from two different estimation techniques. Only minor differences were seen when comparing the estimated segment masses. Their influence on the COP computation, however, is large, and the results show very distinguishable patterns of the COP movements. Improving BSIP techniques is crucial, and deviation from the estimations can actually result in large errors. This method could be used as a tool to validate BSIP estimation techniques. An advantage of this approach is that it facilitates the comparison between BSIP estimation methods and, more specifically, it shows the accuracy of those parameters. PMID:28662090
Semi-automating the manual literature search for systematic reviews increases efficiency.
Chapman, Andrea L; Morgan, Laura C; Gartlehner, Gerald
2010-03-01
To minimise retrieval bias, manual literature searches are a key part of the search process of any systematic review. Considering the need to have accurate information, valid results of the manual literature search are essential to ensure scientific standards; likewise, efficient approaches that minimise the amount of personnel time required to conduct a manual literature search are of great interest. The objective of this project was to determine the validity and efficiency of a new manual search method that utilises the Scopus database. We used the traditional manual search approach as the gold standard to determine the validity and efficiency of the proposed Scopus method. Outcome measures included completeness of article detection and personnel time involved. Using both methods independently, we compared the results in terms of accuracy (validity) and time spent conducting the search (efficiency). Regarding accuracy, the Scopus method identified the same studies as the traditional approach, indicating its validity. In terms of efficiency, using Scopus led to a time saving of 62.5% compared with the traditional approach (3 h versus 8 h). The Scopus method can significantly improve the efficiency of manual searches and thus of systematic reviews.
Podsakoff, Nathan P; Podsakoff, Philip M; Mackenzie, Scott B; Klinger, Ryan L
2013-01-01
Several researchers have persuasively argued that the most important evidence to consider when assessing construct validity is whether variations in the construct of interest cause corresponding variations in the measures of the focal construct. Unfortunately, the literature provides little practical guidance on how researchers can go about testing this. Therefore, the purpose of this article is to describe how researchers can use video techniques to test whether their scales measure what they purport to measure. First, we discuss how researchers can develop valid manipulations of the focal construct that they hope to measure. Next, we explain how to design a study to use this manipulation to test the validity of the scale. Finally, comparing and contrasting traditional and contemporary perspectives on validation, we discuss the advantages and limitations of video-based validation procedures. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Jalink, M B; Goris, J; Heineman, E; Pierie, J P E N; ten Cate Hoedemaker, H O
2014-02-01
Virtual reality (VR) laparoscopic simulators have been around for more than 10 years and have proven to be cost- and time-effective in laparoscopic skills training. However, most simulators are, in our experience, considered less interesting by residents and are often poorly accessible. Consequently, these devices are rarely used in actual training. In an effort to make a low-cost and more attractive simulator, a custom-made Nintendo Wii game was developed. This game could ultimately be used to train the same basic skills as VR laparoscopic simulators ought to. Before such a video game can be implemented into a surgical training program, it has to be validated according to international standards. The main goal of this study was to test construct and concurrent validity of the controls of a prototype of the game. In this study, the basic laparoscopic skills of experts (surgeons, urologists, and gynecologists, n = 15) were compared to those of complete novices (internists, n = 15) using the Wii Laparoscopy (construct validity). Scores were also compared to the Fundamentals of Laparoscopy (FLS) Peg Transfer test, an already established assessment method for measuring basic laparoscopic skills (concurrent validity). Results showed that experts were 111% faster (P = 0.001) on the Wii Laparoscopy task than novices. Also, scores of the FLS Peg Transfer test and the Wii Laparoscopy showed a significant, high correlation (r = 0.812, P < 0.001). The prototype setup of the Wii Laparoscopy possesses solid construct and concurrent validity.
Internal consistency and validity of a new physical workload questionnaire
Bot, S; Terwee, C; van der Windt, D A W M; Feleus, A; Bierma-Zeinstra, S; Knol, D; Bouter, L; Dekker, J
2004-01-01
Aims: To examine the dimensionality, internal consistency, and construct validity of a new physical workload questionnaire in employees with musculoskeletal complaints. Methods: Factor analysis was applied to the responses in three study populations with musculoskeletal disorders (n = 406, 300, and 557) on 26 items related to physical workload. The internal consistency of the resulting subscales was examined. It was hypothesised that physical workload would vary among different occupational groups. The occupations of all subjects were classified into four groups on the basis of expected workload (heavy physical load; long lasting postures and repetitive movements; both; no physical load). Construct validity of the subscales created was tested by comparing the subscale scores among these occupational groups. Results: The pattern of the factor loadings of items was almost identical for the three study populations. Two interpretable factors were found: items related to heavy physical workload loaded highly on the first factor, and items related to static postures or repetitive work loaded highly on the second factor. The first constructed subscale "heavy physical work" had a Cronbach's α of 0.92 to 0.93 and the second subscale "long lasting postures and repetitive movements", of 0.86 to 0.87. Six of eight hypotheses regarding the construct validity of the subscales were confirmed. Conclusions: The results support the internal structure, internal consistency, and validity of the new physical workload questionnaire. Testing this questionnaire in non-symptomatic employees and comparing its performance with objective assessments of physical workload are important next steps in the validation process. PMID:15550603
AlMenhali, Entesar Ali; Khalid, Khalizani; Iyanna, Shilpa
2018-01-01
The Environmental Attitudes Inventory (EAI) was developed to evaluate the multidimensional nature of environmental attitudes; however, it is based on a dataset from outside the Arab context. This study reinvestigated the construct validity of the EAI with a new dataset and confirmed the feasibility of applying it in the Arab context. One hundred and forty-eight subjects in Study 1 and 130 in Study 2 provided valid responses. An exploratory factor analysis (EFA) was used to extract a new factor structure in Study 1, and confirmatory factor analysis (CFA) was performed in Study 2. Both studies generated a seven-factor model, and the model fit was discussed for both the studies. Study 2 exhibited satisfactory model fit indices compared to Study 1. Factor loading values of a few items in Study 1 affected the reliability values and average variance extracted values, which demonstrated low discriminant validity. Based on the results of the EFA and CFA, this study showed sufficient model fit and suggested the feasibility of applying the EAI in the Arab context with a good construct validity and internal consistency. PMID:29758021
Le Roux, E; Mellerio, H; Guilmin-Crépon, S; Gottot, S; Jacquin, P; Boulkedid, R; Alberti, C
2017-01-01
Objective To explore the methodologies employed in studies assessing transition of care interventions, with the aim of defining goals for the improvement of future studies. Design Systematic review of comparative studies assessing transition to adult care interventions for young people with chronic conditions. Data sources MEDLINE, EMBASE, ClinicalTrial.gov. Eligibility criteria for selecting studies 2 reviewers screened comparative studies with experimental and quasi-experimental designs, published or registered before July 2015. Eligible studies evaluate transition interventions at least in part after transfer to adult care of young people with chronic conditions with at least one outcome assessed quantitatively. Results 39 studies were reviewed, 26/39 (67%) published their final results and 13/39 (33%) were in progress. In 9 studies (9/39, 23%) comparisons were made between preintervention and postintervention in a single group. Randomised control groups were used in 9/39 (23%) studies. 2 (2/39, 5%) reported blinding strategies. Use of validated questionnaires was reported in 28% (11/39) of studies. In terms of reporting in published studies 15/26 (58%) did not report age at transfer, and 6/26 (23%) did not report the time of collection of each outcome. Conclusions Few evaluative studies exist and their level of methodological quality is variable. The complexity of interventions, multiplicity of outcomes, difficulty of blinding and the small groups of patients have consequences on concluding on the effectiveness of interventions. The evaluation of the transition interventions requires an appropriate and common methodology which will provide access to a better level of evidence. We identified areas for improvement in terms of randomisation, recruitment and external validity, blinding, measurement validity, standardised assessment and reporting. Improvements will increase our capacity to determine effective interventions for transition care. PMID:28131998
Reliability and validity of the Safe Routes to school parent and student surveys.
McDonald, Noreen C; Dwelley, Amanda E; Combs, Tabitha S; Evenson, Kelly R; Winters, Richard H
2011-06-08
The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous day's travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n=112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62-0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31-0.76). The in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school. 
Parent survey design should be improved so that responses clearly indicate issues that influence parental decision making in regards to their children's mode of travel to school. © 2011 McDonald et al; licensee BioMed Central Ltd.
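The kappa statistics reported above correct raw agreement for the agreement expected by chance. A minimal sketch of unweighted Cohen's kappa follows; the function name `cohen_kappa` and the example travel-mode labels are illustrative assumptions, not data or code from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both sources.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal label proportions, summed over labels.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: 10 parent reports versus 10 student reports.
parent = ["walk"] * 5 + ["car"] * 5
student = ["walk"] * 4 + ["car"] * 6
kappa = cohen_kappa(parent, student)
```

With these made-up labels the raw agreement is 0.9 and the chance agreement is 0.5, so kappa is lower than the raw agreement would suggest.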
Kim, Hyunji; Kim, Eunbee; Suh, Eunkook M; Callan, Mitchell J
2018-01-01
The current research developed and validated a Korean-translated version of the Personal Relative Deprivation Scale (PRDS). The PRDS measures individual differences in people’s tendencies to feel resentful about what they have compared to what other people like them have. Across 2 studies, Exploratory Factor Analyses revealed that the two reverse-worded items from the original PRDS did not load onto the primary factor for the Korean-translated PRDS. A reduced 3-item Korean PRDS, however, showed good convergent validity. Replicating previous findings using Western samples, greater tendencies to make social comparisons of abilities (but not opinions) were associated with higher PRDS (Studies 1 and 2), and participants scoring higher on the 3-item Korean PRDS were more materialistic (Studies 1 and 2), reported worse physical health (Study 1), had lower self-esteem (Study 2) and experienced higher stress (Study 2). PMID:29746534
Chae, Soo Young; Suh, Sangil; Ryoo, Inseon; Park, Arim; Noh, Kyoung Jin; Shim, Hackjoon; Seol, Hae Young
2017-05-01
We developed a semi-automated volumetric software, NPerfusion, to segment brain tumors and quantify perfusion parameters on whole-brain CT perfusion (WBCTP) images. The purpose of this study was to assess the feasibility of the software and to validate its performance compared with manual segmentation. Twenty-nine patients with pathologically proven brain tumors who underwent preoperative WBCTP between August 2012 and February 2015 were included. Three perfusion parameters, arterial flow (AF), equivalent blood volume (EBV), and Patlak flow (PF, which is a measure of permeability of capillaries), of brain tumors were generated by a commercial software and then quantified volumetrically by NPerfusion, which also semi-automatically segmented tumor boundaries. The quantification was validated by comparison with that of manual segmentation in terms of the concordance correlation coefficient and Bland-Altman analysis. With NPerfusion, we successfully performed segmentation and quantified whole volumetric perfusion parameters of all 29 brain tumors that showed consistent perfusion trends with previous studies. The validation of the perfusion parameter quantification exhibited almost perfect agreement with manual segmentation, with Lin concordance correlation coefficients (ρc) for AF, EBV, and PF of 0.9988, 0.9994, and 0.9976, respectively. On Bland-Altman analysis, most differences between this software and manual segmentation on the commercial software were within the limit of agreement. NPerfusion successfully performs segmentation of brain tumors and calculates perfusion parameters of brain tumors. We validated this semi-automated segmentation software by comparing it with manual segmentation. NPerfusion can be used to calculate volumetric perfusion parameters of brain tumors from WBCTP.
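The two agreement measures named above can be sketched in a few lines. This is a minimal illustration, assuming paired measurements from two methods; the function names `lin_ccc` and `bland_altman_limits` and the sample values are hypothetical and do not come from NPerfusion or the study data.

```python
from statistics import mean, stdev


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    mx, my = mean(x), mean(y)
    # Population (divide-by-n) variances and covariance, as in Lin's definition.
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # The squared mean difference penalizes systematic offset between methods.
    return 2.0 * cov / (var_x + var_y + (mx - my) ** 2)


def bland_altman_limits(x, y):
    """Return (lower limit, bias, upper limit) using bias +/- 1.96 sample SD."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


# Hypothetical paired values: the second method reads exactly 1 unit higher.
manual = [1.0, 2.0, 3.0, 4.0, 5.0]
automated = [2.0, 3.0, 4.0, 5.0, 6.0]
ccc = lin_ccc(manual, automated)
limits = bland_altman_limits(manual, automated)
```

In this made-up case the Pearson correlation would be 1, yet the concordance coefficient is lower because it also penalizes the constant offset, which is why it is preferred for method-comparison work.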
Reliability and validity of a brief method to assess nociceptive flexion reflex (NFR) threshold.
Rhudy, Jamie L; France, Christopher R
2011-07-01
The nociceptive flexion reflex (NFR) is a physiological tool to study spinal nociception. However, NFR assessment can take several minutes and expose participants to repeated suprathreshold stimulations. The 4 studies reported here assessed the reliability and validity of a brief method to assess NFR threshold that uses a single ascending series of stimulations (Peak 1 NFR), by comparing it to a well-validated method that uses 3 ascending/descending staircases of stimulations (Staircase NFR). Correlations between the NFR definitions were high, were on par with test-retest correlations of Staircase NFR, and were not affected by participant sex or chronic pain status. Results also indicated the test-retest reliabilities for the 2 definitions were similar. Using larger stimulus increments (4 mAs) to assess Peak 1 NFR tended to result in higher NFR threshold estimates than using the Staircase NFR definition, whereas smaller stimulus increments (2 mAs) tended to result in lower NFR threshold estimates than the Staircase NFR definition. Neither NFR definition was correlated with anxiety, pain catastrophizing, or anxiety sensitivity. In sum, a single ascending series of electrical stimulations results in a reliable and valid estimate of NFR threshold. However, caution may be warranted when comparing NFR thresholds across studies that differ in the ascending stimulus increments. This brief method to assess NFR threshold is reliable and valid; therefore, it should be useful to clinical pain researchers interested in quickly assessing inter- and intra-individual differences in spinal nociceptive processes. Copyright © 2011 American Pain Society. Published by Elsevier Inc. All rights reserved.
Scholl, Isabelle; Kriston, Levente; Dirmaier, Jörg; Härter, Martin
2015-02-01
While there has been a clear move towards shared decision-making (SDM) in the last few years, the measurement of SDM-related constructs remains challenging. There has been a call for further psychometric testing of known scales, especially regarding validity aspects. To test convergent validity of the nine-item Shared Decision-Making Questionnaire (SDM-Q-9) by comparing it to the OPTION Scale. Cross-sectional study. Data were collected in outpatient care practices. Patients suffering from chronic diseases and facing a medical decision were included in the study. Consultations were evaluated using the OPTION Scale. Patients completed the SDM-Q-9 after the consultation. First, the internal consistency of both scales and the inter-rater reliability of the OPTION Scale were calculated. To analyse the convergent validity of the SDM-Q-9, correlation between the patient (SDM-Q-9) and expert ratings (OPTION Scale) was calculated. A total of 21 physicians provided analysable data of consultations with 63 patients. Analyses revealed good internal consistency of the SDM-Q-9 and limited internal consistency of the OPTION Scale. Inter-rater reliability of the latter was less than optimal. Association between the total scores of both instruments was weak with a Spearman correlation of r = 0.19 and did not reach statistical significance. By the use of the OPTION Scale convergent validity of the SDM-Q-9 could not be established. Several possible explanations for this result are discussed. This study shows that the measurement of SDM remains challenging. © 2012 John Wiley & Sons Ltd.
Validity of the occupational sitting and physical activity questionnaire.
Chau, Josephine Y; Van Der Ploeg, Hidde P; Dunn, Scott; Kurko, John; Bauman, Adrian E
2012-01-01
Sitting at work is an emerging occupational health risk. Few instruments designed for use in population-based research measure occupational sitting and standing as distinct behaviors. This study aimed to develop and validate a brief measure of occupational sitting and physical activity. A convenience sample (n = 99, 61% female) was recruited from two medium-sized workplaces and by word-of-mouth in Sydney, Australia. Participants completed the newly developed Occupational Sitting and Physical Activity Questionnaire (OSPAQ) and a modified version of the MONICA Optional Study on Physical Activity Questionnaire (modified MOSPA-Q) twice, 1 wk apart. Participants also wore an ActiGraph accelerometer for the 7 d in between the test and retest. Analyses determined test-retest reliability with intraclass correlation coefficients and assessed criterion validity against accelerometers using the Spearman ρ. The test-retest intraclass correlation coefficients for occupational sitting, standing, and walking for OSPAQ ranged from 0.73 to 0.90, while that for the modified MOSPA-Q ranged from 0.54 to 0.89. Comparison of sitting measures with accelerometers showed higher Spearman correlations for the OSPAQ (r = 0.65) than for the modified MOSPA-Q (r = 0.52). Criterion validity correlations for occupational standing and walking measures were comparable for both instruments with accelerometers (standing: r = 0.49; walking: r = 0.27-0.29). The OSPAQ has excellent test-retest reliability and moderate validity for estimating time spent sitting and standing at work and is comparable to existing occupational physical activity measures for assessing time spent walking at work. The OSPAQ brief instrument measures sitting and standing at work as distinct behaviors and would be especially suitable in national health surveys, prospective cohort studies, and other studies that are limited by space constraints for questionnaire items.
Agent-Based vs. Equation-based Epidemiological Models: A Model Selection Case Study
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sukumar, Sreenivas R; Nutaro, James J
This paper is motivated by the need to design model validation strategies for epidemiological disease-spread models. We consider both agent-based and equation-based models of pandemic disease spread and study the nuances and complexities one has to consider from the perspective of model validation. For this purpose, we instantiate an equation-based model and an agent-based model of the 1918 Spanish flu and we leverage data published in the literature for our case study. We present our observations from the perspective of each implementation and discuss the application of model-selection criteria to compare the risk in choosing one modeling paradigm over another. We conclude with a discussion of our experience and document future ideas for a model validation framework.
Ramkumar, Vidya; Vanaja, C S; Hall, James W; Selvakumar, K; Nagarajan, Roopa
2018-05-01
This study assessed the validity of DPOAE screening conducted by village health workers (VHWs) in a rural community. Real-time click evoked tele-auditory brainstem response (tele-ABR) was used as the gold standard to establish validity. A cross-sectional design was utilised to compare the results of screening by VHWs to those obtained via tele-ABR. Study samples: One hundred and nineteen subjects (0 to 5 years) were selected randomly from a sample of 2880 infants and young children who received DPOAE screening by VHWs. Real time tele-ABR was conducted by using satellite or broadband internet connectivity at the village. An audiologist located at the tertiary care hospital conducted tele-ABR testing through a remote computing paradigm. Tele-ABR was recorded using standard recording parameters recommended for infants and young children. Wave morphology, repeatability and peak latency data were used for ABR analysis. Tele-ABR and DPOAE findings were compared for 197 ears. The sensitivity of DPOAE screening conducted by the VHW was 75%, and specificity was 91%. The negative and positive predictive values were 98.8% and 27.2%, respectively. The validity of DPOAE screening conducted by trained VHW was acceptable. This study supports the engagement of grass-root workers in community-based hearing health care provision.
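The four validity figures quoted above all derive from a 2x2 table of screening result against the gold standard. A minimal sketch follows; the function name `screening_metrics` and the counts are hypothetical, chosen only for illustration, and are not the study's data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 confusion-matrix counts."""
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("each row and column of the 2x2 table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # impaired ears correctly referred
        "specificity": tn / (tn + fp),  # normal ears correctly passed
        "ppv": tp / (tp + fp),          # referred ears that are truly impaired
        "npv": tn / (tn + fn),          # passed ears that are truly normal
    }


# Hypothetical counts for 200 ears with a low prevalence of impairment.
metrics = screening_metrics(tp=9, fp=18, fn=3, tn=170)
```

Because the condition is rare in these made-up counts, the positive predictive value stays low even though sensitivity and specificity are reasonable, the same pattern the abstract reports.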
Sarangmath, Nagaraja; Rattihalli, Rohini; Ragothaman, Mona; Gopalkrishna, Gururaj; Doddaballapur, Subbakrishna; Louis, Elan D; Muthane, Uday B
2005-12-01
The prevalence of Parkinson's disease (PD) is low among Indians, except in the Parsis. Data for Indians come from studies using different screening tools and criteria to detect PD. An epidemiological study in India, which has nearly a billion people, more than 18 spoken languages, and varying levels of literacy, requires development and validation of a screening tool for PD. The objectives of this study are to (1) validate a modified version of a widely used screening questionnaire for PD to suit the needs of the Indian population; (2) compare the use of a nonmedical assistant (NMA) with the use of a medical person during screening; and (3) compare the effect of literacy of participants on the validity of the screening tool. The validity of the questionnaire was tested on 125 participants from a home for the elderly. NMAs of similar background and medical personnel administered the modified screening questionnaire. A movement disorder neurologist blind to the responses on the questionnaire, examined participants independently and diagnosed if participants had PD. The questionnaire was validated in the movement disorders clinic, on known PD patients and their family members without PD. In the movement disorders clinic, sensitivity and specificity of the questionnaire were 100% and 89%, respectively. Fifty-seven participants were included for analysis. The questionnaire had a higher sensitivity when NMAs (75%) rather than the medical personnel (61%) administered it, and its specificity was higher with the medical personnel (61%) than with NMAs (55% and 25%). The questionnaire had a higher specificity in literates than illiterates, whereas sensitivity varied considerably. The modified questionnaire translated in a local Indian language had reasonable sensitivity and can be used to screen individuals for PD in epidemiological studies in India. This questionnaire can be administered by NMAs to screen PD and this strategy would reduce manpower costs. 
Literacy may influence epidemiological estimates when screening PD.
Tabard-Fougère, Anne; Bonnefoy-Mazure, Alice; Hanquinet, Sylviane; Lascombes, Pierre; Armand, Stéphane; Dayer, Romain
2017-01-15
Test-retest study. This study aimed to evaluate the validity and reliability of rasterstereography in patients with adolescent idiopathic scoliosis (AIS) with a major curve Cobb angle (CA) between 10° and 40° for frontal, sagittal, and transverse parameters. Previous studies evaluating the validity and reliability of rasterstereography concluded that this technique had good accuracy compared with radiographs and a high intra- and interday reliability in healthy volunteers. To the best of our knowledge, the validity and reliability have not been assessed in AIS patients. Thirty-five adolescents with AIS (male = 13) aged 13.1 ± 2.0 years were included. To evaluate the validity of the scoliosis angle (SA) provided by rasterstereography, a comparison (t test, Pearson correlation) was performed with the CA obtained using 2D EOS® radiography (XR). Three rasterstereographic repeated measurements were independently performed by two operators on the same day (interrater reliability) and again by the first operator 1 week later (intrarater reliability). The variables of interest were the SA, lumbar lordosis, and thoracic kyphosis angle, trunk length, pelvic obliquity, and maximum, root mean square and amplitude of vertebral rotations. The data analyses used intraclass correlation coefficients (ICCs). The CA and SA were strongly correlated (R = 0.70) and were nonsignificantly different (P = 0.60). The intrarater reliability (same day: ICC [1, 1], n = 35; 1 week later: ICC [1, 3], n = 28) and interrater reliability (ICC [3, 3], n = 16) were globally excellent (ICC > 0.75) except for the assessment of pelvic obliquity. This study showed that the rasterstereographic system allows for the evaluation of AIS patients with a good validity compared with XR with an overall excellent intra- and interrater reliability. 
Based on these results, this automatic, fast, and noninvasive system can be used for monitoring the evolution of AIS in growing patients instead of repetitive radiographs, thereby reducing radiation exposure and decreasing costs. Level of Evidence: 4.
Social Networks and Mourning: A Comparative Approach.
ERIC Educational Resources Information Center
Rubin, Nissan
1990-01-01
Suggests using social network theory to explain varieties of mourning behavior in different societies. Compares participation in funeral ceremonies of members of different social circles in American society and Israeli kibbutz. Concludes that results demonstrated validity of concepts deriving from social network analysis in study of bereavement,…
Validation study of an electronic method of condensed outcomes tools reporting in orthopaedics.
Farr, Jack; Verma, Nikhil; Cole, Brian J
2013-12-01
Patient-reported outcomes (PRO) instruments are a vital source of data for evaluating the efficacy of medical treatments. Historically, outcomes instruments have been designed, validated, and implemented as paper-based questionnaires. The collection of paper-based outcomes information may result in patients becoming fatigued as they respond to redundant questions. This problem is exacerbated when multiple PRO measures are provided to a single patient. In addition, the management and analysis of data collected in paper format involves labor-intensive processes to score and render the data analyzable. Computer-based outcomes systems have the potential to mitigate these problems by reformatting multiple outcomes tools into a single, user-friendly tool. The study aimed to determine whether the electronic outcomes system presented produces results comparable with the test-retest correlations reported for the corresponding orthopedic paper-based outcomes instruments. The study is designed as a crossover study based on consecutive orthopaedic patients arriving at one of two designated orthopedic knee clinics. Patients were assigned to complete either a paper or a computer-administered questionnaire based on a similar set of questions (Knee injury and Osteoarthritis Outcome Score, International Knee Documentation Committee form, 36-Item Short Form survey, version 1, Lysholm Knee Scoring Scale). Each patient completed the same surveys using the other instrument, so that all patients had completed both paper and electronic versions. Correlations between the results from the two modes were studied and compared with test-retest data from the original validation studies. The original validation studies established test-retest reliability by computing correlation coefficients for two administrations of the paper instrument. Those correlation coefficients were all in the range of 0.7 to 0.9, which was deemed satisfactory. 
The present study computed correlation coefficients between the paper and electronic modes of administration. These correlation coefficients demonstrated similar results with an overall value of 0.86. On the basis of the correlation coefficients, the electronic application of commonly used knee outcome scores compare variably to the traditional paper variants with a high rate of test-retest correlation. This equivalence supports the use of the condensed electronic outcomes system and validates comparison of scores between electronic and paper modes. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Validity of the modified back-saver sit-and-reach test: a comparison with other protocols.
Hui, S S; Yuen, P Y
2000-09-01
Studies have shown that the classical sit-and-reach (CSR) test, the modified sit-and-reach (MSR), and the newly developed back-saver sit-and-reach (BS) test have poor criterion-related validity in estimating low-back flexibility but yielded moderate criterion-related validity in hamstring flexibility. The V sit-and-reach (VSR) test was found to be practical but the validity has not been established. The purpose of this study was to propose a modified back-saver sit-and-reach (MBS) test, which incorporated all advantages of the various protocols, and to compare the criterion-related validity and reliability of all these tests. 158 college students (F = 96, and M = 62; age = 20.77 +/- 2.51) performed CSR, VSR, BS (left and right leg), and MBS (left and right leg) tests in a randomized order. Scores from each test were then correlated with the criterion measures. For all sit-and-reach tests, intraclass reliability (single trial) was very high (r = 0.89-0.98). MBS yielded significant and the highest correlations with the low-back and hamstring criteria for men (r = 0.47-0.67) and women (r = 0.23-0.54). The low-back and right hamstring validity of MBS for men were significantly (P < 0.01) higher than those from BS and CSR, whereas no differences in criterion-related validity were found between the MBS and other protocols in women. The ratings of perceived comfort among the sit-and-reach protocols were significantly different (P < 0.001) from each other. The MBS was rated the most comfortable test compared with the other protocols. The MBS test is not only a reliable test of hamstring and low-back flexibility; it is also more practical than previous protocols, with improved validity for hamstring and low-back flexibility in men.
ERIC Educational Resources Information Center
Arango, Lisa Lewis; Kurtines, William M.; Montgomery, Marilyn J.; Ritchie, Rachel
2008-01-01
The study reported in this article, a Multi-Stage Longitudinal Comparative Design Stage II evaluation conducted as a planned preliminary efficacy evaluation (psychometric evaluation of measures, short-term controlled outcome studies, etc.) of the Changing Lives Program (CLP), provided evidence for the reliability and validity of the qualitative…
ERIC Educational Resources Information Center
Woodward-Kron, Robyn; Elder, Catherine
2016-01-01
The aim of this paper is to investigate from a discourse analytic perspective task authenticity in the speaking component of the Occupational English Test (OET), an English language screening test for clinicians designed to reflect the language demands of health professional-patient communication. The study compares the OET speaking sub-test…
ERIC Educational Resources Information Center
Oh, Hunseok; Choi, Yeseul; Choi, Myungweon
2013-01-01
The purpose of this study was to assess, evaluate, and compare the competitive advantages of the human resource development systems of advanced countries. The Global Human Resource Development Index was utilized for this study, since it has been validated through an expert panel's content review and analytic hierarchy process. Using a sample of 34…
A Meta-Analytic Review of the Cover-Copy-Compare and Variations of This Self-Management Procedure
ERIC Educational Resources Information Center
Joseph, Laurice M.; Konrad, Moira; Cates, Gary; Vajcner, Terra; Eveleigh, Elisha; Fishley, Katelyn M.
2012-01-01
Studies that examined cover-copy-compare (CCC) and variations of this procedure were reviewed and analyzed. This review revealed a substantial number of studies that validated the use of CCC across spelling and math skills and across students with and without disabilities. A meta-analysis of findings indicated that CCC and variations of this…
ERIC Educational Resources Information Center
Efendov, Adele A.; Sellbom, Martin; Bagby, R. Michael
2008-01-01
The authors examined the comparative predictive capacity of the Trauma Symptom Inventory (TSI) Atypical Response Scale (ATR) and the standard set of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) fake-bad validity scales (i.e., F, F[subscript B[prime
Comparing Errors in Medicaid Reporting across Surveys: Evidence to Date
Call, Kathleen T; Davern, Michael E; Klerman, Jacob A; Lynch, Victoria
2013-01-01
Objective To synthesize evidence on the accuracy of Medicaid reporting across state and federal surveys. Data Sources All available validation studies. Study Design Compare results from existing research to understand variation in reporting across surveys. Data Collection Methods Synthesize all available studies validating survey reports of Medicaid coverage. Principal Findings Across all surveys, reporting some type of insurance coverage is better than reporting Medicaid specifically. Therefore, estimates of uninsurance are less biased than estimates of specific sources of coverage. The CPS stands out as being particularly inaccurate. Conclusions Measuring health insurance coverage is prone to some level of error, yet survey overstatements of uninsurance are modest in most surveys. Accounting for all forms of bias is complex. Researchers should consider adjusting estimates of Medicaid and uninsurance in surveys prone to high levels of misreporting. PMID:22816493
de Bruijn, Carla; van den Brink, Wim; de Graaf, Ron; Vollebergh, Wilma A M
2005-01-01
To compare the discriminant validity of the DSM-IV and the ICD-10 classification of alcohol use disorders (AUD) with an alternative classification, the craving withdrawal model (CWM). CWM requires craving and withdrawal for the diagnosis of alcohol dependence and raises the alcohol abuse threshold to two DSM-IV AUD criteria. Data were derived from The Netherlands Mental Health Survey and Incidence Study, a large representative sample of the general Dutch population. In the present study, only non-abstinent subjects were included (n=6041). Three diagnostic systems (DSM-IV, ICD-10, and CWM) were compared using the following discriminant variables: alcohol intake, psychiatric comorbidity, functional status, familial alcohol problems, and treatment sought. The year prevalence of CWM alcohol dependence was lower than the prevalence of ICD-10 and DSM-IV dependence (0.3% vs 1.4% and 1.4%). The year prevalence of abuse was similar for CWM and DSM-IV (4.7 and 4.9%), but lower for ICD-10 harmful use (1.7%). DSM-IV resulted in a poor distinction between normality and abuse and ICD-10 resulted in a poor distinction between harmful use and dependence. In contrast, the CWM distinctions between normality and abuse, and between abuse, and dependence were significant for most of the discriminant variables. This study indicates that CWM improves the discriminant validity of AUD diagnoses. The predictive validity of the CWM for alcohol and other substance use disorders remain to be studied.
Parekh, Mohit; Salvalaio, Gianni; Ferrari, Stefano; Amoureux, Marie-Claude; Albrecht, Cecile; Fortier, Denis; Ponzin, Diego
2014-12-01
To standardize a new evaluation technique for calculating the overall quality (OQ) of the donor cornea and validate it using a comparative study of corneas preserved in Optisol-GS and Cornea Cold®. Thirty pairs of donor corneas were selected for a 4 week in vitro comparative study using masked observers. Physiological parameters like thickness, transparency, viable endothelial cell density (VECD) and morphology were transformed to a numerical range (0-4) to obtain the OQ. Microbiological examination was performed using a Bactec instrument. Student's t test showed statistically better results (p < 0.05) from week 3 for thickness, week 2 for transparency and week 1 for morphology and VECD; statistical significance (p < 0.05) was found for OQ from week 2 for the corneas preserved in Cornea Cold® compared to Optisol-GS. Epithelial quality was similar regardless of the medium. Microbiological examination showed absence of aerobic and anaerobic microorganisms in both media. The OQ method is efficient, consistent and easy, and is now validated for comparative studies. Further refinement is necessary for its use at eye-banks and bio-banks and for research or transplantation purposes. Cornea Cold® is a promising hypothermic corneal storage medium with preservation time ≤21 days. This permits higher flexibility, evaluation accuracy, longer duration for surgical preparation and ease of transportation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rose, Amy N.; Nagle, Nicholas N.
2016-08-01
Techniques such as Iterative Proportional Fitting have been previously suggested as a means to generate new data with the demographic granularity of individual surveys and the spatial granularity of small area tabulations of censuses and surveys. This article explores internal and external validation approaches for synthetic, small area, household- and individual-level microdata using a case study for Bangladesh. Using data from the Bangladesh Census 2011 and the Demographic and Health Survey, we produce estimates of infant mortality rate and other household attributes for small areas using a variation of an iterative proportional fitting method called P-MEDM. We conduct an internal validation to determine: whether the model accurately recreates the spatial variation of the input data, how each of the variables performed overall, and how the estimates compare to the published population totals. We conduct an external validation by comparing the estimates with indicators from the 2009 Multiple Indicator Cluster Survey (MICS) for Bangladesh to benchmark how well the estimates compared to a known dataset which was not used in the original model. The results indicate that the estimation process is viable for regions that are better represented in the microdata sample, but also revealed the possibility of strong overfitting in sparsely sampled sub-populations.
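For orientation, classic two-dimensional Iterative Proportional Fitting can be sketched as below. This is the textbook procedure only, not the P-MEDM variation used in the article; the function name `ipf`, the seed table, and the marginal totals are hypothetical, and the sketch assumes the row and column targets sum to the same total.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Rescale a 2-D seed table until its margins match the target totals."""
    table = [list(row) for row in seed]  # work on a copy of the seed
    for _ in range(iterations):
        # Step 1: scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [value * target / row_sum for value in table[i]]
        # Step 2: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match exactly; stop once the rows match as well.
        row_error = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if row_error < tol:
            break
    return table


# Hypothetical example: a uniform seed fitted to made-up marginal totals.
fitted = ipf([[1.0, 1.0], [1.0, 1.0]], row_targets=[30.0, 70.0], col_targets=[40.0, 60.0])
```

With a uniform seed the fitted cells are simply proportional to the product of the margins; an informative seed (for example, survey microdata counts) would instead preserve the seed's interaction structure while matching the small-area totals.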
Helmerhorst, Hendrik J F; Brage, Søren; Warren, Janet; Besson, Herve; Ekelund, Ulf
2012-08-31
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62-0.71 for existing, and 0.74-0.76 for new PAQs. Median validity coefficients ranged from 0.30-0.39 for existing, and from 0.25-0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument.
Chen, Hong-Lin; Cao, Ying-Juan; Zhang, Wei; Wang, Jing; Huai, Bao-Sha
2017-02-01
The inter-rater reliability of the Braden scale is suboptimal. We modified the scale by defining the nutrition subscale based on serum albumin, producing the Braden(ALB) scale, and then assessed its validity and reliability in hospital patients. We designed a retrospective study for validity analysis, and a prospective study for reliability analysis. The receiver operating characteristic (ROC) curve and area under the curve (AUC) were used to evaluate the predictive validity. The intra-class correlation coefficient (ICC) was used to investigate the inter-rater reliability. Two thousand five hundred twenty-five patients were included for validity analysis, of whom 76 (3.0%) developed a pressure ulcer. A positive correlation was found between serum albumin and the nutrition score in the Braden scale (Spearman's coefficient 0.2203, P<0.0001). The AUCs for the Braden scale and the Braden(ALB) scale predicting pressure ulcer risk were 0.813 (95% CI 0.797-0.828; P<0.0001), and 0.859 (95% CI 0.845-0.872; P<0.0001), respectively. The AUC of the Braden(ALB) scale was higher than that of the Braden scale, although the difference did not reach statistical significance (z=1.860, P=0.0628). In different age subgroups, the Braden(ALB) scale also seemed more valid than the original Braden scale, but no statistically significant differences were found (P>0.05). The inter-rater reliability study showed that the ICC increased by 45.9% for the nutrition subscale and by 4.3% for the total score. The Braden(ALB) scale has similar validity compared with the original Braden scale for in-hospital patients. However, the inter-rater reliability was significantly increased. Copyright © 2016 Elsevier Inc. All rights reserved.
2012-01-01
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62–0.71 for existing, and 0.74–0.76 for new PAQs. Median validity coefficients ranged from 0.30–0.39 for existing, and from 0.25–0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument. PMID:22938557
Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes
2012-08-13
Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.
Brett, Benjamin L; Solomon, Gary S
2017-04-01
Research findings to date on the stability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) Composite scores have been inconsistent, requiring further investigation. The use of test validity criteria across these studies also has been inconsistent. Using multiple measures of stability, we examined test-retest reliability of repeated ImPACT baseline assessments in high school athletes across various validity criteria reported in previous studies. A total of 1146 high school athletes completed baseline cognitive testing using the online ImPACT test battery at two time periods approximately two years apart. No participant sustained a concussion between assessments. Five forms of validity criteria used in previous test-retest studies were applied to the data, and differences in reliability were compared. Intraclass correlation coefficients (ICCs) ranged in composite scores from .47 (95% confidence interval, CI [.38, .54]) to .83 (95% CI [.81, .85]) and showed little change across a two-year interval for all five sets of validity criteria. Regression based methods (RBMs) examining the test-retest stability demonstrated a lack of significant change in composite scores across the two-year interval for all forms of validity criteria, with no cases falling outside the expected range of 90% confidence intervals. The application of more stringent validity criteria does not alter test-retest reliability, nor does it account for some of the variation observed across previously performed studies. As such, the ImPACT manual validity criteria should be used in the determination of test validity and in the individualized approach to concussion management. Potential future efforts to improve test-retest reliability are discussed.
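Several of the abstracts above report test-retest reliability as an intraclass correlation coefficient without stating which ICC form was used. As an assumption made purely for illustration, the sketch below computes the one-way random-effects form, ICC(1,1), from a subjects-by-sessions table; the function name icc_oneway and the scores are hypothetical.

```python
def icc_oneway(scores):
    """One-way random-effects ICC(1,1) for n subjects measured k times each.

    scores: list of n rows, each holding the k repeated measurements of one subject.
    The choice of ICC(1,1) is an assumption; the abstracts do not name the form.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    means = [sum(row) / k for row in scores]
    # Between-subject mean square.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    # Within-subject mean square.
    msw = sum((x - m) ** 2 for row, m in zip(scores, means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical baseline and retest composite scores for four athletes.
retest = [[30, 32], [25, 24], [40, 38], [35, 36]]
icc = icc_oneway(retest)
```

Values near 1 indicate that most of the variance lies between subjects rather than between sessions. Two-way forms such as ICC(2,1) or ICC(3,1) partition the variance differently and can give different values on the same data.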
Chu, Anne H. Y.; Ng, Sheryl H. X.; Koh, David; Müller-Riemenschneider, Falk
2015-01-01
Objective The Global Physical Activity Questionnaire (GPAQ) was originally designed to be interviewer-administered by the World Health Organization in assessing physical activity. The main aim of this study was to compare the psychometric properties of a self-administered GPAQ with the original interviewer-administered approach. Additionally, this study explored whether using different accelerometry-based physical activity bout definitions might affect the questionnaire’s validity. Methods A total of 110 participants were recruited and randomly allocated to an interviewer- (n = 56) or a self-administered (n = 54) group for test-retest reliability, of which 108 participants who met the wear time criteria were included in the validity study. Reliability was assessed by administration of questionnaires twice with a one-week interval. Criterion validity was assessed by comparing against seven-day accelerometer measures. Two definitions for accelerometry-data scoring were employed: (1) total-min of activity, and (2) 10-min bout. Results Participants had similar baseline characteristics in both administration groups and no significant difference was found between the two formats in terms of validity (correlations between the GPAQ and accelerometer). For validity, the GPAQ demonstrated fair-to-moderate correlations for moderate-to-vigorous physical activity (MVPA) for self-administration (r_s = 0.30) and interviewer-administration (r_s = 0.46). Findings were similar when considering 10-min activity bouts in the accelerometer analysis for MVPA (r_s = 0.29 vs. 0.42 for self vs. interviewer). Within each mode of administration, the strongest correlations were observed for vigorous-intensity activity. However, Bland-Altman plots illustrated bias toward overestimation for higher levels of MVPA, vigorous- and moderate-intensity activities, and underestimation for lower levels of these measures. Reliability for MVPA revealed moderate correlations (r_s = 0.61 vs. 0.63 for self vs. interviewer). Conclusions Our findings showed comparability between both self- and interviewer-administration modes of the GPAQ. The GPAQ in general but especially the self-administered version may offer a relatively inexpensive method for measuring physical activity of various types and at different domains. However, there may be bias in the GPAQ measurements depending on the overall physical activity. It is advisable to incorporate accelerometers in future studies, particularly when measuring different intensities of physical activity. PMID:26327457
Sebok, Angelia; Wickens, Christopher D
2017-03-01
The objectives were to (a) implement theoretical perspectives regarding human-automation interaction (HAI) into model-based tools to assist designers in developing systems that support effective performance and (b) conduct validations to assess the ability of the models to predict operator performance. Two key concepts in HAI, the lumberjack analogy and black swan events, have been studied extensively. The lumberjack analogy describes the effects of imperfect automation on operator performance. In routine operations, an increased degree of automation supports performance, but in failure conditions, increased automation results in more significantly impaired performance. Black swans are the rare and unexpected failures of imperfect automation. The lumberjack analogy and black swan concepts have been implemented into three model-based tools that predict operator performance in different systems. These tools include a flight management system, a remotely controlled robotic arm, and an environmental process control system. Each modeling effort included a corresponding validation. In one validation, the software tool was used to compare three flight management system designs, which were ranked in the same order as predicted by subject matter experts. The second validation compared model-predicted operator complacency with empirical performance in the same conditions. The third validation compared model-predicted and empirically determined time to detect and repair faults in four automation conditions. The three model-based tools offer useful ways to predict operator performance in complex systems. The three tools offer ways to predict the effects of different automation designs on operator performance.
Sanz, Lorena; Bau, Patricia; Arribas, Ignacio; Rivera, Teresa
2016-09-01
A child's voice is used both as a tool for communication and as a form of emotional expression. Thus, voice disorders suffered by children have negative effects on their quality of life, which can be assessed using the "Pediatric Voice Handicap Index" (P-VHI). This questionnaire is completed by the parents of dysphonic patients and it has been validated in different languages: Italian, Korean, Arabic, and Spanish. More recently, the "Children Voice Handicap Index-10" test (C-VHI-10) was developed and validated, an Italian version reduced into 10 items that is answered by children themselves. The objective of this study was to develop and validate a short Spanish version of the P-VHI (P-VHI-10) and to assess whether it is comparable to the Italian C-VHI-10. We conducted a cross-sectional study on 27 patients between 6-15 years of age. We developed an abbreviated version of the P-VHI that consisted of 10 statements to be answered by parents of children with dysphonia (P-VHI-10). These statements were based on the 10 items with the highest score in the validated Spanish version of the P-VHI. In addition, the validated Italian version of C-VHI-10 was translated into Spanish and this translation was reviewed and modified by three specialists, resulting in an adapted version to be answered by parents (C*-VHI-10). The parents and children included in the study of this index were the same patients as those included in the study to validate the Spanish P-VHI. There were no significant differences in the results obtained with the extended version of the P-VHI (17.4) and with the P-VHI-10 (18.7: Pearson coefficient = 0.602, p < 0.36). A paired student's t-test identified significant differences (p < 0.0001) when comparing the P-VHI-10 and C*-VHI-10, both of which were answered by parents, with average scores of 18.7 and 9.48, respectively. 
Both of these reduced versions have good internal consistency, with a satisfactory Cronbach's alpha coefficient (α = 0.75 for the P-VHI-10 and α = 0.73 for the C*-VHI-10). No statistically significant differences were found when the average total scores of the C-VHI-10 and C*-VHI-10 were compared, with a Pearson's correlation coefficient of 0.559 (p < 0.9). The short version of the P-VHI questionnaire (P-VHI-10) is a clinically valid tool that has good internal consistency. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
van Gestel, Aukje; Severens, Johan L; Webers, Carroll A B; Beckers, Henny J M; Jansonius, Nomdo M; Schouten, Jan S A G
2010-01-01
Discrete event simulation (DES) modeling has several advantages over simpler modeling techniques in health economics, such as increased flexibility and the ability to model complex systems. Nevertheless, these benefits may come at the cost of reduced transparency, which may compromise the model's face validity and credibility. We aimed to produce a transparent report on the construction and validation of a DES model using a recently developed model of ocular hypertension and glaucoma. Current evidence of associations between prognostic factors and disease progression in ocular hypertension and glaucoma was translated into DES model elements. The model was extended to simulate treatment decisions and effects. Utility and costs were linked to disease status and treatment, and clinical and health economic outcomes were defined. The model was validated at several levels. The soundness of design and the plausibility of input estimates were evaluated in interdisciplinary meetings (face validity). Individual patients were traced throughout the simulation under a multitude of model settings to debug the model, and the model was run with a variety of extreme scenarios to compare the outcomes with prior expectations (internal validity). Finally, several intermediate (clinical) outcomes of the model were compared with those observed in experimental or observational studies (external validity) and the feasibility of evaluating hypothetical treatment strategies was tested. The model performed well in all validity tests. Analyses of hypothetical treatment strategies took about 30 minutes per cohort and led to plausible health-economic outcomes. DES models add value when evaluating complex treatment strategies such as those in glaucoma. Achieving transparency in model structure and outcomes may require some effort in reporting and validating the model, but it is feasible.
Validating two questions in the Force Concept Inventory with subquestions
NASA Astrophysics Data System (ADS)
Yasuda, Jun-ichiro; Taniguchi, Masa-aki
2013-06-01
In this study, we evaluate the structural validity of Q.16 and Q.7 in the Force Concept Inventory (FCI). We address whether respondents who answer Q.16 and Q.7 correctly actually have an understanding of the concepts of physics tested in the questions. To examine respondents’ levels of understanding, we use subquestions that test them on concepts believed to be required to answer the actual FCI questions. Our sample size comprises 111 respondents; we derive false-positive ratios for prelearners and postlearners and then statistically test the difference between them. We find a difference at the 0.05 significance level for both Q.16 and Q.7, implying that it is possible for postlearners to answer both questions without an understanding of the concepts of physics tested in the questions; therefore, the structures of Q.16 and Q.7 are invalid. In this study, we only evaluate the validity of these two FCI questions; we do not assess the validity of previous studies that have compared total FCI scores.
Validation of an electronic device for measuring driving exposure.
Huebner, Kyla D; Porter, Michelle M; Marshall, Shawn C
2006-03-01
This study sought to evaluate an on-board diagnostic system (CarChip) for collecting driving exposure data in older drivers. Drivers (N = 20) aged 60 to 86 years from Winnipeg and surrounding communities participated. Information on driving exposure was obtained via the CarChip and global positioning system (GPS) technology on a driving course, and obtained via the CarChip and surveys over a week of driving. Velocities and distances were measured over the road course to validate the accuracy of the CarChip compared to GPS for those parameters. The results show that the CarChip does provide valid distance measurements and slightly lower maximum velocities than GPS measures. From the results obtained in this study, it was determined that retrospective self-reports of weekly driving distances are inaccurate. Therefore, an on-board diagnostic system (OBDII) electronic device like the CarChip can provide valid and detailed information about driving exposure that would be useful for studies of crash rates or driving behavior.
Mills, Jeremy F; Gray, Andrew L
2013-11-01
This study is an initial validation study of the Two-Tiered Violence Risk Estimates instrument (TTV), a violence risk appraisal instrument designed to support an integrated-actuarial approach to violence risk assessment. The TTV was scored retrospectively from file information on a sample of violent offenders. Construct validity was examined by comparing the TTV with instruments that have shown utility to predict violence that were prospectively scored: The Historical-Clinical-Risk Management-20 (HCR-20) and Lifestyle Criminality Screening Form (LCSF). Predictive validity was examined through a long-term follow-up of 12.4 years with a sample of 78 incarcerated offenders. Results show the TTV to be highly correlated with the HCR-20 and LCSF. The base rate for violence over the follow-up period was 47.4%, and the TTV was equally predictive of violent recidivism relative to the HCR-20 and LCSF. Discussion centers on the advantages of an integrated-actuarial approach to the assessment of violence risk.
Siau, Ching Sin; Wee, Lei-Hum; Ibrahim, Norhayati; Visvalingam, Uma; Wahab, Suzaily
2017-01-01
Understanding attitudes toward suicide, especially among healthcare personnel, is an important step in both suicide prevention and treatment. We document the adaptation process and establish the validity and reliability of the Attitudes Toward Suicide (ATTS) questionnaire among 262 healthcare personnel in 2 major public hospitals in the Klang Valley, Malaysia. The findings indicate that healthcare personnel in Malaysia have unique constructs on suicide attitude, compared with the original study on a Western European sample. The adapted Malay ATTS questionnaire demonstrates adequate reliability and validity for use among healthcare personnel in Malaysia.
Siau, Ching Sin; Wee, Lei-Hum; Ibrahim, Norhayati; Visvalingam, Uma; Wahab, Suzaily
2017-01-01
Understanding attitudes toward suicide, especially among healthcare personnel, is an important step in both suicide prevention and treatment. We document the adaptation process and establish the validity and reliability of the Attitudes Toward Suicide (ATTS) questionnaire among 262 healthcare personnel in 2 major public hospitals in the Klang Valley, Malaysia. The findings indicate that healthcare personnel in Malaysia have unique constructs on suicide attitude, compared with the original study on a Western European sample. The adapted Malay ATTS questionnaire demonstrates adequate reliability and validity for use among healthcare personnel in Malaysia. PMID:28486042
Persoskie, Alexander; Nguyen, Anh B.; Kaufman, Annette R.; Tworek, Cindy
2017-01-01
Beliefs about the relative harmfulness of one product compared to another (perceived relative harm) are central to research and regulation concerning tobacco and nicotine-containing products, but techniques for measuring such beliefs vary widely. We compared the validity of direct and indirect measures of perceived harm of e-cigarettes and smokeless tobacco (SLT) compared to cigarettes. On direct measures, participants explicitly compare the harmfulness of each product. On indirect measures, participants rate the harmfulness of each product separately, and ratings are compared. The U.S. Health Information National Trends Survey (HINTS-FDA-2015; N=3738) included direct measures of perceived harm of e-cigarettes and SLT compared to cigarettes. Indirect measures were created by comparing ratings of harm from e-cigarettes, SLT, and cigarettes on 3-point scales. Logistic regressions tested validity by assessing whether direct and indirect measures were associated with criterion variables including: ever-trying e-cigarettes, ever-trying snus, and SLT use status. Compared to the indirect measures, the direct measures of harm were more consistently associated with criterion variables. On direct measures, 26% of adults rated e-cigarettes as less harmful than cigarettes, and 11% rated SLT as less harmful than cigarettes. Direct measures appear to provide valid information about individuals’ harm beliefs, which may be used to inform research and tobacco control policy. Further validation research is encouraged. PMID:28073035
Hobden, Breanne; Schwandt, Melanie L.; Carey, Mariko; Lee, Mary R.; Farokhnia, Mehdi; Bouhlal, Sofia; Oldmeadow, Christopher; Leggio, Lorenzo
2017-01-01
Background The Montgomery-Asberg Depression Rating Scale (MADRS) is commonly used to examine depressive symptoms in clinical settings, including facilities treating patients for alcohol addiction. No studies have examined the validity of the MADRS compared to an established clinical diagnostic tool of depression in this population. This study aimed to examine: 1) the validity of the MADRS compared to a clinical diagnosis of a depressive disorder (using the Structured Clinical Interview for DSM-IV (SCID)) in patients seeking treatment for alcohol dependence (AD); 2) whether the validity of the MADRS differs by type of SCID-based diagnosis of depression; and 3) which items contribute to the optimal predictive model of the MADRS compared to a SCID diagnosis of a depressive disorder. Methods Individuals seeking treatment for AD and admitted to an inpatient unit were administered the MADRS at day 2 of their detoxification program. Clinical diagnoses of AD and depression were made via the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders-IV at the beginning of treatment. Results In total, 803 participants were included in the study. The MADRS demonstrated low overall accuracy relative to the clinical diagnosis of depression with an area under the curve of 0.68. The optimal threshold for balancing sensitivity and specificity identified by the Euclidean distance was >14. This cut-point demonstrated a sensitivity of 66%, a specificity of 60%, a positive predictive value of 50% and a negative predictive value of 75%. The MADRS performed slightly better for major depressive disorders compared to alcohol-induced depression. Items related to lassitude, concentration and appetite slightly decreased the accuracy of the MADRS. Conclusion The MADRS does not appear to be an appropriate substitute for a diagnostic tool among alcohol-dependent patients. 
The MADRS may, however, still be a useful screening tool assuming careful consideration of cut-off scores. PMID:28421616
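The abstract above selects a cut-point by Euclidean distance on the ROC curve and reports predictive values. The sketch below illustrates both steps under stated assumptions: the function names, the toy scores, and the prevalence of 0.38 are hypothetical and are not taken from the study.

```python
import math


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when 'score > cutoff' counts as a positive screen."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s > cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s <= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s <= cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, labels):
    """Cutoff whose (sensitivity, specificity) lies closest to the ideal point (1, 1)."""
    best = None
    for c in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, c)
        dist = math.hypot(1 - sens, 1 - spec)
        if best is None or dist < best[0]:
            best = (dist, c, sens, spec)
    return best[1:]


def predictive_values(sens, spec, prevalence):
    """Positive and negative predictive value from Bayes' rule."""
    ppv = sens * prevalence / (sens * prevalence + (1 - spec) * (1 - prevalence))
    npv = spec * (1 - prevalence) / (spec * (1 - prevalence) + (1 - sens) * prevalence)
    return ppv, npv


# Toy screening scores with a diagnosis flag (1 = diagnosed), for illustration only.
toy_scores = [5, 10, 12, 16, 20, 25]
toy_labels = [0, 0, 0, 1, 1, 1]
cutoff, sens, spec = best_cutoff(toy_scores, toy_labels)

# Reported sensitivity and specificity combined with an assumed prevalence.
ppv, npv = predictive_values(0.66, 0.60, prevalence=0.38)
```

With the assumed prevalence, the computed predictive values land close to the reported 50% and 75%, which shows how strongly both depend on how common the diagnosis is in the sample. A different prevalence would shift them even with sensitivity and specificity held fixed.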
Hobden, Breanne; Schwandt, Melanie L; Carey, Mariko; Lee, Mary R; Farokhnia, Mehdi; Bouhlal, Sofia; Oldmeadow, Christopher; Leggio, Lorenzo
2017-06-01
The Montgomery-Asberg Depression Rating Scale (MADRS) is commonly used to examine depressive symptoms in clinical settings, including facilities treating patients for alcohol addiction. No studies have examined the validity of the MADRS compared to an established clinical diagnostic tool of depression in this population. This study aimed to examine the following: (i) the validity of the MADRS compared to a clinical diagnosis of a depressive disorder (using the Structured Clinical Interview for DSM-IV-TR [SCID-IV-TR]) in patients seeking treatment for alcohol dependence (AD); (ii) whether the validity of the MADRS differs by type of SCID-IV-TR-based diagnosis of depression; and (iii) which items contribute to the optimal predictive model of the MADRS compared to a SCID-IV-TR diagnosis of a depressive disorder. Individuals seeking treatment for AD and admitted to an inpatient unit were administered the MADRS at day 2 of their detoxification program. Clinical diagnoses of AD and depression were made via the SCID-IV-TR at the beginning of treatment. In total, 803 participants were included in the study. The MADRS demonstrated low overall accuracy relative to the clinical diagnosis of depression with an area under the receiver operating characteristic curve of 0.68. The optimal threshold for balancing sensitivity and specificity identified by the Euclidean distance was >14. This cut-point demonstrated a sensitivity of 66%, a specificity of 60%, a positive predictive value of 50%, and a negative predictive value of 75%. The MADRS performed slightly better for major depressive disorders compared to alcohol-induced depression. Items related to lassitude, concentration, and appetite slightly decreased the accuracy of the MADRS. The MADRS does not appear to be an appropriate substitute for a diagnostic tool among alcohol-dependent patients. The MADRS may, however, still be a useful screening tool assuming careful consideration of cut-points. 
Copyright © 2017 by the Research Society on Alcoholism.
Clinical Validation of a Smartphone-Based Adapter for Optic Disc Imaging in Kenya.
Bastawrous, Andrew; Giardini, Mario Ettore; Bolster, Nigel M; Peto, Tunde; Shah, Nisha; Livingstone, Iain A T; Weiss, Helen A; Hu, Sen; Rono, Hillary; Kuper, Hannah; Burton, Matthew
2016-02-01
Visualization and interpretation of the optic nerve and retina are essential parts of most physical examinations. To design and validate a smartphone-based retinal adapter enabling image capture and remote grading of the retina. This validation study compared the grading of optic nerves from smartphone images with those of a digital retinal camera. Both image sets were independently graded at Moorfields Eye Hospital Reading Centre. Nested within the 6-year follow-up (January 7, 2013, to March 12, 2014) of the Nakuru Eye Disease Cohort in Kenya, 1460 adults (2920 eyes) 55 years and older were recruited consecutively from the study. A subset of 100 optic disc images from both methods were further used to validate a grading app for the optic nerves. Data analysis was performed April 7 to April 12, 2015. Vertical cup-disc ratio for each test was compared in terms of agreement (Bland-Altman and weighted κ) and test-retest variability. A total of 2152 optic nerve images were available from both methods (also 371 from the reference camera but not the smartphone, 170 from the smartphone but not the reference camera, and 227 from neither the reference camera nor the smartphone). Bland-Altman analysis revealed a mean difference of 0.02 (95% CI, -0.21 to 0.17) and a weighted κ coefficient of 0.69 (excellent agreement). The grades of an experienced retinal photographer were compared with those of a lay photographer (no health care experience before the study), and no observable difference in image acquisition quality was found. Nonclinical photographers using the low-cost smartphone adapter were able to acquire optic nerve images at a standard that enabled independent remote grading of the images comparable to those acquired using a desktop retinal camera operated by an ophthalmic assistant. The potential for task shifting and the detection of avoidable causes of blindness in the most at-risk communities makes this an attractive public health intervention.
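The agreement analysis above relies on the Bland-Altman method. A minimal sketch of that computation follows, assuming the conventional 1.96 standard deviation limits of agreement; the function name bland_altman and the cup-disc ratios are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Mean difference (bias) and 95% limits of agreement between paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical vertical cup-disc ratios graded from two devices for the same eyes.
smartphone = [0.3, 0.5, 0.4, 0.6]
reference = [0.3, 0.4, 0.5, 0.6]
bias, lower, upper = bland_altman(smartphone, reference)
```

The bias shows whether one method reads systematically higher than the other, and the limits give the range within which most paired differences are expected to fall.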
Cross-cultural adaptation and validation of the Korean version of the neck disability index.
Song, Kyung-Jin; Choi, Byung-Wan; Choi, Byung-Ryeul; Seo, Gyeu-Beom
2010-09-15
Validation of a translated, culturally adapted questionnaire. The purpose of this study is to translate and culturally adapt the Neck Disability Index (NDI) and to validate the use of the derived version in Korean patients. Although several valid measures exist for measurement of neck pain and functional impairment, these measures have not yet been validated in a Korean version. The NDI was linguistically translated into Korean, and the prefinal version was assessed and modified by a pilot study. The reliability and validity of the derived Korean version were examined in 78 patients with degenerative cervical spine disease. Test-retest reliability, internal consistency, and construct validity were investigated by comparing Visual Analogue Scale (VAS) and Short Form Health Survey (SF-36) scores. Factor analysis of the Korean NDI extracted 2 factors with eigenvalues >1. The intraclass-correlation coefficient of test-retest reliability was 0.93. Reliability, estimated by internal consistency, had a Cronbach alpha value of 0.82. The correlation between NDI and VAS scores was r = 0.49, and the correlation between NDI and SF-36 scores was r = -0.44. The physical health component score of SF-36 was highly correlated with NDI, and the correlation between VAS scores and the mental health component scores of SF-36 was high. The derived Korean version of the NDI was found to be a reliable and valid instrument for measuring disability in Korean patients with cervical problems. The authors recommend its use in future Korean clinical studies.
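Internal consistency in the abstract above is summarised with Cronbach's alpha. The sketch below shows the standard formula on a small item-by-respondent table; the function name cronbach_alpha and the responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a questionnaire.

    items: one list of scores per item, each covering the same respondents in order.
    """
    k = len(items)
    totals = [sum(answers) for answers in zip(*items)]  # total score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical responses of four patients to three questionnaire items.
responses = [
    [2, 3, 4, 5],
    [1, 3, 4, 4],
    [2, 2, 5, 5],
]
alpha = cronbach_alpha(responses)
```

Alpha rises when items vary together across respondents, so a high value indicates that the items behave as a coherent scale; it does not by itself show that the scale measures the intended construct.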
ERIC Educational Resources Information Center
Lee, Chul-Joo; Kim, Daniel
2013-01-01
The goals of this study were to validate a number of available collective social capital measures at the US state and county levels, and to examine the relative extent to which these social capital measures are associated with population health outcomes. Measures of social capital at the US state level included aggregate indices based on the…
ERIC Educational Resources Information Center
McLoughlin, M. Padraig M. M.; Bluford, Dontrell A.
2004-01-01
This study investigated the predictive validity of the Descriptive Tests of Mathematical Skills (DTMS) and the SAT-Mathematics (SAT-M) tests as placement tools for entering students in a small, liberal arts, historically black institution (HBI) using regression analysis. The placement schema is four-tiered: for a remedial algebra course, college…
ERIC Educational Resources Information Center
Evans, Linda Garner; Oehler-Stinnett, Judy
2008-01-01
Tornadoes and other natural disasters can lead to anxiety and posttraumatic stress disorder (PTSD) in children. This study provides further validity for the Oklahoma State University Post-Traumatic Stress Disorder Scale-Child Form (OSU PTSDS-CF) by comparing it to the Behavior Assessment System for Children Self-Report of Personality (BASC-SRP).…
Schliep, Karen C.; Schisterman, Enrique F.; Mumford, Sunni L.; Perkins, Neil J.; Ye, Aijun; Pollack, Anna Z.; Zhang, Cuilin; Porucznik, Christina A.; VanDerslice, James A.; Stanford, Joseph B.; Wactawski-Wende, Jean
2013-01-01
Effects of caffeine on women's health are inconclusive, in part because of inadequate exposure assessment. In this study we determined 1) validity of a food frequency questionnaire compared with multiple 24-hour dietary recalls (24HDRs) for measuring monthly caffeine and caffeinated beverage intakes; and 2) validity of the 24HDR compared with the prior day's diary record for measuring daily caffeinated coffee intake. BioCycle Study (2005–2007) participants, women (n = 259) aged 18–44 years from western New York State, were followed for 2 menstrual cycles. Participants completed a food frequency questionnaire at the end of each cycle, four 24HDRs per cycle, and daily diaries. Caffeine intakes reported for the food frequency questionnaires were greater than those reported for the 24HDRs (mean = 114.1 vs. 92.6 mg/day, P = 0.01) but showed high correlation (r = 0.73, P < 0.001) and moderate agreement (κ = 0.51, 95% confidence interval: 0.43, 0.57). Women reported less caffeinated coffee intake in their 24HDRs compared with their corresponding diary days (mean = 0.51 vs. 0.80 cups/day, P < 0.001) (1 cup = 237 mL). Although caffeine and coffee exposures were highly correlated, absolute intakes differed significantly between measurement tools. These results highlight the importance of considering potential misclassification of caffeine exposure. PMID:23462965
Sturm, Jonathan W; Osborne, Richard H; Dewey, Helen M; Donnan, Geoffrey A; Macdonell, Richard A L; Thrift, Amanda G
2002-12-01
Generic utility health-related quality of life instruments are useful in assessing stroke outcome because they facilitate a broader description of the disease and outcomes, allow comparisons between diseases, and can be used in cost-benefit analysis. The aim of this study was to validate the Assessment of Quality of Life (AQoL) instrument in a stroke population. Ninety-three patients recruited from the community-based North East Melbourne Stroke Incidence Study between July 13, 1996, and April 30, 1997, were interviewed 3 months after stroke. Validity of the AQoL was assessed by examining associations between the AQoL and comparator instruments: the Medical Outcomes Short-Form Health Survey (SF-36); London Handicap Scale; Barthel Index; National Institutes of Health Stroke Scale; and Irritability, Depression, Anxiety scale. Sensitivity of the AQoL was assessed by comparing AQoL scores from groups of patients categorized by severity of impairment and disability and with total anterior circulation syndrome (TACS) versus non-TACS. Predictive validity was assessed by examining the association between 3-month AQoL scores and outcomes of death or institutionalization 12 months after stroke. Overall AQoL utility scores and individual dimension scores were most highly correlated with relevant scales on the comparator instruments. AQoL scores clearly differentiated between patients in categories of severity of impairment and disability and between patients with TACS and non-TACS. AQoL scores at 3 months after stroke predicted death and institutionalization at 12 months. The AQoL demonstrated strong psychometric properties and appears to be a valid and sensitive measure of health-related QoL after stroke.
Sandberg, Rory P; Sherman, Nathan C; Latt, L Daniel; Hardy, Jolene C
2017-11-01
The goal of this study was to validate the cigar box arthroscopy trainer (CBAT) as a training tool and then compare its effectiveness to didactic training and to another previously validated low-fidelity but anatomic model, the anatomic knee arthroscopy trainer (AKAT). A nonanatomic knee arthroscopy training module was developed at our institution. Twenty-four medical students with no prior arthroscopic or laparoscopic experience were enrolled as subjects. Eight subjects served as controls. The remaining 16 subjects were randomized to participate in 4 hours of either the CBAT or a previously validated AKAT. Subjects' skills were assessed by 1 of 2 faculty members through repeated attempts at performing a diagnostic knee arthroscopy on a cadaveric specimen. Objective scores were given using a minimally adapted version of the Basic Arthroscopic Knee Skill Scoring System. Total cost differences were calculated. Seventy-five percent of subjects in the CBAT and AKAT groups succeeded in reaching minimum proficiency in the allotted time compared with 25% in the control group (P < .05). There was no significant difference in the number of attempts to reach proficiency between the CBAT and AKAT groups. The cost to build the CBAT was $44.12, whereas the cost was $324.33 for the AKAT. This pilot study suggests the CBAT is an effective knee arthroscopy trainer that may decrease the learning curve of residents without significant cost to a residency program. This study demonstrates the need for an agreed-upon objective scoring system to properly evaluate residents and compare the effectiveness of different training tools. Copyright © 2017 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
Scopolamine provocation-based pharmacological MRI model for testing procognitive agents.
Hegedűs, Nikolett; Laszy, Judit; Gyertyán, István; Kocsis, Pál; Gajári, Dávid; Dávid, Szabolcs; Deli, Levente; Pozsgay, Zsófia; Tihanyi, Károly
2015-04-01
There is a huge unmet need to understand and treat pathological cognitive impairment. The development of disease-modifying cognitive enhancers is hindered by the lack of a correct pathomechanism and of suitable animal models. Most animal models used to study cognition and pathology do not fulfil the predictive validity, face validity, or construct validity criteria, and their outcome measures differ greatly from those of human trials. Fortunately, some pharmacological agents such as scopolamine evoke similar effects on cognition and cerebral circulation in rodents and humans, and functional MRI enables us to compare cognitive agents directly in different species. In this paper we report the validation of a scopolamine-based rodent pharmacological MRI provocation model. The effects of putative procognitive agents (donepezil, vinpocetine, piracetam, and the alpha 7 selective cholinergic compounds EVP-6124 and PNU-120596) were compared on the blood-oxygen-level dependent responses and also linked to rodent cognitive models. All of these drugs except piracetam showed a significant effect on the scopolamine-induced blood-oxygen-level dependent change. In the water labyrinth test only PNU-120596 did not show a significant effect. This provocation model is suitable for testing procognitive compounds. These functional MR imaging experiments can be paralleled with human studies, which may help reduce the number of false cognitive clinical trials. © The Author(s) 2015.
The validity and utility of subtyping bulimia nervosa.
van Hoeken, Daphne; Veling, Wim; Sinke, Sjoukje; Mitchell, James E; Hoek, Hans W
2009-11-01
To review the evidence for the validity and utility of subtyping bulimia nervosa (BN) into a purging (BN-P) and a nonpurging subtype (BN-NP), and of distinguishing BN-NP from binge eating disorder (BED), by comparing course, complications, and treatment. A literature search of psychiatry databases for studies published in peer-reviewed journals that used the DSM-definitions of BN and BED, and included both individuals with BN-NP and individuals with BN-P and/or BED. Twenty-three studies compared individuals with BN-NP (N = 671) to individuals with BN-P (N = 1795) and/or individuals with BED (N = 1921), two of which reported on course, 12 on comorbidity and none on treatment response-the indicators for validity and clinical utility. The differences found were mainly quantitative rather than qualitative, suggesting a gradual difference in severity from BN-P (most severe) through BN-NP to BED (least severe). None of the comparisons provided convincing evidence for the validity or utility of the BN-NP diagnosis. Three options for the position of BN-NP in DSM-V were suggested: (1) maintaining the BN-NP subtype, (2) dropping nonpurging compensatory behavior as a criterion for BN, so that individuals currently designated as having BN-NP would be designated as having BED, and (3) including BN-NP in a broad BN category.
Validity of a New Patient Engagement Measure: The Altarum Consumer Engagement (ACE) Measure.
Duke, Christopher C; Lynch, Wendy D; Smith, Brad; Winstanley, Julie
2015-12-01
The objective of this study was to report on the validation of new scales [called the Altarum Consumer Engagement (ACE) Measure] that are indicative of an individual's engagement in health and healthcare decisions. The instrument was created to broaden the scope of how engagement is measured and understood, and to update the concept of engagement to include modern information sources, such as online health resources and ratings of providers and patient health. Data were collected through an online survey with a US population of 2079 participants. A combination of Principal Component Analysis (PCA) and detailed Rasch analyses was conducted to identify specific subscales of engagement. Results were compared to another commonly used survey instrument, and outcomes were compared for construct validity. The PCA identified a four-factor structure composed of 21 items. The factors were named Commitment, Informed Choice, Navigation, and Ownership. Rasch analyses confirmed scale stability. Relevant outcomes, such as health status, lifestyle behaviors, and medication adherence, were correlated in the expected direction, and expected group differences were observed. This study confirmed the validity of the new ACE Measure and its utility in screening for and finding group differences in activities related to patient engagement and health consumerism, such as using provider comparison tools and asking about medical costs.
The reliability and validity of a sexual functioning questionnaire.
Corty, E W; Althof, S E; Kurit, D M
1996-01-01
The present study assessed the reliability and validity of a measure of sexual functioning, the CMSH-SFQ, for male patients and their partners. The CMSH-SFQ measures erectile and orgasmic functioning, sexual drive, frequency of sexual behavior, and sexual satisfaction. Test-retest reliability was assessed with 19 males and 19 females for the baseline CMSH-SFQ. Criterion validity was measured by comparing the answers of 25 male patients to those of their partners at baseline and follow-up. The majority of items had acceptable levels of reliability and validity. The CMSH-SFQ provides a reliable and valid device that can be used to measure global sexual functioning in men and their partners and may be used to evaluate the efficacy of treatments for sexual dysfunctions. Limitations and suggestions for use of the CMSH-SFQ are addressed.
Code of Federal Regulations, 2013 CFR
2013-04-01
... studies (including proposals for such studies), assay validation data, final release testing on the last... route of administration; (4) Make a comparative efficacy claim naming another drug product; (5... based on at least one adequate and well-controlled clinical study. FDA means the Food and Drug...
Code of Federal Regulations, 2010 CFR
2010-04-01
... studies (including proposals for such studies), assay validation data, final release testing on the last... route of administration; (4) Make a comparative efficacy claim naming another drug product; (5... based on at least one adequate and well-controlled clinical study. FDA means the Food and Drug...
Code of Federal Regulations, 2012 CFR
2012-04-01
... studies (including proposals for such studies), assay validation data, final release testing on the last... route of administration; (4) Make a comparative efficacy claim naming another drug product; (5... based on at least one adequate and well-controlled clinical study. FDA means the Food and Drug...
Code of Federal Regulations, 2011 CFR
2011-04-01
... studies (including proposals for such studies), assay validation data, final release testing on the last... route of administration; (4) Make a comparative efficacy claim naming another drug product; (5... based on at least one adequate and well-controlled clinical study. FDA means the Food and Drug...
Code of Federal Regulations, 2014 CFR
2014-04-01
... studies (including proposals for such studies), assay validation data, final release testing on the last... route of administration; (4) Make a comparative efficacy claim naming another drug product; (5... based on at least one adequate and well-controlled clinical study. FDA means the Food and Drug...
Pereira, Taísa Sabrina Silva; Cade, Nágela Valadão; Mill, José Geraldo; Sichieri, Rosely; Molina, Maria del Carmen Bisi
2016-01-01
Introduction Biomarkers are a good choice for the validation of food frequency questionnaires due to the independence of their random errors. Objective To assess the validity of the potassium and sodium intake estimated using the Food Frequency Questionnaire ELSA-Brasil. Subjects/Methods A subsample of participants in the ELSA-Brasil cohort was included in this study in 2009. Sodium and potassium intake were estimated using three methods: semi-quantitative food frequency questionnaire, 12-hour nocturnal urinary excretion, and three 24-hour food records. Correlation coefficients were calculated between the methods, and the validity coefficient was calculated using the method of triads. The 95% confidence intervals for the validity coefficient were estimated using bootstrap sampling. Exact and adjacent agreement and disagreement of the estimated sodium and potassium intake quintiles were compared among the three methods. Results The sample consisted of 246 participants, aged 53±8 years, 52% women. Validity coefficients for sodium were considered weak (ρ [food frequency questionnaire, actual intake] = 0.37 and ρ [biomarker, actual intake] = 0.21) and moderate (ρ [food records, actual intake] = 0.56). The validity coefficients were higher for potassium (ρ [food frequency questionnaire, actual intake] = 0.60; ρ [biomarker, actual intake] = 0.42; ρ [food records, actual intake] = 0.79). Conclusions The Food Frequency Questionnaire ELSA-Brasil showed good validity in estimating potassium intake in epidemiological studies. For sodium, validity was weak, likely due to the non-quantification of the salt added to prepared food. PMID:28030625
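The method of triads mentioned in this abstract derives each validity coefficient from the three pairwise correlations among the questionnaire (Q), the reference method (R), and the biomarker (B). The sketch below shows the standard formulas only. The abstract does not report the pairwise correlations, so the inputs are hypothetical, as is the function name, and the bootstrap confidence intervals used by the authors are not reproduced.

```python
import math

def triad_validity(r_qr, r_qb, r_rb):
    """Validity coefficients from the method of triads.

    r_qr: correlation between questionnaire and reference method
    r_qb: correlation between questionnaire and biomarker
    r_rb: correlation between reference method and biomarker
    Returns (rho_q, rho_r, rho_b), the estimated correlations of each
    method with the unobserved true intake. All inputs must be positive.
    """
    rho_q = math.sqrt(r_qr * r_qb / r_rb)
    rho_r = math.sqrt(r_qr * r_rb / r_qb)
    rho_b = math.sqrt(r_qb * r_rb / r_qr)
    return rho_q, rho_r, rho_b

# Hypothetical pairwise correlations (illustrative only, not study data)
rho_q, rho_r, rho_b = triad_validity(r_qr=0.40, r_qb=0.20, r_rb=0.50)
print(f"questionnaire {rho_q:.2f}, records {rho_r:.2f}, biomarker {rho_b:.2f}")
```

With real data an estimate can exceed 1 when one pairwise correlation is small relative to the other two, which is one reason resampling-based intervals are commonly reported alongside the point estimates.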
The Self-Stigma of Depression Scale: Translation and Validation of the Arabic Version
Darraj, Hussain Ahmed; Mahfouz, Mohamed Salih; Al Sanosi, Rashad Mohamed; Badedi, Mohammed; Sabai, Abdullah
2017-01-01
Background: Self-stigma may feature strongly and be detrimental for people with depression, but the understanding of its nature and prevalence is limited by the lack of psychometrically validated measures. This study aimed to validate the Arabic version of the Self-Stigma of Depression Scale (SSDS) among adolescents. Materials and Methods: A cross-sectional study involved 100 randomly selected adolescents. The analyses included face validation, factor analysis, and reliability testing. A test–retest was conducted within a 2-week interval. Results: The mean score for self-stigma of depression among study participants was 68.9 (standard deviation = 8.76), the median was 71, and the range was 47. Descriptive analysis showed that fewer participants scored below the mean score (41.7%) than above it (58.3%). Preliminary construct validation analysis confirmed that factor analysis was appropriate for the Arabic-translated version of the SSDS. Furthermore, the factor analysis showed similar factor loadings to the original English version. The internal consistency of the translated version, measured by Cronbach's alpha, ranged from 0.70 to 0.77 for the four subscales and was 0.84 for the total scale. Test–retest reliability was assessed in 65 respondents after 2 weeks. Cronbach's alphas ranged from 0.70 to 0.77 for the four subscales and 0.84 for the total scale. Conclusions: Face validity, construct validity, and reliability analysis were found satisfactory for the Arabic-translated version of the SSDS. The Arabic-translated version of the SSDS was found valid and reliable to be used in future studies, with comparable properties to the original version and to previous studies. PMID:28149090
Development and validation of a cerebral oximeter capable of absolute accuracy.
MacLeod, David B; Ikeda, Keita; Vacchiano, Charles; Lobbestael, Aaron; Wahr, Joyce A; Shaw, Andrew D
2012-12-01
Cerebral oximetry may be a valuable monitor, but few validation data are available, and most report the change from baseline rather than absolute accuracy, which may be affected by individuals whose oximetric values are outside the expected range. The authors sought to develop and validate a cerebral oximeter capable of absolute accuracy. An in vivo research study. A university human physiology laboratory. Healthy human volunteers were enrolled in calibration and validation studies of 2 cerebral oximetric sensors, the Nonin 8000CA and 8004CA. The 8000CA validation study identified 5 individuals with atypical cerebral oxygenation values; their data were used to design the 8004CA sensor, which subsequently underwent calibration and validation. Volunteers were taken through a stepwise hypoxia protocol to a minimum saturation of peripheral oxygen. Arteriovenous saturation (70% jugular bulb venous saturation and 30% arterial saturation) at 6 hypoxic plateaus was used as the reference value for the cerebral oximeter. Absolute accuracy was defined using a combination of the bias and precision of the paired saturations (A(RMS)). In the validation study for the 8000CA sensor (n = 9, 106 plateaus), relative accuracy was an A(RMS) of 2.7, with an absolute accuracy of 8.1, meeting the criteria for a relative (trend) monitor, but not an absolute monitor. In the validation study for the 8004CA sensor (n = 11, 119 plateaus), the A(RMS) of the 8004CA was 4.1, meeting the prespecified success criterion of <5.0. The Nonin cerebral oximeter using the 8004CA sensor can provide absolute data on regional cerebral saturation compared with arteriovenous saturation, even in subjects previously shown to have values outside the normal population distribution curves. Copyright © 2012 Elsevier Inc. All rights reserved.
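The two quantities defined in this abstract, the weighted arteriovenous reference saturation and the A(RMS) accuracy figure, can be sketched as follows. The 70/30 weighting comes from the abstract. Treating A(RMS) as the root mean square of the paired differences, which combines bias and precision, is the usual definition and is assumed here. The function names and saturation values are hypothetical.

```python
import math
import statistics

def reference_saturation(jugular, arterial):
    """Weighted reference value: 70% jugular bulb venous, 30% arterial."""
    return 0.7 * jugular + 0.3 * arterial

def a_rms(device, reference):
    """Root mean square of the paired differences (device minus reference)."""
    diffs = [d - r for d, r in zip(device, reference)]
    return math.sqrt(statistics.fmean([x * x for x in diffs]))

# Hypothetical saturations in percent at two plateaus (illustrative only)
jugular = [60.0, 60.0]
arterial = [90.0, 80.0]
device = [70.0, 66.0]

reference = [reference_saturation(j, a) for j, a in zip(jugular, arterial)]
accuracy = a_rms(device, reference)
print(f"A_RMS = {accuracy:.2f}")
```

Because the squared RMS difference equals the squared bias plus the population variance of the differences, a sensor can track changes well (small spread) and still have a large A(RMS) if its bias is large.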
Mohammadifard, Noushin; Sajjadi, Firouzeh; Maghroun, Maryam; Alikhasi, Hassan; Nilforoushzadeh, Farzaneh; Sarrafzadegan, Nizal
2015-03-01
Dietary assessment is the first step of dietary modification in community-based interventional programs. This study was performed to validate a simple food frequency questionnaire (SFFQ) for assessment of selected food items in epidemiological studies with a large sample size as well as community trials. This validation study was carried out on 264 healthy adults aged ≥ 41 years living in 3 districts of central Iran: Isfahan, Najafabad, and Arak. Selected food intakes were assessed using a 48-item food frequency questionnaire (FFQ). The FFQ was interviewer-administered and was completed twice: at the beginning of the study and 2 weeks thereafter. The validity of this SFFQ was examined against the amounts estimated by a single 24 h dietary recall and a 2-day dietary record. Validity was evaluated using Spearman correlation coefficients between the daily frequency of consumption of food groups as assessed by the FFQ and the qualitative amount of daily food group intake assessed by the dietary reference method. Intraclass correlation coefficients (ICC) were used to determine the reproducibility. Spearman correlation coefficients between the food group intakes estimated by the examined and reference methods ranged from 0.105 (P = 0.378) for pickles to 0.48 (P < 0.001) for plant protein. ICCs for reproducibility of the FFQ were between 0.47 and 0.69 across food groups (P < 0.001). The designed SFFQ has good relative validity and reproducibility for assessment of selected food group intakes. Thus, it can serve as a valid tool in epidemiological studies and clinical trials with large numbers of participants.
Jung, Sung-Hoon; Kwon, Oh-Yun; Jeon, In-Cheol; Hwang, Ui-Jae; Weon, Jong-Hyuck
2018-01-01
The purposes of this study were to determine the intra-rater test-retest reliability of a smart phone-based measurement tool (SBMT) and a three-dimensional (3D) motion analysis system for measuring the transverse rotation angle of the pelvis during single-leg lifting (SLL) and the criterion validity of the transverse rotation angle of the pelvis measurement using SBMT compared with a 3D motion analysis system (3DMAS). Seventeen healthy volunteers performed SLL with their dominant leg without bending the knee until they reached a target placed 20 cm above the table. This study used a 3DMAS, considered the gold standard, to measure the transverse rotation angle of the pelvis to assess the criterion validity of the SBMT measurement. Intra-rater test-retest reliability was determined using the SBMT and 3DMAS using intra-class correlation coefficient (ICC) [3,1] values. The criterion validity of the SBMT was assessed with ICC [3,1] values. Both the 3DMAS (ICC = 0.77) and SBMT (ICC = 0.83) showed excellent intra-rater test-retest reliability in the measurement of the transverse rotation angle of the pelvis during SLL in a supine position. Moreover, the SBMT showed an excellent correlation with the 3DMAS (ICC = 0.99). Measurement of the transverse rotation angle of the pelvis using the SBMT showed excellent reliability and criterion validity compared with the 3DMAS.
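The ICC [3,1] statistic used in this abstract (two-way mixed model, consistency, single measurement) can be computed from the mean squares of a two-way ANOVA. The following is a minimal sketch of that standard formula; the function name and the angle values are hypothetical and are not study data.

```python
def icc_3_1(ratings):
    """ICC(3,1) for a table with one row per subject and k measurements per row."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, measurements, residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Hypothetical pelvic rotation angles in degrees: test and retest per subject
angles = [
    [4.1, 4.4],
    [6.0, 5.7],
    [2.9, 3.3],
    [7.2, 7.0],
    [5.1, 5.4],
]
icc = icc_3_1(angles)
print(f"ICC(3,1) = {icc:.2f}")
```

Because ICC(3,1) measures consistency, a constant offset between the two sets of measurements does not lower it; only disagreement in the ordering or spacing of subjects does.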
Vázquez Peña, Fernando; Harzheim, Erno; Terrasa, Sergio; Berra, Silvina
2017-02-01
To validate the Brazilian short version of the PCAT for adult patients in Spanish. Analysis of secondary data from studies conducted to validate the extended version of the PCAT questionnaire. City of Córdoba, Argentina. Primary health care. The sample consisted of parents whose children were enrolled in secondary education in three institutes in the city of Cordoba (46%) and adult users of the National University of Cordoba Health Insurance (the remaining 54%). Pearson's correlation coefficient comparing the extended and short versions. Goodness-of-fit indices in confirmatory factor analysis, composite reliability, average variance extracted, and Cronbach's alpha values, in order to assess the construct validity and the reliability of the short version. Pearson's correlation coefficient between this short version and the long version was high (.818; P<.001), implying very good criterion validity. The global goodness-of-fit indices in the confirmatory factor analysis were good. Composite reliability was good (.802), but the average variance extracted was low (.3306), since 3 variables had weak factor loadings. Cronbach's alpha was acceptable (.85). The short version of the PCAT-users developed in Brazil showed an acceptable psychometric performance in Spanish as a quick assessment tool, in a comparative study with the extended version. Copyright © 2016 Elsevier España, S.L.U. All rights reserved.
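Composite reliability and average variance extracted, both reported in this abstract, are computed from standardized factor loadings. The sketch below uses the conventional formulas and assumes uncorrelated error terms. The function names and loadings are hypothetical, since the abstract does not list the item loadings.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized loadings, uncorrelated errors assumed."""
    sum_loadings_sq = sum(loadings) ** 2
    # Error variance of each item is 1 minus its squared standardized loading
    error_var = sum(1 - l ** 2 for l in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_var)

def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized loadings (illustrative only, not study data)
loadings = [0.72, 0.65, 0.58, 0.35, 0.30, 0.28]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
```

A few weak loadings pull the average variance extracted down much faster than they pull composite reliability down, which matches the pattern described in the abstract.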
Study on evaluation methods for Rayleigh wave dispersion characteristic
Shi, L.; Tao, X.; Kayen, R.; Shi, H.; Yan, S.
2005-01-01
The evaluation of Rayleigh wave dispersion characteristics is the key step in detecting S-wave velocity structure. By comparing the dispersion curves directly with those from the spectral analysis of surface waves (SASW) method, rather than comparing the S-wave velocity structures, the validity and precision of the microtremor-array method (MAM) can be evaluated more objectively. The results from the China-US joint surface wave investigation at 26 sites in Tangshan, China, show that the MAM has the same precision as the SASW method at 83% of the 26 sites. The MAM is valid for testing Rayleigh wave dispersion characteristics and has great potential for application in detecting site S-wave velocity structure.
Wise, Edward A; Streiner, David L
2010-12-01
There is a lack of normative data on broadband omnibus types of personality tests with medical populations. In fact, the only two tests normed on medical populations are the Millon Behavioral Medicine Diagnostic (MBMD) and the Millon Behavioral Health Inventory (MBHI). The internal consistency, test-retest reliabilities, and validity studies of these instruments are reviewed and compared in an effort to aid clinicians in discerning their relative psychometric strengths and weaknesses. Due to the lack of validity studies with the MBMD and the fact that reliability limits the ceiling of validity coefficients, the MBMD has yet to meet the challenges it was designed to meet. Implications for practice are addressed. © 2010 Wiley Periodicals, Inc.
Three-factor structure for Epistemic Belief Inventory: A cross-validation study
2017-01-01
Research on epistemic beliefs has been hampered by a lack of validated models and measurement instruments. The most widely used instrument is the Epistemological Questionnaire, which has been criticized for its validity, and a new instrument based on the Epistemological Questionnaire has been proposed: the Epistemic Belief Inventory. The Spanish-language version of the Epistemic Belief Inventory was applied to 1,785 Chilean high school students. Exploratory and confirmatory factor analyses in independent subsamples were performed. A three-factor structure emerged and was confirmed. Reliability was comparable to other studies, and the factor structure was invariant among randomized subsamples. The structure that was found does not replicate the one proposed originally, but results are interpreted in light of the embedded systemic model of epistemological beliefs. PMID:28278258
Cook, Karon F; Jensen, Sally E; Schalet, Benjamin D; Beaumont, Jennifer L; Amtmann, Dagmar; Czajkowski, Susan; Dewalt, Darren A; Fries, James F; Pilkonis, Paul A; Reeve, Bryce B; Stone, Arthur A; Weinfurt, Kevin P; Cella, David
2016-05-01
To present an overview of a series of studies in which the clinical validity of the National Institutes of Health's Patient Reported Outcome Measurement Information System (NIH; PROMIS) measures was evaluated, by domain, across six clinical populations. Approximately 1,500 individuals at baseline and 1,300 at follow-up completed PROMIS measures. The analyses reported in this issue were conducted post hoc, pooling data across six previous studies, and accommodating the different designs of the six, within-condition, parent studies. Changes in T-scores, standardized response means, and effect sizes were calculated in each study. When a parent study design allowed, known groups validity was calculated using a linear mixed model. The results provide substantial support for the clinical validity of nine PROMIS measures in a range of chronic conditions. The cross-condition focus of the analyses provided a unique and multifaceted perspective on how PROMIS measures function in "real-world" clinical settings and provides external anchors that can support comparative effectiveness research. The current body of clinical validity evidence for the nine PROMIS measures indicates the success of NIH PROMIS in developing measures that are effective across a range of chronic conditions. Copyright © 2016 Elsevier Inc. All rights reserved.
Payne, Philip R O; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B
2009-03-01
The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.
Testing and Validation of Computational Methods for Mass Spectrometry.
Gatto, Laurent; Hansen, Kasper D; Hoopmann, Michael R; Hermjakob, Henning; Kohlbacher, Oliver; Beyer, Andreas
2016-03-04
High-throughput methods based on mass spectrometry (proteomics, metabolomics, lipidomics, etc.) produce a wealth of data that cannot be analyzed without computational methods. The impact of the choice of method on the overall result of a biological study is often underappreciated, but different methods can result in very different biological findings. It is thus essential to evaluate and compare the correctness and relative performance of computational methods. The volume of the data as well as the complexity of the algorithms render unbiased comparisons challenging. This paper discusses some problems and challenges in testing and validation of computational methods. We discuss the different types of data (simulated and experimental validation data) as well as different metrics to compare methods. We also introduce a new public repository for mass spectrometric reference data sets ( http://compms.org/RefData ) that contains a collection of publicly available data sets for performance evaluation for a wide range of different methods.
Ruuska, Salla; Hämäläinen, Wilhelmiina; Kajava, Sari; Mughal, Mikaela; Matilainen, Pekka; Mononen, Jaakko
2018-03-01
The aim of the present study was to evaluate empirically confusion matrices in device validation. We compared the confusion matrix method to linear regression and error indices in the validation of a device measuring feeding behaviour of dairy cattle. In addition, we studied how to extract additional information on classification errors with confusion probabilities. The data consisted of 12 h behaviour measurements from five dairy cows; feeding and other behaviour were detected simultaneously with a device and from video recordings. The resulting 216 000 pairs of classifications were used to construct confusion matrices and calculate performance measures. In addition, hourly durations of each behaviour were calculated and the accuracy of measurements was evaluated with linear regression and error indices. All three validation methods agreed when the behaviour was detected very accurately or inaccurately. Otherwise, in the intermediate cases, the confusion matrix method and error indices produced relatively concordant results, but the linear regression method often disagreed with them. Our study supports the use of confusion matrix analysis in validation since it is robust to any data distribution and type of relationship, it makes a stringent evaluation of validity, and it offers extra information on the type and sources of errors. Copyright © 2018 Elsevier B.V. All rights reserved.
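As a rough illustration of the confusion matrix method described in this abstract, the following minimal sketch tallies paired device/video classifications and derives common performance measures. The function names, the label strings, and the eight sample pairs are hypothetical and are not taken from the study.

```python
from collections import Counter


def confusion_counts(device, reference, positive="feeding"):
    """Count TP, FP, FN, TN for paired device/reference labels."""
    if len(device) != len(reference):
        raise ValueError("label sequences must have equal length")
    counts = Counter()
    for d, r in zip(device, reference):
        if d == positive and r == positive:
            counts["tp"] += 1      # device and video both say feeding
        elif d == positive:
            counts["fp"] += 1      # device says feeding, video does not
        elif r == positive:
            counts["fn"] += 1      # device misses feeding seen on video
        else:
            counts["tn"] += 1      # both say other behaviour
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def performance(tp, fp, fn, tn):
    """Derive performance measures; assumes no denominator is zero."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical second-by-second labels, not data from the study.
video = ["feeding", "feeding", "other", "other",
         "feeding", "other", "other", "feeding"]
device = ["feeding", "other", "other", "feeding",
          "feeding", "other", "other", "feeding"]

tp, fp, fn, tn = confusion_counts(device, video)
measures = performance(tp, fp, fn, tn)
print(tp, fp, fn, tn, measures)
```

Because the measures are computed from counts alone, this approach makes no assumption about the distribution of the underlying measurements, which is consistent with the robustness the authors describe.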
Scarponi, Letizia; de Felicio, Claudia Maria; Sforza, Chiarella; Pimenta Ferreira, Claudia Lucia; Ginocchio, Daniela; Pizzorni, Nicole; Barozzi, Stefania; Mozzanica, Francesco; Schindler, Antonio
2018-05-30
To evaluate the reliability, validity, and responsiveness of the Italian OMES (I-OMES). The study consisted of 3 phases: (1) internal consistency and reliability, (2) validity, and (3) responsiveness analysis. The recruited population included 27 patients with orofacial myofunctional disorders (OMD) and 174 healthy volunteers. Forty-seven subjects, 18 healthy and all recruited patients with OMD were assessed for inter-rater and test-retest reliability analysis. I-OMES and Nordic Orofacial Test - Screening (NOT-S) scores of the patients were correlated for concurrent validity analysis. I-OMES scores from 27 patients with OMD and 27 age- and gender-matched healthy subjects were compared to investigate construct validity. I-OMES scores before and after successful swallowing rehabilitation in patients were compared for responsiveness analysis. Adequate internal consistency (Cronbach α = 0.71) and strong inter-rater and test-retest reliability (intraclass correlation coefficient = 0.97 and 0.98, respectively) were found. I-OMES and NOT-S scores significantly and inversely correlated (r = -0.38). A statistically significant difference (p < 0.001) was found between the pathological group and the control group for the total I-OMES score. The mean I-OMES score improved from 90 (78-102) to 99 (89-103) after myofunctional rehabilitation (p < 0.001). The I-OMES is a reliable and valid tool to evaluate OMD. © 2018 S. Karger AG, Basel.
Validity of the MMPI Personality Disorder scales (MMPI-PD).
Schuler, C E; Snibbe, J R; Buckwalter, J G
1994-03-01
The MMPI Personality Disorder scales, developed by Morey, Waugh, and Blashfield (1985), were validated on an inpatient population by comparing 104 patients' MMPI-PD scores with the MCMI and with DSM-III-R diagnosis. Conservative significance levels were used to ensure more valid conclusions. Schizoid, Avoidant, Dependent, Histrionic, and Narcissistic scales were correlated significantly. Passive-Aggressive, Schizotypal, and Borderline scales did not correlate with corresponding MCMI scales. The MMPI-PD nonoverlapping scales were most effective in predicting diagnosis, specifically the Personality Disorder NOS, Eccentric and Borderline groups. The overlapping scales were not as effective in predicting diagnosis, but best predicted the Eccentric and Borderline groups. This study provides support for the validity of specific scales and circumscribed diagnostic utility for both measures.
Validation Test Results for Orthogonal Probe Eddy Current Thruster Inspection System
NASA Technical Reports Server (NTRS)
Wincheski, Russell A.
2007-01-01
Recent nondestructive evaluation efforts within NASA have focused on an inspection system for the detection of intergranular cracking originating in the relief radius of Primary Reaction Control System (PRCS) Thrusters. Of particular concern is deep cracking in this area which could lead to combustion leakage in the event of through wall cracking from the relief radius into an acoustic cavity of the combustion chamber. In order to reliably detect such defects while ensuring minimal false positives during inspection, the Orthogonal Probe Eddy Current (OPEC) system has been developed and an extensive validation study performed. This report describes the validation procedure, sample set, and inspection results as well as comparing validation flaws with the response from naturally occurring damage.
Biodemography of Exceptional Longevity: Early-life and Mid-life predictors of Human Longevity
Gavrilov, Leonid A.; Gavrilova, Natalia S.
2011-01-01
Effects of early-life and middle-life conditions on exceptional longevity are explored in this study using two matched case-control studies. The first study compares 198 validated centenarians born in the United States in 1890-1893 to their shorter-lived siblings. Family histories of centenarians were reconstructed and exceptional longevity validated using early U.S. censuses, Social Security Administration Death Master File, state death indexes, online genealogies and other supplementary data resources. Siblings born to young mothers (<25 years) had significantly higher chances of living to 100 compared to siblings born to older mothers (odds ratio = 2.03, 95% CI = 1.33 - 3.11, P = 0.001) while paternal age and birth order were not associated with exceptional longevity. The second study explores whether people living to 100 and beyond are any different in physical characteristics at a young age from their shorter-lived peers. A random representative sample of 240 men who were born in 1887 and survived to age 100 was selected from the US Social Security Administration database and linked to the US WWI civil draft registration cards collected in 1917 when these men were 30 years old. These validated centenarians were then compared to randomly selected controls matched by calendar year of birth, race and place of draft registration in 1917. It was found that ‘stout’ body build (being in the heaviest 15% of population) was negatively associated with survival to age 100 years. Farmer occupation and large number of children (4+) at age 30 increased the chances of exceptional longevity. A detailed description of dataset development, data cleaning procedure and validation of exceptional longevity is provided for both studies. These results demonstrate that matched case-control design is a useful approach in exploring effects of early-life conditions and middle-life characteristics on exceptional longevity. PMID:22582891
Portuguese Children's Sleep Habits Questionnaire - validation and cross-cultural comparison.
Silva, Filipe Glória; Silva, Cláudia Rocha; Braga, Lígia Barbosa; Neto, Ana Serrão
2014-01-01
To validate the Portuguese version of the Children's Sleep Habits Questionnaire (CSHQ-PT) and compare it to the versions from other countries. The questionnaire was previously adapted to the Portuguese language according to international guidelines. 500 questionnaires were delivered to the parents of a Portuguese community sample of children aged 2 to 10 years old. 370 (74%) valid questionnaires were obtained, 55 children met exclusion criteria and 315 entered the validation study. The CSHQ-PT internal consistency (Cronbach's α) was 0.78 for the total scale and ranged from 0.44 to 0.74 for subscales. The test-retest reliability for subscales (Pearson's correlations, n=58) ranged from 0.59 to 0.85. Our data did not fit the original 8-domain structure in Confirmatory Factor Analysis, but the Exploratory Factor Analysis extracted 5 factors that correspond to CSHQ subscales. The CSHQ-PT evidenced psychometric properties that are comparable to the versions from other countries and adequate for the screening of sleep disturbances in children from 2 to 10 years old. Copyright © 2013 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.
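Internal consistency of the kind reported here is usually summarised with Cronbach's α, α = k/(k-1) · (1 - Σ item variances / variance of total scores), where k is the number of items. The following is a minimal sketch of that standard formula using sample variances; the function name and the small score matrix are hypothetical and unrelated to the CSHQ-PT data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-score lists.

    Each inner list holds one respondent's scores on the same k items.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 4 respondents x 3 items, for illustration only.
scores = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [1, 1, 2],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Subscales with few items tend to yield lower α values than a full scale, which is one common reason subscale coefficients fall below the total-scale coefficient.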
Skinner, Timothy C; Blick, Julie; Coffin, Juli; Dudgeon, Pat; Forrest, Simon; Morrison, David
2013-01-01
This study sought to determine the construct validity of two self-report measures of attitudes towards Aboriginal Australians and Torres Strait Islanders against an implicit measure of attitude. A total of 102 volunteer participants completed the three measures in a randomized order. The explicit measures of prejudice towards Aboriginal Australians were the Modern Racism Scale (MRS) and the Attitudes Towards Indigenous Australians Scale (ATIAS). The implicit attitudes measure was an adaptation of the Implicit Association Test (IAT) and utilised simple drawn head-and-shoulder images of Aboriginal Australians and White Australians as the stimuli. Both the explicit measures and the implicit measure varied in the extent to which negative prejudicial attitudes were held by participants, and the corresponding construct validities were unimpressive. The MRS was significantly correlated with the IAT (r = .314; p < .05), whereas the ATIAS was not significantly correlated with IAT scores (r = .12). Of the two self-report measures of attitudes towards Aboriginal Australians, only the MRS evidenced validity when compared with the use of an implicit attitude measure.
Validity of field expedient devices to assess core temperature during exercise in the cold.
Bagley, James R; Judelson, Daniel A; Spiering, Barry A; Beam, William C; Bartolini, J Albert; Washburn, Brian V; Carney, Keven R; Muñoz, Colleen X; Yeargin, Susan W; Casa, Douglas J
2011-12-01
Exposure to cold environments affects human performance and physiological function. Major medical organizations recommend rectal temperature (T(REC)) to evaluate core body temperature (T(CORE)) during exercise in the cold; however, other field expedient devices claim to measure T(CORE). The purpose of this study was to determine if field expedient devices provide valid measures of T(CORE) during rest and exercise in the cold. Participants included 13 men and 12 women (age = 24 +/- 3 yr, height = 170.7 +/- 10.6 cm, mass = 73.4 +/- 16.7 kg, body fat = 18 +/- 7%) who reported being healthy and at least recreationally active. During 150 min of cold exposure, subjects sequentially rested for 30 min, cycled for 90 min (heart rate = 120-140 bpm), and rested for an additional 30 min. Investigators compared aural (T(AUR)), expensive axillary (T(AXLe)), inexpensive axillary (T(AXLi)), forehead (T(FOR)), gastrointestinal (T(GI)), expensive oral (T(ORLe)), inexpensive oral (T(ORLi)), and temporal (T(TEM)) temperatures to T(REC) every 15 min. Researchers used mean difference between each device and T(REC) (i.e., mean bias) as the primary criterion for validity. T(AUR), T(AXLe), T(AXLi), T(FOR), T(ORLe), T(ORLi), and T(TEM) provided significantly lower measures compared to T(REC) and fell below our validity criterion. T(GI) significantly exceeded T(REC) at three of eleven time points, but no significant difference existed between mean T(REC) and T(GI) across time. Only T(GI) achieved our validity criterion and compared favorably to T(REC). T(GI) offers a valid measurement with which to assess T(CORE) during rest and exercise in the cold; athletic trainers, mountain rescuers, and military medical personnel should avoid other field expedient devices in similar conditions.
Bertucci, W; Duc, S; Villerius, V; Pernin, J N; Grappe, F
2005-12-01
The SRM power measuring crank system is nowadays a popular device for cycling power output (PO) measurements in the field and in laboratories. The PowerTap (CycleOps, Madison, USA) is a more recent and less well-known device that allows mobile PO measurements of cycling via the rear wheel hub. The aim of this study is to test the validity and reliability of the PowerTap by comparing it with the most accurate (i.e. the scientific model) of the SRM system. The validity of the PowerTap is tested during i) sub-maximal incremental intensities (ranging from 100 to 420 W) on a treadmill with different pedalling cadences (45 to 120 rpm) and cycling positions (standing and seated) on different grades, ii) a continuous sub-maximal intensity lasting 30 min, iii) a maximal intensity (8-s sprint), and iv) real road cycling. The reliability is assessed by repeating the sub-maximal incremental and continuous tests ten times. The results show a good validity of the PowerTap during sub-maximal intensities between 100 and 450 W (mean PO difference -1.2 +/- 1.3 %) when it is compared to the scientific SRM model, but less validity for the maximal PO during sprint exercise, where the validity appears to depend on the gear ratio. The reliability of the PowerTap during the sub-maximal intensities is similar to the scientific SRM model (the coefficient of variation is respectively 0.9 to 2.9 % and 0.7 to 2.1 % for PowerTap and SRM). The PowerTap must be considered as a suitable device for PO measurements during sub-maximal real road cycling and in sub-maximal laboratory tests.
2017-01-01
Objectives Muscular targets that are deep or inaccessible to surface electromyography (sEMG) require intrinsic recording using fine-wire electromyography (fEMG). It is unknown if fEMG validly records cortically evoked muscle responses compared to sEMG. The purpose of this investigation was to establish the validity and agreement of fEMG compared to sEMG to quantify typical transcranial magnetic stimulation (TMS) measures pre and post repetitive TMS (rTMS). The hypothesis was that fEMG would demonstrate excellent validity and agreement compared with sEMG. Materials and methods In ten healthy volunteers, paired pulse and cortical silent period (CSP) TMS measures were collected before and after 1200 pulses of 1 Hz rTMS to the motor cortex. Data were simultaneously recorded with sEMG and fEMG in the first dorsal interosseous. Concurrent validity (r and rho) and agreement (Tukey mean-difference) were calculated. Results fEMG quantified corticospinal excitability with good to excellent validity compared to sEMG data at both pretest (r = 0.77–0.97) and posttest (r = 0.83–0.92). Pairwise comparisons indicated no difference between sEMG and fEMG for all outcomes; however, Tukey mean-difference plots display increased variance and questionable agreement for paired pulse outcomes. CSP displayed the highest estimates of validity and agreement. Paired pulse MEP responses recorded with fEMG displayed reduced validity, agreement and less sensitivity to changes in MEP amplitude compared to sEMG. Change scores following rTMS were not significantly different between sEMG and fEMG. Conclusion fEMG electrodes are a valid means to measure CSP and paired pulse MEP responses. CSP displays the highest validity estimates, while caution is warranted when assessing paired pulse responses with fEMG. Corticospinal excitability and neuromodulatory aftereffects from rTMS may be assessed using fEMG. PMID:28231250
Christensen, Sara E; Möller, Elisabeth; Bonn, Stephanie E; Ploner, Alexander; Bälter, Olle; Lissner, Lauren; Bälter, Katarina
2014-02-21
The meal- and Web-based food frequency questionnaires, Meal-Q and MiniMeal-Q, were developed for cost-efficient assessment of dietary intake in epidemiological studies. The objective of this study was to evaluate the relative validity of micronutrient and fiber intake assessed with Meal-Q and MiniMeal-Q. The reproducibility of Meal-Q was also evaluated. A total of 163 volunteer men and women aged between 20 and 63 years were recruited from Stockholm County, Sweden. Assessment of micronutrient and fiber intake with the 174-item Meal-Q was compared to a Web-based 7-day weighed food record (WFR). Two administered Meal-Q questionnaires were compared for reproducibility. The 126-item MiniMeal-Q, developed after the validation study, was evaluated in a simulated validation by using truncated Meal-Q data. The study population consisted of approximately 80% women (129/163) with a mean age of 33 years (SD 12) who were highly educated (130/163, 80% with >12 years of education) on average. Cross-classification of quartiles with the WFR placed 69% to 90% in the same/adjacent quartile for Meal-Q and 67% to 89% for MiniMeal-Q. Bland-Altman plots with the WFR and the questionnaires showed large variances and a trend of increasing underestimation with increasing intakes. Deattenuated and energy-adjusted Spearman rank correlations between the questionnaires and the WFR were in the range ρ=.25-.69, excluding sodium that was not statistically significant. Cross-classifications of quartiles of the 2 Meal-Q administrations placed 86% to 97% in the same/adjacent quartile. Intraclass correlation coefficients for energy-adjusted intakes were in the range of .50-.76. With the exception of sodium, this validation study demonstrates Meal-Q and MiniMeal-Q to be useful methods for ranking micronutrient and fiber intake in epidemiological studies with Web-based data collection.
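The Bland-Altman analysis mentioned in this abstract compares two methods through the differences between paired measurements. A minimal sketch of the conventional calculation (mean difference as bias, with 95% limits of agreement at bias ± 1.96 standard deviations) follows; the function name and the intake values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of a - b; the limits of agreement are
    bias +/- 1.96 * sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical daily fiber intakes (g) from a questionnaire and a food record.
questionnaire = [10.0, 12.0, 15.0, 18.0, 20.0]
food_record = [11.0, 12.0, 17.0, 21.0, 24.0]

bias, lower, upper = bland_altman(questionnaire, food_record)
print(bias, lower, upper)
```

In this made-up sample the differences become more negative as intake rises, which is the kind of pattern described in the abstract as increasing underestimation with increasing intakes; a plot of differences against pair means would show it as a downward trend.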
NASA Astrophysics Data System (ADS)
Robinson, Trevor P.
The number of robotics competitions has steadily increased over the past 30 years. Schools are implementing robotics competitions to increase student content knowledge and interest in science, technology, engineering, and mathematics (STEM). Companies in STEM-related fields are financially supporting robotics competitions to help increase the number of students pursuing careers in STEM among other reasons. These financial supporters and school administrations are asking what the outcomes of students participating in competitive robotics are. Few studies have been conducted to investigate these outcomes. The studies that have been conducted usually compare students in robotics to students not in robotics. There have not been any studies that compare students to themselves before and after participating in robotics competitions. This may be due to the lack of available instruments to measure student outcomes. This study developed an instrument to measure the self-efficacy of students participating in VEX Robotics Competitions (VRC). The VRC is the world's largest and fastest growing robotics competition available for middle and high school students. Self-efficacy was measured because of its importance to the education community. Students with higher self-efficacy tend to persevere through difficult tasks more frequently than students with low self-efficacy. A person's self-efficacy has major influence over what interests, activities, classes, college majors, and careers he or she will pursue in life. The self-efficacy survey instrument created through this study was developed through an occupational and task analysis (OTA), and initial content and face validity was established through the OTA process. Exploratory and confirmatory factor analyses were also conducted to assist in instrument validation. The reliability was calculated using Cronbach's alpha. Face validity was established through the OTA process. 
Construct validity was established through the factor analyses. The processes of the OTA and factor analyses have created an instrument that results indicate is reliable and valid to use in further research studies.
Sobral, Maria P; Costa, Maria E; Schmidt, Lone; Martins, Mariana V
2017-02-01
Are the Copenhagen Multi-Centre Psychosocial Infertility research program Fertility Problem Stress Scales (COMPI-FPSS) a reliable and valid measure across gender and culture? The COMPI-FPSS is a valid and reliable measure, presenting excellent or good fit in the majority of the analyzed countries, and demonstrating full invariance across genders and partial invariance across cultures. Cross-cultural and gender validation is needed to consider a measure as standard care within fertility. The present study is the first attempting to establish comparability of fertility-related stress across genders and countries. Cross-sectional study. First, we tested the structure of the COMPI-FPSS. Then, reliability and validity (convergent and discriminant) were examined for the final model. Finally, measurement invariance both across genders and cultures was tested. Our final sample had 3923 fertility patients (1691 men and 2232 women) recruited in clinical settings from seven different countries: Denmark, China, Croatia, Germany, Greece, Hungary and Sweden. Participants had a mean age of 34 years and the majority (84%) were childless. Findings confirmed the original three-factor structure of the COMPI-FPSS, although suggesting a shortened measurement model using fewer items that fitted the data better than the full version model. While data from the Chinese and Croatian subsamples did not fit, all other countries presented good fit (χ²/df ≤ 5.4; comparative fit index ≥ 0.94; root-mean-square error of approximation ≤ 0.07; modified expected cross-validation index ≤ 0.77). In general, reliability, convergent validity, and discriminant validity were observed in all subscales from each country (composite reliability ≥ 0.63; average variance extracted ≥ 0.38; squared correlation ≥ 0.13). Full invariance was established across genders, and partial invariance was demonstrated across countries. 
The validation of the COMPI-FPSS cannot be generalized to infertile individuals not seeking treatment or to non-European patients. This study did not investigate predictive validity, and hence the capability of this instrument in detecting changes in fertility-specific adjustment over time and predicting the psychological impact needs to be established in future research. Besides extending knowledge on the psychometric properties of one of the most used fertility stress questionnaires, this study demonstrates both research and clinical usefulness of the COMPI-FPSS. This study was supported by European Union Funds (FEDER/COMPETE-Operational Competitiveness Program) and by national funds (FCT-Portuguese Foundation for Science and Technology) under the projects PTDC/MHC-PSC/4195/2012 and SFRH/BPD/85789/2012. There are no conflicts of interest to declare. © The Author 2016. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Davies, John R; Chang, Yu-mei; Bishop, D Timothy; Armstrong, Bruce K; Bataille, Veronique; Bergman, Wilma; Berwick, Marianne; Bracci, Paige M; Elwood, J Mark; Ernstoff, Marc S; Green, Adele; Gruis, Nelleke A; Holly, Elizabeth A; Ingvar, Christian; Kanetsky, Peter A; Karagas, Margaret R; Lee, Tim K; Le Marchand, Loïc; Mackie, Rona M; Olsson, Håkan; Østerlind, Anne; Rebbeck, Timothy R; Reich, Kristian; Sasieni, Peter; Siskind, Victor; Swerdlow, Anthony J; Titus, Linda; Zens, Michael S; Ziegler, Andreas; Gallagher, Richard P.; Barrett, Jennifer H; Newton-Bishop, Julia
2015-01-01
Background We report the development of a cutaneous melanoma risk algorithm based upon 7 factors; hair colour, skin type, family history, freckling, nevus count, number of large nevi and history of sunburn, intended to form the basis of a self-assessment webtool for the general public. Methods Predicted odds of melanoma were estimated by analysing a pooled dataset from 16 case-control studies using logistic random coefficients models. Risk categories were defined based on the distribution of the predicted odds in the controls from these studies. Imputation was used to estimate missing data in the pooled datasets. The 30th, 60th and 90th centiles were used to distribute individuals into four risk groups for their age, sex and geographic location. Cross-validation was used to test the robustness of the thresholds for each group by leaving out each study one by one. Performance of the model was assessed in an independent UK case-control study dataset. Results Cross-validation confirmed the robustness of the threshold estimates. Cases and controls were well discriminated in the independent dataset (area under the curve 0.75, 95% CI 0.73-0.78). 29% of cases were in the highest risk group compared with 7% of controls, and 43% of controls were in the lowest risk group compared with 13% of cases. Conclusion We have identified a composite score representing an estimate of relative risk and successfully validated this score in an independent dataset. Impact This score may be a useful tool to inform members of the public about their melanoma risk. PMID:25713022
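The grouping step described in this abstract, in which centiles of the predicted odds among controls define four risk groups, can be sketched as follows. This is only an illustration of centile-based binning: the function names and the control values are hypothetical, and the study's actual models, stratification by age, sex and location, and thresholds are not reproduced.

```python
from bisect import bisect_right
from statistics import quantiles


def centile_thresholds(control_odds, centiles=(30, 60, 90)):
    """Return the requested centiles of the controls' predicted odds."""
    # quantiles(..., n=100) gives 99 cut points; index c - 1 is centile c.
    cuts = quantiles(control_odds, n=100)
    return [cuts[c - 1] for c in centiles]


def risk_group(predicted_odds, thresholds):
    """Map predicted odds to a group index 0 (lowest) to 3 (highest)."""
    return bisect_right(thresholds, predicted_odds)


# Hypothetical predicted odds for 100 controls, for illustration only.
controls = [float(v) for v in range(1, 101)]
thresholds = centile_thresholds(controls)

groups = [risk_group(x, thresholds) for x in (10.0, 50.0, 75.0, 99.0)]
print(thresholds, groups)
```

A leave-one-study-out check of the kind mentioned in the abstract would recompute `thresholds` with each study's controls removed in turn and compare the results; that loop is omitted here.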
Ghorbani, Abbas; Chitsaz, Ahmad
2011-01-01
Migraine is one of the most common headache disorders, affecting 11% or more of the adult population. Recently, researchers have designed two questionnaires, namely the Headache Impact Test (HIT) and the Migraine Disability Assessment (MIDAS), with the aim of improving migraine care. These two tests provide a standard measurement of migraine's effects on people's lifestyle and divide patients into 4 groups (grades) based on headache intensity. The aim of this study was to compare the validity and reliability of these two tests. This study was designed as a multicenter, descriptive study to compare the validity and reliability of the Persian versions of the MIDAS and HIT questionnaires in 240 males and females with a migraine diagnosis according to the criteria for headache and facial pain of the International Headache Society (IHS). The patients were enrolled in the study from 3 neurology clinics in Isfahan, Iran, between July 2004 and January 2005 and were evaluated at baseline (visit 1) and 4 weeks later (visit 2). According to our study, there was a high correlation between the two tests (r = 0.94). This decreased their MIDAS grade in comparison to their HIT questionnaire grade. These findings demonstrated that the Persian version of HIT has the same validity and reliability as MIDAS. Replying to the HIT questionnaire was easier than MIDAS for Iranian patients. Physicians can reliably use the Persian translation of both MIDAS and HIT questionnaires to define the severity of illness and its treatment strategy as a self-administered report by migraine patients. However, we recommend HIT for its simplicity in headache clinics.
Screening for cognitive impairment in older individuals. Validation study of a computer-based test.
Green, R C; Green, J; Harrison, J M; Kutner, M H
1994-08-01
This study examined the validity of a computer-based cognitive test that was recently designed to screen the elderly for cognitive impairment. Criterion-related validity was examined by comparing test scores of impaired patients and normal control subjects. Construct-related validity was computed through correlations between computer-based subtests and related conventional neuropsychological subtests. University center for memory disorders. Fifty-two patients with mild cognitive impairment by strict clinical criteria and 50 unimpaired, age- and education-matched control subjects. Control subjects were rigorously screened by neurological, neuropsychological, imaging, and electrophysiological criteria to identify and exclude individuals with occult abnormalities. Using a cut-off total score of 126, this computer-based instrument had a sensitivity of 0.83 and a specificity of 0.96. Using a prevalence estimate of 10%, predictive values, positive and negative, were 0.70 and 0.96, respectively. Computer-based subtests correlated significantly with conventional neuropsychological tests measuring similar cognitive domains. Thirteen (17.8%) of 73 volunteers with normal medical histories were excluded from the control group, with unsuspected abnormalities on standard neuropsychological tests, electroencephalograms, or magnetic resonance imaging scans. Computer-based testing is a valid screening methodology for the detection of mild cognitive impairment in the elderly, although this particular test has important limitations. Broader applications of computer-based testing will require extensive population-based validation. Future studies should recognize that normal control subjects without a history of disease who are typically used in validation studies may have a high incidence of unsuspected abnormalities on neurodiagnostic studies.
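Predictive values at an assumed prevalence follow from sensitivity and specificity by Bayes' rule. The sketch below applies the standard formulas to the rounded figures quoted in this abstract; the function name is hypothetical, and whether the original authors computed their values this way is an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded inputs quoted in the abstract: sensitivity 0.83,
# specificity 0.96, prevalence estimate 10%.
ppv, npv = predictive_values(0.83, 0.96, 0.10)
print(round(ppv, 2), round(npv, 2))
```

With these rounded inputs the positive predictive value evaluates to about 0.70, in line with the abstract. The negative predictive value evaluates to about 0.98, slightly above the reported 0.96; the gap may reflect rounding of the published inputs or a different calculation in the original study.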
Wang, Yao; Xiao, Lily Dongxia; He, Guo-Ping
2015-02-01
Suboptimal care for people with dementia in hospital settings has been reported and is attributed to the lack of knowledge and inadequate attitudes in dementia care among health professionals. Educational interventions have been widely used to improve care outcomes; however, Chinese-language instruments used in dementia educational interventions for health professionals are lacking. The aims of this study were to select, translate and evaluate instruments used in dementia educational interventions for Chinese health professionals in acute-care hospitals. A cross-sectional study design was used. A modified stratified random sampling was used to recruit 442 participants from different levels of hospitals in Changsha, China. Dementia care competence was used as a framework for the selection and evaluation of Alzheimer's Disease Knowledge Scale and Dementia Care Attitudes Scale for health professionals in the study. These two scales were translated into Chinese using forward and back translation method. Content validity, test-retest reliability and internal consistency were assessed. Construct validity was tested using exploratory factor analysis. Known-group validity was established by comparing scores of Alzheimer's Disease Knowledge Scale and Dementia Care Attitudes Scale in two sub-groups. A person-centred care scale was utilised as a gold standard to establish concurrent validity of these two scales. Results demonstrated acceptable content validity, internal consistency, test-retest reliability and concurrent validity. Exploratory factor analysis presented a single-factor structure of the Chinese Alzheimer's Disease Knowledge Scale and a two-factor structure of the Chinese Dementia Care Attitudes Scale, supporting the conceptual dimensions of the original scales. 
The Chinese Alzheimer's Disease Knowledge Scale and Chinese Dementia Care Attitudes Scale demonstrated known-group validity evidenced by significantly higher scores identified from the sub-group with a longer work experience compared to those in the sub-group with less work experience. The use of dementia care competence as a framework to inform the selection and evaluation of instruments used in dementia educational interventions for health professionals has wide applicability in other areas. The results support that Chinese Alzheimer's Disease Knowledge Scale and Chinese Dementia Care Attitudes Scale are reliable and valid instruments for health professionals to use in acute-care settings. Copyright © 2014 Elsevier Ltd. All rights reserved.
Developing a tool to measure satisfaction among health professionals in sub-Saharan Africa
2013-01-01
Background In sub-Saharan Africa, lack of motivation and job dissatisfaction have been cited as causes of poor healthcare quality and outcomes. Measurement of health workers’ satisfaction adapted to sub-Saharan African working conditions and cultures is a challenge. The objective of this study was to develop a valid and reliable instrument to measure satisfaction among health professionals in the sub-Saharan African context. Methods A survey was conducted in Senegal and Mali in 2011 among 962 care providers (doctors, midwives, nurses and technicians) practicing in 46 hospitals (capital, regional and district). The participation rate was very high: 97% (937/962). After exploratory factor analysis (EFA), construct validity was assessed through confirmatory factor analysis (CFA). The discriminant validity of our subscales was evaluated by comparing the average variance extracted (AVE) for each of the constructs with the squared interconstruct correlation (SIC), and finally for criterion validity, each subscale was tested with two hypotheses. Two dimensions of reliability were assessed: internal consistency with Cronbach’s alpha subscales and stability over time using a test-retest process. Results Eight dimensions of satisfaction encompassing 24 items were identified and validated using a process that combined psychometric analyses and expert opinions: continuing education, salary and benefits, management style, tasks, work environment, workload, moral satisfaction and job stability. All eight dimensions demonstrated significant discriminant validity. The final model showed good performance, with a root mean square error of approximation (RMSEA) of 0.0508 (90% CI: 0.0448 to 0.0569) and a comparative fit index (CFI) of 0.9415. The concurrent criterion validity of the eight dimensions was good. Reliability was assessed based on internal consistency, which was good for all dimensions but one (moral satisfaction < 0.70). 
Test-retest showed satisfactory temporal stability (intraclass correlation coefficient range: 0.60 to 0.91). Conclusions Job satisfaction is a complex construct; this study provides a multidimensional instrument whose content, construct and criterion validities were verified to ensure its suitability for the sub-Saharan African context. When using these subscales in further studies, the variability of the reliability of the subscales should be taken into account when calculating sample sizes. The instrument will be useful in evaluative studies which will help guide interventions aimed at improving both the quality of care and its effectiveness. PMID:23826720
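Internal consistency in this abstract is summarized with Cronbach's alpha and judged against a 0.70 threshold. As a minimal sketch of the standard formula, assuming complete data and equally weighted items, the helper below computes alpha for one subscale. The function name `cronbach_alpha` and the three-item, five-respondent scores are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha for one subscale.

    items: list of per-item score lists, each with one score per respondent.
    """
    k = len(items)
    # Total score of each respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / pvariance(totals))


# Hypothetical scores for a three-item subscale answered by five respondents.
subscale = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(subscale)
print(f"alpha = {alpha:.3f}, meets 0.70 threshold: {alpha >= 0.70}")
```

Population variance is used for both the items and the totals; using sample variance throughout would give the same ratio.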
Patient-specific lean body mass can be estimated from limited-coverage computed tomography images.
Devriese, Joke; Beels, Laurence; Maes, Alex; van de Wiele, Christophe; Pottel, Hans
2018-06-01
In PET/CT, quantitative evaluation of tumour metabolic activity is possible through standardized uptake values, usually normalized for body weight (BW) or lean body mass (LBM). Patient-specific LBM can be estimated from whole-body (WB) CT images. As most clinical indications only warrant PET/CT examinations covering head to midthigh, the aim of this study was to develop a simple and reliable method to estimate LBM from limited-coverage (LC) CT images and test its validity. Head-to-toe PET/CT examinations were retrospectively retrieved and semiautomatically segmented into tissue types based on thresholding of CT Hounsfield units. LC was obtained by omitting image slices. Image segmentation was validated on the WB CT examinations by comparing CT-estimated BW with actual BW, and LBM estimated from LC images were compared with LBM estimated from WB images. A direct method and an indirect method were developed and validated on an independent data set. Comparing LBM estimated from LC examinations with estimates from WB examinations (LBMWB) showed a significant but limited bias of 1.2 kg (direct method) and nonsignificant bias of 0.05 kg (indirect method). This study demonstrates that LBM can be estimated from LC CT images with no significant difference from LBMWB.
MicroRNA Changes in Cerebrospinal Fluid After Subarachnoid Hemorrhage.
Bache, Søren; Rasmussen, Rune; Rossing, Maria; Laigaard, Finn Pedersen; Nielsen, Finn Cilius; Møller, Kirsten
2017-09-01
Delayed cerebral ischemia (DCI) accounts for a major part of the morbidity and mortality after aneurysmal subarachnoid hemorrhage (SAH). MicroRNAs (miRNAs) are pathophysiologically involved in acute cerebral ischemia. This study compared miRNA profiles in cerebrospinal fluid from neurologically healthy patients, as well as SAH patients with and without subsequent development of DCI. In a prospective case-control study of SAH patients treated with external ventricular drainage and neurologically healthy patients, miRNA profiles in cerebrospinal fluid were screened and validated using 2 different high-throughput real-time quantification polymerase chain reaction techniques. The occurrence of DCI was documented in patient charts and subsequently reviewed independently by 2 physicians. MiRNA profiles from 27 SAH patients and 10 neurologically healthy patients passed quality control. In the validation, 66 miRNAs showed a relative increase in cerebrospinal fluid from SAH patients compared with neurologically healthy patients (P < 0.001); 2 (miR-21 and miR-221) showed a relative increase in SAH patients with DCI compared with those without (P < 0.05) in both the screening and validation. SAH is associated with marked changes in the cerebrospinal fluid miRNA profile. These changes could be associated with the development of DCI. URL: http://www.clinicaltrials.gov. Unique identifier: NCT01791257. © 2017 The Authors.
Boysen, Guy A; VanBergen, Alexandra
2014-02-01
Dissociative Identity Disorder (DID) has long been surrounded by controversy due to disagreement about its etiology and the validity of its associated phenomena. Researchers have conducted studies comparing people diagnosed with DID and people simulating DID in order to better understand the disorder. The current research presents a systematic review of this DID simulation research. The literature consists of 20 studies and contains several replicated findings. Replicated differences between the groups include symptom presentation, identity presentation, and cognitive processing deficits. Replicated similarities between the groups include interidentity transfer of information as shown by measures of recall, recognition, and priming. Despite some consistent findings, this research literature is hindered by methodological flaws that reduce experimental validity. Copyright © 2013 Elsevier Ltd. All rights reserved.
Comparing errors in Medicaid reporting across surveys: evidence to date.
Call, Kathleen T; Davern, Michael E; Klerman, Jacob A; Lynch, Victoria
2013-04-01
To synthesize evidence on the accuracy of Medicaid reporting across state and federal surveys. All available validation studies. Compare results from existing research to understand variation in reporting across surveys. Synthesize all available studies validating survey reports of Medicaid coverage. Across all surveys, reporting some type of insurance coverage is better than reporting Medicaid specifically. Therefore, estimates of uninsurance are less biased than estimates of specific sources of coverage. The CPS stands out as being particularly inaccurate. Measuring health insurance coverage is prone to some level of error, yet survey overstatements of uninsurance are modest in most surveys. Accounting for all forms of bias is complex. Researchers should consider adjusting estimates of Medicaid and uninsurance in surveys prone to high levels of misreporting. © Health Research and Educational Trust.
The Validity of Truant Youths’ Marijuana Use and Its Impact on Alcohol Use and Sexual Risk Taking
Dembo, Richard; Robinson, Rhissa Briones; Barrett, Kimberly; Winters, Ken C.; Ungaro, Rocío; Karas, Lora; Belenko, Steven; Wareham, Jennifer
2013-01-01
Few studies investigating the validity of marijuana use have used samples of truant youth. In the current study, self-reports of marijuana use are compared with urine test results for marijuana to identify marijuana underreporting among adolescents participating in a longitudinal Brief Intervention for drug-involved truant youth. It was hypothesized that marijuana underreporting would be associated with alcohol underreporting and engaging in sexual risk behaviors. The results indicated marijuana underreporting was significantly associated with self-denial of alcohol use, but not associated with sexual risk behavior. Also, there was an age effect in marijuana use underreporting such that younger truant youth were more likely to underreport marijuana use, compared to older truant youth. Implications for policy and future research are discussed. PMID:26478691
Stephenson, Christopher R; Vaa, Brianna E; Wang, Amy T; Schroeder, Darrell R; Beckman, Thomas J; Reed, Darcy A; Sawatsky, Adam P
2017-11-09
There is little evidence regarding the comparative quality of abstracts and articles in medical education research. The Medical Education Research Study Quality Instrument (MERSQI), which was developed to evaluate the quality of reporting in medical education, has strong validity evidence for content, internal structure, and relationships to other variables. We used the MERSQI to compare the quality of reporting for conference abstracts, journal abstracts, and published articles. This is a retrospective study of all 46 medical education research abstracts submitted to the Society of General Internal Medicine 2009 Annual Meeting that were subsequently published in a peer-reviewed journal. We compared MERSQI scores of the abstracts with scores for their corresponding published journal abstracts and articles. Comparisons were performed using the signed rank test. Overall MERSQI scores increased significantly for published articles compared with conference abstracts (11.33 vs 9.67; P < .001) and journal abstracts (11.33 vs 9.96; P < .001). Regarding MERSQI subscales, published articles had higher MERSQI scores than conference abstracts in the domains of sampling (1.59 vs 1.34; P = .006), data analysis (3.00 vs 2.43; P < .001), and validity of evaluation instrument (1.04 vs 0.28; P < .001). Published articles also had higher MERSQI scores than journal abstracts in the domains of data analysis (3.00 vs 2.70; P = .004) and validity of evaluation instrument (1.04 vs 0.26; P < .001). To our knowledge, this is the first study to compare the quality of medical education abstracts and journal articles using the MERSQI. Overall, the quality of articles was greater than that of abstracts. However, there were no significant differences between abstracts and articles for the domains of study design and outcomes, which indicates that these MERSQI elements may be applicable to abstracts. 
Findings also suggest that abstract quality is generally preserved from original presentation to publication.
Dreyer, Nancy A; Velentgas, Priscilla; Westrich, Kimberly; Dubois, Robert
2014-03-01
While there is growing demand for information about comparative effectiveness (CE), there is substantial debate about whether and when observational studies have sufficient quality to support decision making. To develop and test an item checklist that can be used to qualify those observational CE studies sufficiently rigorous in design and execution to contribute meaningfully to the evidence base for decision support. An 11-item checklist about data and methods (the GRACE checklist) was developed through literature review and consultation with experts from professional societies, payer groups, the private sector, and academia. Since no single gold standard exists for validation, checklist item responses were compared with 3 different types of external quality ratings (N=88 articles). The articles compared treatment effectiveness and/or safety of drugs, medical devices, and medical procedures. We validated checklist item responses 3 ways against external quality ratings, using published articles of observational CE or safety studies: (a) Systematic Review-quality assessment from a published systematic review; (b) Single Expert Review-quality assessment made according to the solicited "expert opinion" of a senior researcher; and (c) Concordant Expert Review-quality assessments from 2 experts for which there was concordance. Volunteers (N=113) from 5 continents completed 280 article assessments using the checklist. Positive and negative predictive values (PPV, NPV, respectively) of individual items were estimated to compare testers' assessments with those of experts. Taken as a whole, the scale had better NPV than PPV, for both data and methods. The most consistent predictor of quality relates to the validity of the primary outcomes measurement for the study purpose. Other consistent markers of quality relate to using concurrent comparators, minimizing the effects of bias by prudent choice of covariates, and using sensitivity analysis to test robustness of results. 
Concordance of expert opinion on the quality of the rated articles was 52%; most checklist items performed better. The 11-item GRACE checklist provides guidance to help determine which observational studies of CE have used strong scientific methods and good data that are fit for purpose and merit consideration for decision making. The checklist contains a parsimonious set of elements that can be objectively assessed in published studies, and user testing shows that it can be successfully applied to studies of drugs, medical devices, and clinical and surgical interventions. Although no scoring is provided, study reports that rate relatively well across checklist items merit in-depth examination to understand applicability, effect size, and likelihood of residual bias. The current testing and validation efforts did not achieve clear discrimination between studies fit for purpose and those not, but we have identified a critical, though remediable, limitation in our approach. Not specifying a specific granular decision for evaluation, or not identifying a single study objective in reports that included more than one, left reviewers with too broad an assessment challenge. We believe that future efforts will be more successful if reviewers are asked to focus on a specific objective or question. Despite the challenges encountered in this testing, an agreed upon set of assessment elements, checklists, or score cards is critical for the maturation of this field. Substantial resources will be expended on studies of real-world effectiveness, and if the rigor of these observational assessments cannot be assessed, then the impact of the studies will be suboptimal. Similarly, agreement on key elements of quality will ensure that budgets are appropriately directed toward those elements. 
Given the importance of this task and the lessons learned from these extensive efforts at validation and user testing, we are optimistic about the potential for improved assessments that can be used for diverse situations by people with a wide range of experience and training. Future testing would benefit by directing reviewers to address a single, granular research question, which would avoid problems that arose by using the checklist to evaluate multiple objectives, by using other types of validation test sets, and by employing further multivariate analysis to see if any combination or sequence of item responses has particularly high predictive validity.
Austin, Peter C; Mamdani, Muhammad M; Juurlink, David N; Hux, Janet E
2006-09-01
To illustrate how multiple hypotheses testing can produce associations with no clinical plausibility. We conducted a study of all 10,674,945 residents of Ontario aged between 18 and 100 years in 2000. Residents were randomly assigned to equally sized derivation and validation cohorts and classified according to their astrological sign. Using the derivation cohort, we searched through 223 of the most common diagnoses for hospitalization until we identified two for which subjects born under one astrological sign had a significantly higher probability of hospitalization compared to subjects born under the remaining signs combined (P<0.05). We tested these 24 associations in the independent validation cohort. Residents born under Leo had a higher probability of gastrointestinal hemorrhage (P=0.0447), while Sagittarians had a higher probability of humerus fracture (P=0.0123) compared to all other signs combined. After adjusting the significance level to account for multiple comparisons, none of the identified associations remained significant in either the derivation or validation cohort. Our analyses illustrate how the testing of multiple, non-prespecified hypotheses increases the likelihood of detecting implausible associations. Our findings have important implications for the analysis and interpretation of clinical studies.
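The multiple-comparisons argument in this abstract can be made concrete with a short calculation. The abstract does not name the adjustment the authors used, so the sketch below assumes a Bonferroni correction over the 24 tested associations and treats the tests as independent when estimating the family-wise error rate. The function names are hypothetical; the two P values are the ones reported above.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


# P values reported in the abstract for the validation cohort.
reported = {
    "Leo / gastrointestinal hemorrhage": 0.0447,
    "Sagittarius / humerus fracture": 0.0123,
}

# Assumption: the adjustment is Bonferroni over the 24 associations tested.
threshold = bonferroni_threshold(0.05, 24)
survivors = [name for name, p in reported.items() if p < threshold]

print(f"adjusted threshold = {threshold:.5f}")
print(f"associations still significant: {survivors}")
print(f"family-wise error without adjustment = {familywise_error(0.05, 24):.2f}")
```

Under these assumptions neither reported association clears the adjusted threshold, which is consistent with the abstract's conclusion.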
Development and Validation of a Food-Associated Olfactory Test (FAOT).
Denzer-Lippmann, Melanie Yvonne; Beauchamp, Jonathan; Freiherr, Jessica; Thuerauf, Norbert; Kornhuber, Johannes; Buettner, Andrea
2017-01-01
Olfactory tests are an important tool in human nutritional research for studying food preferences, yet comprehensive tests dedicated solely to food odors are currently lacking. Therefore, within this study, an innovative food-associated olfactory test (FAOT) system was developed. The FAOT comprises 16 odorant pens that contain representative food odors relating to different macronutrient classes. The test underwent a sensory validation based on identification rate, intensity, hedonic value, and food association scores. The accuracy of the test was further compared to the accuracy of the established Sniffin' Sticks identification test. The identification rates and intensities of this new FAOT were found to be comparable to the Sniffin' Sticks olfactory identification test. The odorant pens were also assessed chemo-analytically and were found to be chemically stable for at least 24 weeks. Overall, this new identification test for use in assessing olfaction in a food-associated context is valid both in terms of its use in sensory perception studies and its chemical stability. The FAOT is particularly suited to examinations of the sense of smell regarding food odors. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Vijayaraj, Ramadoss; Devi, Mekapothula Lakshmi Vasavi; Subramanian, Venkatesan; Chattaraj, Pratim Kumar
2012-06-01
Three-dimensional quantitative structure activity relationship (3D-QSAR) study has been carried out on the Escherichia coli DHFR inhibitors 2,4-diamino-5-(substituted-benzyl)pyrimidine derivatives to understand the structural features responsible for the improved potency. To construct highly predictive 3D-QSAR models, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methods were used. The predicted models show statistically significant cross-validated and non-cross-validated correlation coefficients of r²(CV) and r²(nCV), respectively. The final 3D-QSAR models were validated using structurally diverse test set compounds. Analysis of the contour maps generated from CoMFA and CoMSIA methods reveals that the substitution of electronegative groups at the first and second position along with electropositive group at the third position of R2 substitution significantly increases the potency of the derivatives. The results obtained from the CoMFA and CoMSIA study delineate the substituents on the trimethoprim analogues responsible for the enhanced potency and also provide valuable directions for the design of new trimethoprim analogues with improved affinity. © 2012 John Wiley & Sons A/S.
Validation of asthma recording in electronic health records: protocol for a systematic review.
Nissen, Francis; Quint, Jennifer K; Wilkinson, Samantha; Mullerova, Hana; Smeeth, Liam; Douglas, Ian J
2017-05-29
Asthma is a common, heterogeneous disease with significant morbidity and mortality worldwide. It can be difficult to define in epidemiological studies using electronic health records as the diagnosis is based on non-specific respiratory symptoms and spirometry, neither of which is routinely registered. Electronic health records can nonetheless be valuable to study the epidemiology, management, healthcare use and control of asthma. For health databases to be useful sources of information, asthma diagnoses should ideally be validated. The primary objectives are to provide an overview of the methods used to validate asthma diagnoses in electronic health records and summarise the results of the validation studies. EMBASE and MEDLINE will be systematically searched for appropriate search terms. The searches will cover all studies in these databases up to October 2016 with no start date and will yield studies that have validated algorithms or codes for the diagnosis of asthma in electronic health records. At least one test validation measure (sensitivity, specificity, positive predictive value, negative predictive value or other) is necessary for inclusion. In addition, we require the validated algorithms to be compared with an external gold standard, such as a manual review, a questionnaire or an independent second database. We will summarise key data including author, year of publication, country, time period, date, data source, population, case characteristics, clinical events, algorithms, gold standard and validation statistics in a uniform table. This study is a synthesis of previously published studies and, therefore, no ethical approval is required. The results will be submitted to a peer-reviewed journal for publication. Results from this systematic review can be used to study outcome research on asthma and can be used to identify case definitions for asthma. CRD42016041798. 
© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
NASA Technical Reports Server (NTRS)
Sundstrom, J. L.
1980-01-01
The techniques required to produce and validate six detailed task timeline scenarios for crew workload studies are described. Specific emphasis is given to: general aviation single pilot instrument flight rules operations in a high density traffic area; fixed path metering and spacing operations; and comparative workload operation between the forward and aft-flight decks of the NASA terminal control vehicle. The validation efforts also provide a cursory examination of the resultant demand workload based on the operating procedures depicted in the detailed task scenarios.
Complex and elementary histological scoring systems for articular cartilage repair.
Orth, Patrick; Madry, Henning
2015-08-01
The repair of articular cartilage defects is increasingly moving into the focus of experimental and clinical investigations. Histological analysis is the gold standard for a valid and objective evaluation of cartilaginous repair tissue and predominantly relies on the use of established scoring systems. In the past three decades, numerous elementary and complex scoring systems have been described and modified, including those of O'Driscoll, Pineda, Wakitani, Sellers and Fortier for entire defects as well as those according to the International Cartilage Repair Society (ICRS-I/II) for osteochondral tissue biopsies. Yet, this coexistence of different grading scales inconsistently addressing diverse parameters may impede comparability between reported study outcomes. Furthermore, validation of these histological scoring systems has only seldom been performed to date. The aim of this review is (1) to give a comprehensive overview and to compare the most important established histological scoring systems for articular cartilage repair, (2) to describe their specific advantages and pitfalls, and (3) to provide valid recommendations for their use in translational and clinical studies of articular cartilage repair.
Devaluation and sequential decisions: linking goal-directed and model-based behavior
Friedel, Eva; Koch, Stefan P.; Wendt, Jean; Heinz, Andreas; Deserno, Lorenz; Schlagenhauf, Florian
2014-01-01
In experimental psychology, different experiments have been developed to assess goal-directed as compared to habitual control over instrumental decisions. Similar to animal studies, selective devaluation procedures have been used. More recently, sequential decision-making tasks have been designed to assess the degree of goal-directed vs. habitual choice behavior in terms of an influential computational theory of model-based compared to model-free behavioral control. As recently suggested, different measurements are thought to reflect the same construct. Yet, there has been no attempt to directly assess the construct validity of these different measurements. In the present study, we used a devaluation paradigm and a sequential decision-making task to address this question of construct validity in a sample of 18 healthy male human participants. Correlational analysis revealed a positive association between model-based choices during sequential decisions and goal-directed behavior after devaluation, suggesting a single framework underlying both operationalizations and speaking in favor of construct validity of both measurement approaches. Up to now, this has been merely assumed but never been directly tested in humans. PMID:25136310
Konstan, Joseph; Iantaffi, Alex; Wilkerson, J. Michael; Galos, Dylan; Simon Rosser, B. R.
2017-01-01
Researchers use protocols to screen for suspicious survey submissions in online studies. We evaluated how well a de-duplication and cross-validation process detected invalid entries. Data were from the Sexually Explicit Media Study, an Internet-based HIV prevention survey of men who have sex with men. Using our protocol, 146 (11.6%) of 1254 entries were identified as invalid. Most indicated changes to the screening questionnaire to gain entry (n = 109, 74.7%), matched other submissions’ payment profiles (n = 56, 41.8%), or featured an IP address that was recorded previously (n = 43, 29.5%). We found few demographic or behavioral differences between valid and invalid samples, however. Invalid submissions had lower odds of reporting HIV testing in the past year (OR 0.63), and higher odds of requesting no payment compared to check payments (OR 2.75). Thus, rates of HIV testing would have been underestimated if invalid submissions had not been removed, and payment may not be the only incentive for invalid participation. PMID:25805443
Timed activity performance in persons with upper limb amputation: A preliminary study.
Resnik, Linda; Borgia, Mathew; Acluche, Frantzy
55 subjects with upper limb amputation were administered the T-MAP twice within one week. To develop a timed measure of activity performance for persons with upper limb amputation (T-MAP); examine the measure's internal consistency, test-retest reliability and validity; and compare scores by prosthesis use. Measures of activity performance for persons with upper limb amputation are needed. The time required to perform daily activities is a meaningful metric that has implications for participation in life roles. Internal consistency and test-retest reliability were evaluated. Construct validity was examined by comparing scores by amputation level. Exploratory analyses compared sub-group scores, and examined correlations with other measures. Scale alpha was 0.77, ICC was 0.93. Timed scores differed by amputation level. Subjects using a prosthesis took longer to perform all tasks. T-MAP was not correlated with other measures of dexterity or activity, but was correlated with pain for non-prosthesis users. The timed scale had adequate internal consistency and excellent test-retest reliability. Analyses support reliability and construct validity of the T-MAP. 2c "outcomes" research. Published by Elsevier Inc.
ERIC Educational Resources Information Center
Huesman, Ronald L., Jr.; Frisbie, David A.
This study investigated the effect of extended-time limits in terms of performance levels and score comparability for reading comprehension scores on the Iowa Tests of Basic Skills (ITBS). The first part of the study compared the average reading comprehension scores on the ITBS of 61 sixth-graders with learning disabilities and 397 non learning…
Mungroop, Timothy H; van Rijssen, L Bengt; van Klaveren, David; Smits, F Jasmijn; van Woerden, Victor; Linnemann, Ralph J; de Pastena, Matteo; Klompmaker, Sjors; Marchegiani, Giovanni; Ecker, Brett L; van Dieren, Susan; Bonsing, Bert; Busch, Olivier R; van Dam, Ronald M; Erdmann, Joris; van Eijck, Casper H; Gerhards, Michael F; van Goor, Harry; van der Harst, Erwin; de Hingh, Ignace H; de Jong, Koert P; Kazemier, Geert; Luyer, Misha; Shamali, Awad; Barbaro, Salvatore; Armstrong, Thomas; Takhar, Arjun; Hamady, Zaed; Klaase, Joost; Lips, Daan J; Molenaar, I Quintus; Nieuwenhuijs, Vincent B; Rupert, Coen; van Santvoort, Hjalmar C; Scheepers, Joris J; van der Schelling, George P; Bassi, Claudio; Vollmer, Charles M; Steyerberg, Ewout W; Abu Hilal, Mohammed; Groot Koerkamp, Bas; Besselink, Marc G
2017-12-12
The aim of this study was to develop an alternative fistula risk score (a-FRS) for postoperative pancreatic fistula (POPF) after pancreatoduodenectomy, without blood loss as a predictor. Blood loss, one of the predictors of the original-FRS, was not a significant factor during 2 recent external validations. The a-FRS was developed in 2 databases: the Dutch Pancreatic Cancer Audit (18 centers) and the University Hospital Southampton NHS. Primary outcome was grade B/C POPF according to the 2005 International Study Group on Pancreatic Surgery (ISGPS) definition. The score was externally validated in 2 independent databases (University Hospital of Verona and University Hospital of Pennsylvania), using both 2005 and 2016 ISGPS definitions. The a-FRS was also compared with the original-FRS. For model design, 1924 patients were included of whom 12% developed POPF. Three predictors were strongly associated with POPF: soft pancreatic texture [odds ratio (OR) 2.58, 95% confidence interval (95% CI) 1.80-3.69], small pancreatic duct diameter (per mm increase, OR: 0.68, 95% CI: 0.61-0.76), and high body mass index (BMI) (per kg/m² increase, OR: 1.07, 95% CI: 1.04-1.11). Discrimination was adequate with an area under curve (AUC) of 0.75 (95% CI: 0.71-0.78) after internal validation, and 0.78 (0.74-0.82) after external validation. The predictive capacity of a-FRS was comparable with the original-FRS, both for the 2005 definition (AUC 0.78 vs 0.75, P = 0.03), and 2016 definition (AUC 0.72 vs 0.70, P = 0.05). The a-FRS predicts POPF after pancreatoduodenectomy based on 3 easily available variables (pancreatic texture, duct diameter, BMI) without blood loss and pathology, and was successfully validated for both the 2005 and 2016 POPF definition.
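To show how the three reported odds ratios combine, the sketch below multiplies them on the log-odds scale to compare one patient's odds of POPF with those of a reference patient. This is not the published a-FRS calculator: the abstract gives no model intercept, so absolute risk cannot be computed here. It also assumes the three odds ratios come from a single multivariable logistic model. The function name `relative_odds` and the reference values (firm texture, 5 mm duct, BMI 25) are hypothetical choices made only for illustration.

```python
import math

# Odds ratios reported in the abstract.
OR_SOFT_TEXTURE = 2.58  # soft versus not soft pancreatic texture
OR_DUCT_PER_MM = 0.68   # per mm increase in pancreatic duct diameter
OR_BMI_PER_UNIT = 1.07  # per kg/m^2 increase in body mass index


def relative_odds(soft, duct_mm, bmi, ref_duct_mm=5.0, ref_bmi=25.0):
    """Odds of POPF relative to a hypothetical reference patient.

    The reference patient (firm texture, ref_duct_mm, ref_bmi) is an
    illustrative assumption, not a value from the study.
    """
    log_odds_ratio = (
        math.log(OR_SOFT_TEXTURE) * (1 if soft else 0)
        + math.log(OR_DUCT_PER_MM) * (duct_mm - ref_duct_mm)
        + math.log(OR_BMI_PER_UNIT) * (bmi - ref_bmi)
    )
    return math.exp(log_odds_ratio)


# A soft pancreas with a 3 mm duct and BMI 30, compared with the reference.
print(f"relative odds = {relative_odds(True, 3.0, 30.0):.2f}")
```

Working on the log scale mirrors how a logistic model adds predictor contributions before converting back to odds.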
Merolla, Giovanni; Corona, Katia; Zanoli, Gustavo; Cerciello, Simone; Giannotti, Stefano; Porcellini, Giuseppe
2017-12-01
The Kerlan-Jobe Orthopaedic Clinic (KJOC) Shoulder and Elbow score is a reliable and sensitive tool to measure the performance of overhead athletes. The purpose of this study was to carry out a cross-cultural adaptation and validation of the KJOC questionnaire in Italian and to assess its reliability, validity, and responsiveness. Ninety professional athletes with a painful shoulder were included in this study and were assigned to the "injury group" (n = 32) or the "overuse group" (n = 58); 65 were managed conservatively and 25 were treated by arthroscopic surgery. To assess the reliability of the KJOC score, patients were asked to fill in the questionnaire at baseline and after 2 weeks. To test the construct validity, KJOC scores were compared to those obtained with the Italian version of the Disabilities of the Arm, Shoulder, and Hand (DASH) scale, and with the DASH sports/performing arts module. To test KJOC score responsiveness, the follow-up KJOC scores of the participants treated conservatively were compared to those of the patients treated by arthroscopic surgery. Statistical analysis demonstrated that the KJOC questionnaire is reliable in terms of the single items and the overall score (ICC 0.95-0.99); that it has high construct validity (r_s = -0.697; p < 0.01); and that it is responsive to clinical differences in shoulder function (p < 0.0001). The Italian version of the KJOC Shoulder and Elbow score performed in a similar way to the English version and demonstrated good validity, reliability, and responsiveness after conservative and surgical treatment. II.
The development and validation of a test of science critical thinking for fifth graders.
Mapeala, Ruslan; Siew, Nyet Moi
2015-01-01
The paper described the development and validation of the Test of Science Critical Thinking (TSCT) to measure the three critical thinking skill constructs: comparing and contrasting, sequencing, and identifying cause and effect. The initial TSCT consisted of 55 multiple choice test items, each of which required participants to select a correct response and a correct choice of critical thinking used for their response. Data were obtained from a purposive sampling of 30 fifth graders in a pilot study carried out in a primary school in Sabah, Malaysia. Students underwent the sessions of teaching and learning activities for 9 weeks using the Thinking Maps-aided Problem-Based Learning Module before they answered the TSCT test. Analyses were conducted to check on difficulty index (p) and discrimination index (d), internal consistency reliability, content validity, and face validity. Analysis of the test-retest reliability data was conducted separately for a group of fifth graders with similar ability. Findings of the pilot study showed that out of initial 55 administered items, only 30 items with relatively good difficulty index (p) ranged from 0.40 to 0.60 and with good discrimination index (d) ranged within 0.20-1.00 were selected. The Kuder-Richardson reliability value was found to be appropriate and relatively high with 0.70, 0.73 and 0.92 for identifying cause and effect, sequencing, and comparing and contrasting respectively. The content validity index obtained from three expert judgments equalled or exceeded 0.95. In addition, test-retest reliability showed good, statistically significant correlations ([Formula: see text]). From the above results, the selected 30-item TSCT was found to have sufficient reliability and validity and would therefore represent a useful tool for measuring critical thinking ability among fifth graders in primary science.
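The item statistics named in this abstract can be sketched as follows. This is an illustrative outline, not the authors' procedure: it assumes dichotomous (0/1) item scores, defines the discrimination index as the difference in proportion correct between the top and bottom 27% of examinees (a common convention that the abstract does not confirm), and uses the population variance of total scores in KR-20. All function names are hypothetical.

```python
from statistics import pvariance


def item_difficulty(responses, item):
    """Difficulty index p: proportion of examinees answering `item` correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.27):
    """Discrimination index d: p(upper group) - p(lower group).

    Groups are the top and bottom `fraction` of examinees by total score
    (27% is an assumed convention).
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / n


def kr20(responses):
    """Kuder-Richardson 20 reliability for 0/1 item scores."""
    k = len(responses[0])
    total_var = pvariance([sum(row) for row in responses])
    sum_pq = 0.0
    for i in range(k):
        p = item_difficulty(responses, i)
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / total_var)


# Hypothetical 6-examinee, 4-item score matrix (rows = examinees).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(item_difficulty(scores, 0), item_discrimination(scores, 2), kr20(scores))
```

Under the selection rule described in the abstract, an item would be retained when its p falls between 0.40 and 0.60 and its d between 0.20 and 1.00.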
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs.
Harvey, Naomi D; Craigon, Peter J; Blythe, Simon A; England, Gary C W; Asher, Lucy
2017-01-01
Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales excepting Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs.
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.
The prone bridge test: Performance, validity, and reliability among older and younger adults.
Bohannon, Richard W; Steffl, Michal; Glenney, Susan S; Green, Michelle; Cashwell, Leah; Prajerova, Kveta; Bunn, Jennifer
2018-04-01
The prone bridge maneuver, or plank, has been viewed as a potential alternative to curl-ups for assessing trunk muscle performance. The purpose of this study was to assess prone bridge test performance, validity, and reliability among younger and older adults. Sixty younger (20-35 years old) and 60 older (60-79 years old) participants completed this study. Groups were evenly divided by sex. Participants completed surveys regarding physical activity and abdominal exercise participation. Height, weight, body mass index (BMI), and waist circumference were measured. On two occasions, 5-9 days apart, participants held a prone bridge until volitional exhaustion or until repeated technique failure. Validity was examined using data from the first session: convergent validity by calculating correlations between survey responses, anthropometrics, and prone bridge time, known groups validity by using an ANOVA comparing bridge times of younger and older adults and of men and women. Test-retest reliability was examined by using a paired t-test to compare prone bridge times for Session 1 and Session 2. Furthermore, an intraclass correlation coefficient (ICC) was used to characterize relative reliability and minimal detectable change (MDC95%) was used to describe absolute reliability. The mean prone bridge time was 145.3 ± 71.5 s, and was positively correlated with physical activity participation (p ≤ 0.001) and negatively correlated with BMI and waist circumference (p ≤ 0.003). Younger participants had significantly longer plank times than older participants (p = 0.003). The ICC between testing sessions was 0.915. The prone bridge test is a valid and reliable measure for evaluating abdominal performance in both younger and older adults. Copyright © 2017 Elsevier Ltd. All rights reserved.
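The relationship between the ICC and the minimal detectable change can be sketched with the commonly used formulas SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM. This is an illustration under assumptions: the abstract does not report its MDC95 value or say which standard deviation was used, so the call below plugs in the reported SD (71.5 s) and ICC (0.915) only as example inputs, and its output should not be read as the study's result. Function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd: float, icc: float) -> float:
    """Minimal detectable change at 95% confidence.

    The sqrt(2) factor reflects measurement error in both
    of the two sessions being compared.
    """
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Illustrative inputs taken from the abstract (see caveats above).
print(mdc95(sd=71.5, icc=0.915))
```

A change in hold time smaller than the resulting value could plausibly be explained by measurement error alone.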
Validity and inter-observer reliability of subjective hand-arm vibration assessments.
Coenen, Pieter; Formanoy, Margriet; Douwes, Marjolein; Bosch, Tim; de Kraker, Heleen
2014-07-01
Exposure to mechanical vibrations at work (e.g., due to handling powered tools) is a potential occupational risk as it may cause upper extremity complaints. However, reliable and valid assessment methods for vibration exposure at work are lacking. Measuring hand-arm vibration objectively is often difficult and expensive, while the often-used information provided by manufacturers lacks detail. Therefore, a subjective hand-arm vibration assessment method was tested on validity and inter-observer reliability. In an experimental protocol, sixteen tasks handling powered tools were executed by two workers. Hand-arm vibration was assessed subjectively by 16 observers according to the proposed subjective assessment method. As a gold standard reference, hand-arm vibration was measured objectively using a vibration measurement device. Weighted κ's were calculated to assess validity, and intra-class correlation coefficients (ICCs) were calculated to assess inter-observer reliability. Inter-observer reliability of the subjective assessments depicting the agreement among observers can be expressed by an ICC of 0.708 (0.511-0.873). The validity of the subjective assessments as compared to the gold-standard reference can be expressed by a weighted κ of 0.535 (0.285-0.785). Besides, the percentage of exact agreement of the subjective assessment compared to the objective measurement was relatively low (i.e., 52% of all tasks). This study shows that subjectively assessed hand-arm vibrations are fairly reliable among observers and moderately valid. This assessment method is a first attempt to use subjective risk assessments of hand-arm vibration. Although this assessment method can benefit from some future improvement, it can be of use in future studies and in field-based ergonomic assessments. Copyright © 2014 Elsevier Ltd and The Ergonomics Society. All rights reserved.
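A weighted κ of the kind reported above can be computed as in the sketch below. It assumes ordinal categories coded 0..k-1 and linear disagreement weights |i − j|; the abstract does not say whether linear or quadratic weights were used, so the weighting here is an assumption, and the function name and example ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linearly weighted Cohen's kappa for two sets of ordinal ratings.

    kappa_w = 1 - sum(w_ij * O_ij) / sum(w_ij * E_ij), with w_ij = |i - j|,
    O the observed proportion table and E the table expected by chance.
    """
    n = len(ratings_a)
    k = n_categories
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    disagreement_obs = sum(
        abs(i - j) * observed[i][j] for i in range(k) for j in range(k)
    )
    disagreement_exp = sum(
        abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    if disagreement_exp == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - disagreement_obs / disagreement_exp


# Hypothetical ratings: subjective class vs. measured class for six tasks.
subjective = [0, 1, 2, 2, 1, 0]
measured = [0, 1, 1, 2, 2, 0]
print(weighted_kappa(subjective, measured, n_categories=3))
```

Unlike exact agreement, the weighted form gives partial credit when an observer misses the reference class by only one category.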
Cross-Cultural Adaptation and Validation of the Italian Version of SWAL-QOL.
Ginocchio, Daniela; Alfonsi, Enrico; Mozzanica, Francesco; Accornero, Anna Rosa; Bergonzoni, Antonella; Chiarello, Giulia; De Luca, Nicoletta; Farneti, Daniele; Marilia, Simonelli; Calcagno, Paola; Turroni, Valentina; Schindler, Antonio
2016-10-01
The aim of the study was to evaluate the reliability and validity of the Italian SWAL-QOL (I-SWAL-QOL). The study consisted of five phases: item generation, reliability analysis, normative data generation, validity analysis, and responsiveness analysis. The item generation phase followed the five-step, cross-cultural, adaptation process of translation and back-translation. A group of 92 dysphagic patients was enrolled for the internal consistency analysis. Seventy-eight patients completed the I-SWAL-QOL twice, 2 weeks apart, for test-retest reliability analysis. A group of 200 asymptomatic subjects completed the I-SWAL-QOL for normative data generation. I-SWAL-QOL scores obtained by both the group of dysphagic subjects and asymptomatic ones were compared for validity analysis. I-SWAL-QOL scores were correlated with SF-36 scores in 67 patients with dysphagia for concurrent validity analysis. Finally, I-SWAL-QOL scores obtained in a group of 30 dysphagic patients before and after successful rehabilitation treatment were compared for responsiveness analysis. All the enrolled patients managed to complete the I-SWAL-QOL without needing any assistance, within 20 min. Internal consistency was acceptable for all I-SWAL-QOL subscales (α > 0.70). Test-retest reliability was also satisfactory for all subscales (ICC > 0.7). A significant difference between the dysphagic group and the control group was found in all I-SWAL-QOL subscales (p < 0.05). Mild to moderate correlations between I-SWAL-QOL and SF-36 subscales were observed. I-SWAL-QOL scores obtained in the pre-treatment condition were significantly lower than those obtained after swallowing rehabilitation. I-SWAL-QOL is reliable, valid, responsive to changes in QOL, and recommended for clinical practice and outcome research.
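The internal consistency criterion (α > 0.70) refers to Cronbach's alpha, which can be sketched as below. This is a generic illustration, assuming a complete respondent-by-item score matrix for one subscale and sample variances throughout; the function name and the example data are hypothetical and not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondent-by-item score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical subscale: 5 respondents, 3 items scored 1-5.
subscale = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(subscale)
print(alpha, alpha > 0.70)
```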
Validation of SAM 2 and SAGE satellite
NASA Technical Reports Server (NTRS)
Kent, G. S.; Wang, P.-H.; Farrukh, U. O.; Yue, G. K.
1987-01-01
Presented are the results of a validation study of data obtained by the Stratospheric Aerosol and Gas Experiment I (SAGE I) and Stratospheric Aerosol Measurement II (SAM II) satellite experiments. The study includes the entire SAGE I data set (February 1979 - November 1981) and the first four and one-half years of SAM II data (October 1978 - February 1983). These data sets have been validated by their use in the analysis of dynamical, physical and chemical processes in the stratosphere. They have been compared with other existing data sets and the SAGE I and SAM II data sets intercompared where possible. The study has shown the data to be of great value in the study of the climatological behavior of stratospheric aerosols and ozone. Several scientific publications and user-oriented data summaries have appeared as a result of the work carried out under this contract.
Bertram, Christof A; Gurtner, Corinne; Dettwiler, Martina; Kershaw, Olivia; Dietert, Kristina; Pieper, Laura; Pischon, Hannah; Gruber, Achim D; Klopfleisch, Robert
2018-07-01
Integration of new technologies, such as digital microscopy, into a highly standardized laboratory routine requires the validation of its performance in terms of reliability, specificity, and sensitivity. However, a validation study of digital microscopy is currently lacking in veterinary pathology. The aim of the current study was to validate the usability of digital microscopy in terms of diagnostic accuracy, speed, and confidence for diagnosing and differentiating common canine cutaneous tumor types and to compare it to classical light microscopy. Therefore, 80 histologic sections including 17 different skin tumor types were examined twice as glass slides and twice as digital whole-slide images by 6 pathologists with different levels of experience at 4 time points. Comparison of both methods found digital microscopy to be noninferior for differentiating individual tumor types within the category epithelial and mesenchymal tumors, but diagnostic concordance was slightly lower for differentiating individual round cell tumor types by digital microscopy. In addition, digital microscopy was associated with significantly shorter diagnostic time, but diagnostic confidence was lower and technical quality was considered inferior for whole-slide images compared with glass slides. Of note, diagnostic performance for whole-slide images scanned at 200× magnification was noninferior to that for slides scanned at 400×. In conclusion, digital microscopy differs only minimally from light microscopy in few aspects of diagnostic performance and overall appears adequate for the diagnosis of individual canine cutaneous tumors with minor limitations for differentiating individual round cell tumor types and grading of mast cell tumors.
Validation of 2D flood models with insurance claims
NASA Astrophysics Data System (ADS)
Zischg, Andreas Paul; Mosimann, Markus; Bernet, Daniel Benjamin; Röthlisberger, Veronika
2018-02-01
Flood impact modelling requires reliable models for the simulation of flood processes. In recent years, flood inundation models have been remarkably improved and widely used for flood hazard simulation, flood exposure and loss analyses. In this study, we validate a 2D inundation model for the purpose of flood exposure analysis at the river reach scale. We validate the BASEMENT simulation model with insurance claims using conventional validation metrics. The flood model is established on the basis of available topographic data in a high spatial resolution for four test cases. The validation metrics were calculated with two different datasets; a dataset of event documentations reporting flooded areas and a dataset of insurance claims. The model fit relating to insurance claims is in three out of four test cases slightly lower than the model fit computed on the basis of the observed inundation areas. This comparison between two independent validation data sets suggests that validation metrics using insurance claims can be compared to conventional validation data, such as the flooded area. However, a validation on the basis of insurance claims might be more conservative in cases where model errors are more pronounced in areas with a high density of values at risk.
Nazary-Moghadam, Salman; Zeinalzadeh, Afsaneh; Salavati, Mahyar; Almasi, Simin; Negahban, Hossein
2017-01-01
The aim of the present study was to culturally adapt and evaluate the reliability and validity of the Health Assessment Questionnaire-Disability Index (HAQ-DI) in Iranian patients with rheumatoid arthritis (RA). A total of 234 patients with RA took part in the validation study and 86 participants in the reliability study. Test-retest relative reliability and internal consistency of the Persian version of the HAQ-DI were examined by intraclass correlation coefficient (ICC) and Cronbach's alpha, respectively. Additionally, HAQ-DI construct validity (Spearman's correlation) was examined using the Persian version of the Short-Form 36 Health Survey (SF-36) and activity and severity parameters. The Persian version of the HAQ-DI total score showed excellent test-retest reliability (ICC = 0.98) and internal consistency (Cronbach's alpha = 0.95). Spearman's correlations between the total PHAQ-DI score and activity and severity parameters were above 0.55. Correlations between PHAQ-DI and SF-36 Physical Health were higher than those with SF-36 Mental Health. The Persian version of the HAQ-DI is a reliable and valid culturally adapted instrument for measuring functional limitations in Iranian people with RA. Copyright © 2016 Elsevier Ltd. All rights reserved.
Noor, Norhayati Mohd; Aziz, Aniza Abd; Mostapa, Mohd Rosmizaki; Awang, Zainudin
2015-01-01
This study was designed to examine the psychometric properties of Malay version of the Inventory of Functional Status after Childbirth (IFSAC). A cross-sectional study. A total of 108 postpartum mothers attending Obstetrics and Gynaecology Clinic, in a tertiary teaching hospital in Malaysia, were involved. Construct validity and internal consistency were performed after the translation, content validity, and face validity process. The data were analyzed using Analysis of Moment Structure version 18 and Statistical Packages for the Social Sciences version 20. The final model consists of four constructs, namely, infant care, personal care, household activities, and social and community activities, with 18 items demonstrating acceptable factor loadings, domain to domain correlation, and best fit (Chi-squared/degree of freedom = 1.678; Tucker-Lewis index = 0.923; comparative fit index = 0.936; and root mean square error of approximation = 0.080). Composite reliability and average variance extracted of the domains ranged from 0.659 to 0.921 and from 0.499 to 0.628, respectively. The study suggested that the four-factor model with 18 items of the Malay version of IFSAC was acceptable to be used to measure functional status after childbirth because it is valid, reliable, and simple.
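Composite reliability (CR) and average variance extracted (AVE) are usually derived from standardized factor loadings. The sketch below uses the commonly cited formulas CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and AVE = Σλ² / n. It assumes standardized loadings with uncorrelated errors; the loadings shown are hypothetical, since the abstract reports only the resulting ranges.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    With standardized loadings, each item's error variance is 1 - loading^2.
    """
    sum_sq = sum(loadings) ** 2
    error = sum(1.0 - l * l for l in loadings)
    return sum_sq / (sum_sq + error)


def average_variance_extracted(loadings):
    """AVE = mean of squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


# Hypothetical standardized loadings for one four-item construct.
loadings = [0.72, 0.81, 0.65, 0.78]
print(composite_reliability(loadings), average_variance_extracted(loadings))
```

Common rules of thumb treat CR of about 0.70 or more and AVE of about 0.50 or more as acceptable, which is the frame in which ranges such as those reported here are typically read.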
Baumgart, Christian; Polglaze, Ted; Freiwald, Jürgen
2018-01-01
This study aimed to investigate the validity and reliability of global (GPS) and local (LPS) positioning systems for measuring distances covered and sprint mechanical properties in team sports. Here, we evaluated two recently released 18 Hz GPS and 20 Hz LPS technologies together with one established 10 Hz GPS technology. Six male athletes (age: 27±2 years; VO2max: 48.8±4.7 ml/min/kg) performed outdoors on 10 trials of a team sport-specific circuit that was equipped with double-light timing gates. The circuit included various walking, jogging, and sprinting sections that were performed either in straight-lines or with changes of direction. During the circuit, athletes wore two devices of each positioning system. From the reported and filtered velocity data, the distances covered and sprint mechanical properties (i.e., the theoretical maximal horizontal velocity, force, and power output) were computed. The sprint mechanical properties were modeled via an inverse dynamic approach applied to the center of mass. The validity was determined by comparing the measured and criterion data via the typical error of estimate (TEE), whereas the reliability was examined by comparing the two devices of each technology (i.e., the between-device reliability) via the coefficient of variation (CV). Outliers due to measurement errors were statistically identified and excluded from validity and reliability analyses. The 18 Hz GPS showed better validity and reliability for determining the distances covered (TEE: 1.6–8.0%; CV: 1.1–5.1%) and sprint mechanical properties (TEE: 4.5–14.3%; CV: 3.1–7.5%) than the 10 Hz GPS (TEE: 3.0–12.9%; CV: 2.5–13.0% and TEE: 4.1–23.1%; CV: 3.3–20.0%). However, the 20 Hz LPS demonstrated superior validity and reliability overall (TEE: 1.0–6.0%; CV: 0.7–5.0% and TEE: 2.1–9.2%; CV: 1.6–7.3%). For the 10 Hz GPS, 18 Hz GPS, and 20 Hz LPS, the relative loss of data sets due to measurement errors was 10.0%, 20.0%, and 15.8%, respectively. 
This study shows that 18 Hz GPS has enhanced validity and reliability for determining movement patterns in team sports compared to 10 Hz GPS, whereas 20 Hz LPS had superior validity and reliability overall. However, compared to 10 Hz GPS, 18 Hz GPS and 20 Hz LPS technologies had more outliers due to measurement errors, which limits their practical applications at this time. PMID:29420620
The Validation of a Case-Based, Cumulative Assessment and Progressions Examination
Coker, Adeola O.; Copeland, Jeffrey T.; Gottlieb, Helmut B.; Horlen, Cheryl; Smith, Helen E.; Urteaga, Elizabeth M.; Ramsinghani, Sushma; Zertuche, Alejandra; Maize, David
2016-01-01
Objective. To assess content and criterion validity, as well as reliability of an internally developed, case-based, cumulative, high-stakes third-year Annual Student Assessment and Progression Examination (P3 ASAP Exam). Methods. Content validity was assessed through the writing-reviewing process. Criterion validity was assessed by comparing student scores on the P3 ASAP Exam with the nationally validated Pharmacy Curriculum Outcomes Assessment (PCOA). Reliability was assessed with psychometric analysis comparing student performance over four years. Results. The P3 ASAP Exam showed content validity through representation of didactic courses and professional outcomes. Similar scores on the P3 ASAP Exam and PCOA with Pearson correlation coefficient established criterion validity. Consistent student performance using Kuder-Richardson coefficient (KR-20) since 2012 reflected reliability of the examination. Conclusion. Pharmacy schools can implement internally developed, high-stakes, cumulative progression examinations that are valid and reliable using a robust writing-reviewing process and psychometric analyses. PMID:26941435
de Freitas, Ricardo Miguel Costa; Andrade, Celi Santos; Caldas, José Guilherme Mendes Pereira; Kanas, Alexandre Fligelman; Cabral, Richard Halti; Tsunemi, Miriam Harumi; Rodríguez, Hernán Joel Cervantes; Rabbani, Said Rahnamaye
2015-05-01
New spinal interventions or implants have been tested on ex vivo or in vivo porcine spines, as they are readily available and have been accepted as a comparable model to human cadaver spines. Imaging-guided interventional procedures of the spine are mostly based on fluoroscopy or, still, on multidetector computed tomography (MDCT). Cone-beam computed tomography (CBCT) and magnetic resonance imaging (MRI) are also available methods to guide interventional procedures. Although some MDCT data from porcine spines are available in the literature, validation of the measurements on CBCT and MRI is lacking. To describe and compare the anatomical measurements accomplished with MDCT, CBCT, and MRI of lumbar porcine spines to determine if CBCT and MRI are also useful methods for experimental studies. An experimental descriptive-comparative study. Sixteen anatomical measurements of an individual vertebra from six lumbar porcine spines (n=36 vertebrae) were compared with their MDCT, CBCT, and MRI equivalents. Comparisons were made for the absolute values of the parameters. Similarities were found in all imaging methods. Significant correlation (p<.05) was observed with all variables except those that included cartilaginous tissue from the end plates when the anatomical study was compared with the imaging methods. The CBCT and MRI provided imaging measurements of the lumbar porcine spines that were similar to the anatomical and MDCT data, and they can be useful for specific experimental research studies. Copyright © 2015 Elsevier Inc. All rights reserved.
Lam, Lucia L.; Ghadessi, Mercedeh; Erho, Nicholas; Vergara, Ismael A.; Alshalalfa, Mohammed; Buerki, Christine; Haddad, Zaid; Sierocinski, Thomas; Triche, Timothy J.; Skinner, Eila C.; Davicioni, Elai; Daneshmand, Siamak; Black, Peter C.
2014-01-01
Background Nearly half of muscle-invasive bladder cancer patients succumb to their disease following cystectomy. Selecting candidates for adjuvant therapy is currently based on clinical parameters with limited predictive power. This study aimed to develop and validate genomic-based signatures that can better identify patients at risk for recurrence than clinical models alone. Methods Transcriptome-wide expression profiles were generated using 1.4 million feature-arrays on archival tumors from 225 patients who underwent radical cystectomy and had muscle-invasive and/or node-positive bladder cancer. Genomic (GC) and clinical (CC) classifiers for predicting recurrence were developed on a discovery set (n = 133). Performances of GC, CC, an independent clinical nomogram (IBCNC), and genomic-clinicopathologic classifiers (G-CC, G-IBCNC) were assessed in the discovery and independent validation (n = 66) sets. GC was further validated on four external datasets (n = 341). Discrimination and prognostic abilities of classifiers were compared using area under receiver-operating characteristic curves (AUCs). All statistical tests were two-sided. Results A 15-feature GC was developed on the discovery set with area under curve (AUC) of 0.77 in the validation set. This was higher than individual clinical variables, IBCNC (AUC = 0.73), and comparable to CC (AUC = 0.78). Performance was improved upon combining GC with clinical nomograms (G-IBCNC, AUC = 0.82; G-CC, AUC = 0.86). G-CC high-risk patients had elevated recurrence probabilities (P < .001), with GC being the best predictor by multivariable analysis (P = .005). Genomic-clinicopathologic classifiers outperformed clinical nomograms by decision curve and reclassification analyses. GC performed the best in validation compared with seven prior signatures. GC markers remained prognostic across four independent datasets. 
Conclusions The validated genomic-based classifiers outperform clinical models for predicting postcystectomy bladder cancer recurrence. This may be used to better identify patients who need more aggressive management. PMID:25344601
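The AUC used to compare these classifiers can be read as the probability that a randomly chosen patient with recurrence receives a higher score than a randomly chosen patient without recurrence. The sketch below computes it by direct pairwise comparison, counting ties as one half. The scores are hypothetical and the function name is illustrative; this is not the study's pipeline.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Each (positive, negative) pair contributes 1 if the positive case
    scores higher, 0.5 on a tie, and 0 otherwise.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical classifier scores.
recurrence = [0.9, 0.8, 0.4]
no_recurrence = [0.7, 0.3, 0.2]
print(auc(recurrence, no_recurrence))
```

The pairwise form is quadratic in sample size, which is fine for cohorts of a few hundred patients; rank-based formulations give the same value more efficiently.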
ERIC Educational Resources Information Center
Varela Mato, Veronica; Yates, Thomas; Stensel, David; Biddle, Stuart; Clemes, Stacy A.
2017-01-01
This study explored the validity of ActiGraph-determined sedentary time (<50 cpm, <100 cpm, <150 cpm, <200 cpm, <250 cpm) compared with the activPAL in a free-living sample of bus drivers. Twenty-eight participants were recruited between November 2013 and February 2014. Participants wore an activPAL3 and ActiGraph GT3X+ concurrently…
ERIC Educational Resources Information Center
Murphy, Joseph J.; Murphy, Marie H.; MacDonncha, Ciaran; Murphy, Niamh; Nevill, Alan M.; Woods, Catherine B.
2017-01-01
The purpose of this study was to compare the validity and reliability of three short physical activity self-report instruments to determine their potential for use with university student populations. The participants (N = 155; 44.5% male; 22.9 ± 5.13 years) wore an accelerometer for 9 consecutive days and completed a single-item measure, the a…
NASA Technical Reports Server (NTRS)
Laymon, Charles A.; Crosson, William L.; Limaye, Ashutosh; Manu, Andrew; Archer, Frank
2005-01-01
We compare soil moisture retrieved with an inverse algorithm with observations of mean moisture in the 0-6 cm soil layer. A significant discrepancy is noted between the retrieved and observed moisture. Using emitting depth functions as weighting functions to convert the observed mean moisture to observed effective moisture removes nearly one-half of the discrepancy noted. This result has important implications in remote sensing validation studies.
Schmettow, Martin; Schnittker, Raphaela; Schraagen, Jan Maarten
2017-05-01
This paper proposes and demonstrates an extended protocol for usability validation testing of medical devices. A review of currently used methods for the usability evaluation of medical devices revealed two main shortcomings: first, a lack of methods to closely trace the interaction sequences and derive performance measures; second, a prevailing focus on cross-sectional validation studies, ignoring the issues of learnability and training. The U.S. Food and Drug Administration's recent proposal for a validation testing protocol for medical devices is then extended to address these shortcomings: (1) a novel process measure 'normative path deviations' is introduced that is useful for both quantitative and qualitative usability studies and (2) a longitudinal, completely within-subject study design is presented that assesses learnability, training effects and allows analysis of diversity of users. A reference regression model is introduced to analyze data from this and similar studies, drawing upon generalized linear mixed-effects models and a Bayesian estimation approach. The extended protocol is implemented and demonstrated in a study comparing a novel syringe infusion pump prototype to an existing design with a sample of 25 healthcare professionals. Strong performance differences between designs were observed with a variety of usability measures, as well as varying training-on-the-job effects. We discuss our findings with regard to validation testing guidelines, reflect on the extensions and discuss the perspectives they add to the validation process. Copyright © 2017 Elsevier Inc. All rights reserved.
Zheng, Chengyi; Luo, Yi; Mercado, Cheryl; Sy, Lina; Jacobsen, Steven J; Ackerson, Brad; Lewin, Bruno; Tseng, Hung Fu
2018-06-19
Diagnosis codes are inadequate for accurately identifying herpes zoster ophthalmicus (HZO). There is a significant lack of population-based studies on HZO due to the high expense of manual review of medical records. To assess whether HZO can be identified from the clinical notes using natural language processing (NLP). To investigate the epidemiology of HZO among the HZ population based on the developed approach. A retrospective cohort analysis. A total of 49,914 southern California residents aged over 18 years, who had a new diagnosis of HZ. An NLP-based algorithm was developed and validated with the manually curated validation dataset (n=461). The algorithm was applied on over 1 million clinical notes associated with the study population. HZO versus non-HZO cases were compared by age, sex, race, and comorbidities. We measured the accuracy of the NLP algorithm. The NLP algorithm achieved 95.6% sensitivity and 99.3% specificity. Compared to the diagnosis codes, NLP identified significantly more HZO cases among the HZ population (13.9% versus 1.7%). Compared to the non-HZO group, the HZO group was older, had more males, had more Whites, and had more outpatient visits. We developed and validated an automatic method to identify HZO cases with high accuracy. As one of the largest studies on HZO, our finding emphasizes the importance of preventing HZ in the elderly population. This method can be a valuable tool to support population-based studies and clinical care of HZO in the era of big data.
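Sensitivity and specificity of a case-finding algorithm are computed against a manually reviewed reference set. A minimal sketch follows; the confusion-matrix counts are hypothetical, because the abstract gives the size of the validation set (n=461) but not its breakdown, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of reference-standard cases the algorithm flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of reference-standard non-cases the algorithm leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts from comparing algorithm output with chart review.
tp, fn, tn, fp = 8, 2, 90, 10
print(sensitivity(tp, fn), specificity(tn, fp))
```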
Vaughan, Frances L; Neal, Jo Anne; Mulla, Farzana Nizam; Edwards, Barbara; Coetzer, Rudi
2017-04-01
The Brain Injury Cognitive Screen (BICS) was developed as an in-service cognitive assessment battery for acquired brain injury patients entering community rehabilitation. The BICS focuses on domains that are particularly compromised following TBI, and provides a broader and more detailed assessment of executive function, attention and information processing than comparable screening assessments. The BICS also includes brief assessments of perception, naming, and construction, which were predicted to be more sensitive to impairments following non-traumatic brain injury. The studies reported here examine preliminary evidence for its validity in post-acute rehabilitation. In Study 1, TBI patients completed the BICS and were compared with matched controls. Patients with focal lesions and matched controls were compared in Study 2. Study 3 examined demographic effects in a sample of normative data. TBI and focal lesion patients obtained significantly lower composite memory, executive function and attention and information processing BICS scores than healthy controls. Injury severity effects were also obtained. Logistic regression analyses indicated that each group of BICS memory, executive function and attention measures reliably differentiated TBI and focal lesion participants from controls. Design Recall, Prospective Memory, Verbal Fluency, and Visual Search test scores showed significant independent regression effects. Other subtest measures showed evidence of sensitivity to brain injury. The study provides preliminary evidence of the BICS' sensitivity to cognitive impairment caused by acquired brain injury, and its potential clinical utility as a cognitive screen. Further validation based on a revised version of the BICS and more normative data are required.
Munkácsy, Gyöngyi; Sztupinszki, Zsófia; Herman, Péter; Bán, Bence; Pénzváltó, Zsófia; Szarvas, Nóra; Győrffy, Balázs
2016-09-27
No independent cross-validation of success rate for studies utilizing small interfering RNA (siRNA) for gene silencing has been completed before. To assess the influence of experimental parameters such as cell line, transfection technique, validation method, and type of control, these must be validated in a large set of studies. We utilized gene chip data published for siRNA experiments to assess success rate and to compare methods used in these experiments. We searched NCBI GEO for samples with whole transcriptome analysis before and after gene silencing and evaluated the efficiency for the target and off-target genes using the array-based expression data. The Wilcoxon signed-rank test was used to assess silencing efficacy, and Kruskal-Wallis tests and Spearman rank correlation were used to evaluate study parameters. Altogether, 1,643 samples representing 429 experiments published in 207 studies were evaluated. The fold change (FC) of down-regulation of the target gene was above 0.7 in 18.5% and was above 0.5 in 38.7% of experiments. Silencing efficiency was lowest in MCF7 and highest in SW480 cells (FC = 0.59 and FC = 0.30, respectively, P = 9.3E-06). Studies utilizing Western blot for validation performed better than those with quantitative polymerase chain reaction (qPCR) or microarray (FC = 0.43, FC = 0.47, and FC = 0.55, respectively, P = 2.8E-04). There was no correlation between type of control, transfection method, publication year, and silencing efficiency. Although gene silencing is a robust feature successfully cross-validated in the majority of experiments, efficiency remained insufficient in a significant proportion of studies. Selection of cell line model and validation method had the highest influence on silencing proficiency.
Reliability and Validity of Wisconsin Upper Respiratory Symptom Survey, Korean Version
Yang, Su-Young; Kang, Weechang; Yeo, Yoon; Park, Yang-Chun
2011-01-01
Background The Wisconsin Upper Respiratory Symptom Survey (WURSS) is a self-administered questionnaire developed in the United States to evaluate the severity of the common cold and its reliability has been validated. We developed a Korean language version of this questionnaire by using a sequential forward and backward translation approach. The purpose of this study was to validate the Korean version of the Wisconsin Upper Respiratory Symptom Survey (WURSS-K) in Korean patients with common cold. Methods This multicenter prospective study enrolled 107 participants who were diagnosed with common cold and consented to participate in the study. The WURSS-K includes 1 global illness severity item, 32 symptom-based items, 10 functional quality-of-life (QOL) items, and 1 item assessing global change. The SF-8 was used as an external comparator. Results The participants were 54 women and 53 men aged 18 to 42 years. The WURSS-K showed good reliability in 10 domains, with Cronbach’s alphas ranging from 0.67 to 0.96 (mean: 0.84). Comparison of the reliability coefficients of the WURSS-K and WURSS yielded a Pearson correlation coefficient of 0.71 (P = 0.02). Validity of the WURSS-K was evaluated by comparing it with the SF-8, which yielded a Pearson correlation coefficient of −0.267 (P < 0.001). The Guyatt’s responsiveness index of the WURSS-K ranged from 0.13 to 0.46, and the correlation coefficient with the WURSS was 0.534 (P < 0.001), indicating that there was close correlation between the WURSS-K and WURSS. Conclusions The WURSS-K is a reliable, valid, and responsive disease-specific questionnaire for assessing symptoms and QOL in Korean patients with common cold. PMID:21691034
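The reliability figures in the abstract above are Cronbach's alpha values. As a rough illustration of how such a coefficient is obtained, the sketch below applies the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to a small response matrix. The function name and the data are hypothetical; the study's own item scores are not available here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: list of k lists, each holding one item's scores for the
    same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: 3 items answered by 5 respondents.
demo_items = [
    [2, 4, 3, 5, 1],
    [3, 4, 2, 5, 1],
    [2, 5, 3, 4, 2],
]
print(round(cronbach_alpha(demo_items), 3))
```

Sample variances are used throughout; because the same divisor appears in the numerator and denominator, population variances would give the same alpha.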
Yip, T C-F; Ma, A J; Wong, V W-S; Tse, Y-K; Chan, H L-Y; Yuen, P-C; Wong, G L-H
2017-08-01
Non-alcoholic fatty liver disease (NAFLD) affects 20%-40% of the general population in developed countries and is an increasingly important cause of hepatocellular carcinoma. Although electronic medical records facilitate large-scale epidemiological studies, existing NAFLD scores often require clinical and anthropometric parameters that may not be captured in those databases. The aim was to develop and validate a laboratory parameter-based machine learning model to detect NAFLD in the general population. We randomly divided 922 subjects from a population screening study into training and validation groups; NAFLD was diagnosed by proton-magnetic resonance spectroscopy. On the basis of machine learning from 23 routine clinical and laboratory parameters after elastic net regularization, we evaluated the logistic regression, ridge regression, AdaBoost and decision tree models. The areas under the receiver-operating characteristic curve (AUROC) of the models in the validation group were compared. Six predictors including alanine aminotransferase, high-density lipoprotein cholesterol, triglyceride, haemoglobin A1c, white blood cell count and the presence of hypertension were selected. The NAFLD ridge score achieved AUROC of 0.87 (95% CI 0.83-0.90) and 0.88 (0.84-0.91) in the training and validation groups respectively. Using dual cut-offs of 0.24 and 0.44, NAFLD ridge score achieved 92% (86%-96%) sensitivity and 90% (86%-93%) specificity with corresponding negative and positive predictive values of 96% (91%-98%) and 69% (59%-78%), and 87% of overall accuracy among 70% of classifiable subjects in the validation group; 30% of subjects remained indeterminate. NAFLD ridge score is a simple and robust reference comparable to existing NAFLD scores to exclude NAFLD patients in epidemiological studies. © 2017 John Wiley & Sons Ltd.
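A dual cut-off rule like the one in the abstract above splits scores into three bands, which is why some subjects stay unclassified. The following Python sketch shows the idea using the two reported thresholds. The direction of the rule (low score rules out, high score rules in), the handling of scores exactly on a threshold, and all names are assumptions, as the abstract does not spell them out.

```python
LOWER_CUTOFF = 0.24  # reported lower cut-off
UPPER_CUTOFF = 0.44  # reported upper cut-off


def classify(score, lower=LOWER_CUTOFF, upper=UPPER_CUTOFF):
    """Three-way classification with dual cut-offs.

    Assumed convention: below `lower` rules NAFLD out, above `upper`
    rules it in, and anything in between (boundaries included) is
    left indeterminate.
    """
    if score < lower:
        return "NAFLD unlikely"
    if score > upper:
        return "NAFLD likely"
    return "indeterminate"


# Hypothetical scores for illustration.
for s in (0.10, 0.30, 0.90):
    print(s, classify(s))
```

Sensitivity and specificity under such a rule are computed only among the subjects who fall outside the indeterminate band, which matches the abstract's "70% of classifiable subjects".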
Iordanova, B.; Rosenbaum, D.; Norman, D.; Weiner, M.; Studholme, C.
2007-01-01
BACKGROUND AND PURPOSE Brain volumetry is widely used for evaluating tissue degeneration; however, the parcellation methods are rarely validated and use arbitrary planes to mark boundaries of brain regions. The goal of this study was to develop, validate, and apply an MR imaging tracing method for the parcellation of 3 major gyri of the frontal lobe, which uses only local landmarks intrinsic to the structures of interest, without the need for global reorientation or the use of dividing planes or lines. METHODS Studies were performed on 25 subjects—healthy controls and subjects diagnosed with Lewy body dementia and Alzheimer disease—with significant variation in the underlying gyral anatomy and state of atrophy. The protocol was evaluated by using multiple observers tracing scans of subjects diagnosed with neurodegenerative disease and those aging normally, and the results were compared by spatial overlap agreement. To confirm the results, observers marked the same locations in different brains. We illustrated the variabilities of the key boundaries that pose the greatest challenge to defining consistent parcellations across subjects. RESULTS The resulting gyral volumes were evaluated, and their consistency across raters was used as an additional assessment of the validity of our marking method. The agreement on a scale of 0–1 was found to be 0.83 spatial and 0.90 volumetric for the same rater and 0.85 spatial and 0.90 volumetric for 2 different raters. The results revealed that the protocol remained consistent across different neurodegenerative conditions. CONCLUSION Our method provides a simple and reliable way for the volumetric evaluation of frontal lobe neurodegeneration and can be used as a resource for larger comparative studies as well as a validation procedure of automated algorithms. PMID:16971629
Validation of Computerized Automatic Calculation of the Sequential Organ Failure Assessment Score
Harrison, Andrew M.; Pickering, Brian W.; Herasevich, Vitaly
2013-01-01
Purpose. To validate the use of a computer program for the automatic calculation of the sequential organ failure assessment (SOFA) score, as compared to the gold standard of manual chart review. Materials and Methods. Adult admissions (age > 18 years) to the medical ICU with a length of stay greater than 24 hours were studied in the setting of an academic tertiary referral center. A retrospective cross-sectional analysis was performed using a derivation cohort to compare automatic calculation of the SOFA score to the gold standard of manual chart review. After critical appraisal of sources of disagreement, another analysis was performed using an independent validation cohort. Then, a prospective observational analysis was performed using an implementation of this computer program in AWARE Dashboard, which is an existing real-time patient EMR system for use in the ICU. Results. Good agreement between the manual and automatic SOFA calculations was observed for both the derivation (N=94) and validation (N=268) cohorts: 0.02 ± 2.33 and 0.29 ± 1.75 points, respectively. These results were validated in AWARE (N=60). Conclusion. This EMR-based automatic tool accurately calculates SOFA scores and can facilitate ICU decisions without the need for manual data collection. This tool can also be employed in a real-time electronic environment. PMID:23936639
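The agreement figures in the abstract above (for example 0.02 ± 2.33 points) read like a mean paired difference with its standard deviation, in the spirit of a Bland-Altman comparison; that interpretation is an assumption, since the abstract does not name the statistic. A minimal Python sketch under that assumption, with hypothetical scores and a hypothetical function name:

```python
from statistics import mean, stdev


def paired_agreement(manual, automatic):
    """Mean and sample SD of (automatic - manual) paired differences."""
    if len(manual) != len(automatic):
        raise ValueError("both score lists must cover the same patients")
    diffs = [a - m for m, a in zip(manual, automatic)]
    return mean(diffs), stdev(diffs)


# Hypothetical SOFA scores for four patients, NOT study data.
manual_scores = [4, 6, 8, 10]
automatic_scores = [5, 6, 7, 10]
bias, spread = paired_agreement(manual_scores, automatic_scores)
print(f"{bias:.2f} ± {spread:.2f} points")
```

A mean difference near zero indicates little systematic bias, while the standard deviation shows how far individual automatic scores stray from the chart-review values.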
NASA Astrophysics Data System (ADS)
Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa
2018-03-01
In the recent decade, analysis of remotely sensed imagery has become one of the most common and widely used procedures in environmental studies, and supervised image classification techniques play a central role in it. Using a high-resolution Worldview-3 image of a mixed urbanized landscape in Iran, three less commonly applied image classification methods (Bagged CART, stochastic gradient boosting and neural network with feature extraction) were tested and compared with two prevalent methods: random forest and support vector machine with linear kernel. Each method was run ten times, and three validation techniques were used to estimate the accuracy statistics: cross validation, independent validation and validation with the total training data. Moreover, ANOVA and the Tukey test were used to assess the statistical significance of differences between the classification methods. In general, the results showed that random forest was the best-performing method, by a marginal difference over Bagged CART and stochastic gradient boosting, whereas under independent validation there was no significant difference between the performances of the classification methods. Finally, it should be noted that neural network with feature extraction and linear support vector machine had better processing speed than the other methods.
Lindberg, Marc A; Fugett, April; Thomas, Stuart W
2012-01-01
The Attachment and Clinical Issues Questionnaire (ACIQ; M. A. Lindberg & S. W. Thomas, 2011), which contains 29 scales, was developed over an 18-year period. The purpose of the present study was to test (a) the validity of the attachment scales in terms of how they predict to whom one turns in times of stress and for affective sharing, and (b) how the attachment scales compared with the Experiences in Close Relationship Questionnaire (ECR) in terms of concurrent, convergent, and discriminant evidence. The relevant secure scales of the ACIQ predicted to whom one turned in study 1, and study 2 demonstrated good convergent evidence with the ECR, but superior concurrent evidence in predicting partner satisfaction, and superior discriminant evidence in differentially correlating with mother and father warmth. Thus, the ACIQ passed essential validity and psychometric tests and was a more robust measure than the ECR with these defining characteristics of attachment.
Stavrakos, S-K; Ahmed-Kristensen, S; Goldman, T
2016-09-01
Designers at the conceptual phase of products such as headphones stress the importance of comfort, e.g. executing comfort studies, and the need for a reliable user panel. This paper proposes a methodology to establish a reliable user panel that represents large populations and validates the proposed framework for predicting comfort factors, such as physical fit. Data from 200 heads were analyzed by forming clusters, and 9 archetypal people were identified from an ear database of 200 people. The archetypes were validated by comparing the archetypes' responses on physical fit against those of 20 participants interacting with 6 headsets. This paper suggests a new method of selecting representative user samples for prototype testing, in contrast to costly and time-consuming methods that rely on the analysis of the human geometry of large populations. Copyright © 2016 Elsevier Ltd. All rights reserved.
Modeling the Dynamic Interrelations between Mobility, Utility, and Land Asking Price
NASA Astrophysics Data System (ADS)
Hidayat, E.; Rudiarto, I.; Siegert, F.; Vries, W. D.
2018-02-01
Limited and insufficient information about the dynamic interrelation among mobility, utility, and land price is the main motivation for this research. Several studies, using various approaches and variables, have been conducted so far to model land price. However, most of these models appear to generate primarily static land prices. Thus, research is required to compare, design, and validate different models which calculate and/or compare the inter-relational changes of mobility, utility, and land price. The applied method is a combination of literature review, expert interviews, and statistical analysis. The result is a newly improved mathematical model which has been validated and is suitable for the case study location. This improved model consists of 12 appropriate variables. It can be implemented in the city of Salatiga, the case study location, to arrange better land use planning and mitigate uncontrolled urban growth.
Behavior and emotional disturbance in Prader-Willi syndrome.
Einfeld, S L; Smith, A; Durvasula, S; Florio, T; Tonge, B J
1999-01-15
To determine if persons with the Prader-Willi syndrome (PWS) have increased psychopathology when compared with matched controls, and whether there is a specific behavior phenotype in PWS, the behavior of 46 persons with PWS was compared with that of control individuals derived from a community sample (N = 454) of persons with mental retardation (MR). Behaviors were studied using the Developmental Behaviour Checklist, an instrument of established validity in the evaluation of behavioral disturbance in individuals with MR. PWS subjects were found to be more behaviorally disturbed than controls overall, and especially in antisocial behavior. In addition, some individual behaviors were more common in PWS subjects than controls. When these behaviors are considered together with findings from other studies using acceptably rigorous methods, a consensus behavior phenotype for PWS can be formulated. This will provide a valid foundation for studies of the mechanism of genetic pathogenesis of behavior in PWS.
Smith, Lindsey P.; Hua, Jenna; Seto, Edmund; Du, Shufa; Zang, Jiajie; Zou, Shurong; Popkin, Barry M.; Mendez, Michelle A.
2014-01-01
This paper addresses the need for diet assessment methods that capture the rapidly changing beverage consumption patterns in China. The objective of this study was to develop a 3-day smartphone-assisted 24-hour recall to improve the quantification of beverage intake amongst young Chinese adults (n=110) and validate, in a small subset (n=34), the extent to which the written record and smartphone-assisted recalls adequately estimated total fluid intake, using 24-hour urine samples. The smartphone-assisted method showed improved validity compared to the written-assisted method, when comparing reported total fluid intake to total urine volume. However, participants reported consuming fewer beverages on the smartphone-assisted method compared to the written-assisted method, primarily due to decreased consumption of traditional zero-energy beverages (i.e. water, tea) in the smartphone-assisted method. It is unclear why participants reported fewer beverages in the smartphone-assisted method than the written-assisted method. One possibility is that participants found the smartphone method too cumbersome, and responded by decreasing beverage intake. These results suggest that smartphone-assisted 24-hour recalls perform comparably but do not appear to substantially improve beverage quantification compared to the current written record based approach. In addition, we piloted a beverage screener to identify consumers of episodically consumed SSBs. As expected, a substantially higher proportion of consumers reported consuming SSBs on the beverage screener compared to either recall type, suggesting that a beverage screener may be useful in characterizing consumption of episodically consumed beverages in China’s dynamic food and beverage landscape. PMID:25516327
van den Boer, Janet H W; Kranendonk, Jentina; van de Wiel, Anne; Feskens, Edith J M; Geelen, Anouk; Mars, Monica
2017-09-08
Observational studies performed in Asian populations suggest that eating rate is related to BMI. This paper investigates the association between self-reported eating rate (SRER) and body mass index (BMI) in a Dutch population, after having validated SRER against actual eating rate. Two studies were performed: a validation and a cross-sectional study. In the validation study SRER (i.e., 'slow', 'average', or 'fast') was obtained from 57 participants (men/women = 16/41, age: mean ± SD = 22.6 ± 2.8 yrs., BMI: mean ± SD = 22.1 ± 2.8 kg/m²) and in these participants actual eating rate was measured for three food products. Using analysis of variance the association between SRER and actual eating rate was studied. The association between SRER and BMI was investigated in cross-sectional data from the NQplus cohort (i.e., 1473 Dutch adults; men/women = 741/732, age: mean ± SD = 54.6 ± 11.7 yrs., BMI: mean ± SD = 25.9 ± 4.0 kg/m²) using (multiple) linear regression analysis. In the validation study actual eating rate increased proportionally with SRER (for all three food products P < 0.01). In the cross-sectional study SRER was positively associated with BMI in both men and women (P = 0.03 and P < 0.001, respectively). Self-reported fast-eating women had a 1.13 kg/m² (95% CI 0.43, 1.84) higher BMI compared to average-speed-eating women, after adjusting for confounders. This was not the case in men; self-reported fast-eating men had a 0.29 kg/m² (95% CI -0.22, 0.80) higher BMI compared to average-speed-eating men, after adjusting for confounders. These studies show that self-reported eating rate reflects actual eating rate on a group-level, and that a high self-reported eating rate is associated with a higher BMI in this Dutch population.
Joyce, Christopher; Burnett, Angus; Ball, Kevin
2010-09-01
It is believed that increasing the X-factor (movement of the shoulders relative to the hips) during the golf swing can increase ball velocity at impact. Increasing the X-factor may also increase the risk of low back pain. The aim of this study was to provide recommendations for the three-dimensional (3D) measurement of the X-factor and lower trunk movement during the golf swing. This three-part validation study involved (1) developing and validating models and related algorithms, (2) comparing 3D data obtained during static positions representative of the golf swing to visual estimates, and (3) comparing 3D data obtained during dynamic golf swings to images gained from high-speed video. Of particular interest were issues related to sequence dependency. After models and algorithms were validated, results from parts two and three of the study supported the conclusion that a lateral bending/flexion-extension/axial rotation (ZYX) order of rotation was the most suitable Cardanic sequence to use in the assessment of the X-factor and lower trunk movement in the golf swing. The findings of this study have relevance for further research examining the X-factor, its relationship to club head speed, and lower trunk movement and low back pain in golf.
Ojiambo, Robert; Konstabel, Kenn; Veidebaum, Toomas; Reilly, John; Verbestel, Vera; Huybrechts, Inge; Sioen, Isabelle; Casajús, José A; Moreno, Luis A; Vicente-Rodriguez, German; Bammann, Karin; Tubic, Bojan M; Marild, Staffan; Westerterp, Klaas; Pitsiladis, Yannis P
2012-11-01
One of the aims of the Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants (IDEFICS) validation study is to validate field measures of physical activity (PA) and energy expenditure (EE) in young children. This study compared the validity of uniaxial accelerometry with heart-rate (HR) monitoring vs. triaxial accelerometry against the doubly labeled water (DLW) criterion method for assessment of free-living EE in young children. Forty-nine European children (25 female, 24 male) aged 4-10 yr (mean age: 6.9 ± 1.5 yr) were assessed by uniaxial ActiTrainer with HR, uniaxial 3DNX, and triaxial 3DNX accelerometry. Total energy expenditure (TEE) was estimated using DLW over a 1-wk period. The longitudinal axis of both devices and triaxial 3DNX counts per minute (CPM) were significantly (P < 0.05) associated with physical activity level (PAL; r = 0.51 ActiTrainer, r = 0.49 uniaxial-3DNX, and r = 0.42 triaxial Σ3DNX). Eighty-six percent of the variance in TEE could be predicted by a model combining body mass (partial r² = 71%; P < 0.05), CPM-ActiTrainer (partial r² = 11%; P < 0.05), and difference between HR at moderate and sedentary activities (ModHR - SedHR) (partial r² = 4%; P < 0.05). The SE of TEE estimate for ActiTrainer and 3DNX models ranged from 0.44 to 0.74 MJ/day or ∼7-11% of the average TEE. The SE of activity-induced energy expenditure (AEE) model estimates ranged from 0.38 to 0.57 MJ/day or 24-26% of the average AEE. It is concluded that the comparative validity of hip-mounted uniaxial and triaxial accelerometers for assessing PA and EE is similar.
Virtual reality simulator training for laparoscopic colectomy: what metrics have construct validity?
Shanmugan, Skandan; Leblanc, Fabien; Senagore, Anthony J; Ellis, C Neal; Stein, Sharon L; Khan, Sadaf; Delaney, Conor P; Champagne, Bradley J
2014-02-01
Virtual reality simulation for laparoscopic colectomy has been used for training of surgical residents and has been considered as a model for technical skills assessment of board-eligible colorectal surgeons. However, construct validity (the ability to distinguish between skill levels) must be confirmed before widespread implementation. This study was designed to specifically determine which metrics for laparoscopic sigmoid colectomy have evidence of construct validity. General surgeons who had performed fewer than 30 laparoscopic colon resections and laparoscopic colorectal experts (>200 laparoscopic colon resections) performed laparoscopic sigmoid colectomy on the LAP Mentor model. All participants received a 15-minute instructional warm-up and had never used the simulator before the study. Performance was then compared between each group for 21 metrics (procedural, 14; intraoperative errors, 7) to determine specifically which measurements demonstrate construct validity. Performance was compared with the Mann-Whitney U-test (p < 0.05 was significant). Fifty-three surgeons (29 general surgeons and 24 colorectal surgeons) enrolled in the study. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 of 14 procedural metrics by distinguishing levels of surgical experience (p < 0.05). The most discriminatory procedural metrics (p < 0.01) favoring experts were reduced instrument path length, accuracy of the peritoneal/medial mobilization, and dissection of the inferior mesenteric artery. Intraoperative errors were not discriminatory for most metrics and favored general surgeons for colonic wall injury (general surgeons, 0.7; colorectal surgeons, 3.5; p = 0.045). Individual variability within the general surgeon and colorectal surgeon groups was not accounted for. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 procedure-specific metrics. However, using virtual reality simulator metrics to detect intraoperative errors did not discriminate between groups. If the virtual reality simulator continues to be used for the technical assessment of trainees and board-eligible surgeons, the evaluation of performance should be limited to procedural metrics.
Validation of the Urdu version of the Epworth Sleepiness Scale.
Surani, Asif Anwar; Ramar, Kannan; Surani, Arif Anwar; Khaliqdina, Jehangir Shehryar; Subramanian, Shyam; Surani, Salim
2012-09-01
To translate and validate the Epworth Sleepiness Scale (ESS) for use in the Urdu-speaking population. The original Epworth Sleepiness Scale was translated into the Urdu version (ESS-Ur) in three phases: translation and back-translation; committee-based translation; and testing in bilingual individuals. The final version was subsequently tested on 89 healthy bilingual subjects between February and April, 2010, to assess the validity of the translation compared to the original version. The subjects were students and employees of Dow University of Health Sciences, Karachi. Both English and Urdu versions of the Epworth Sleepiness Scale were administered to 59 (67%) women and 30 (33%) men. The mean composite Epworth score was 7.53 in the English version and 7.7 in the Urdu version (p=0.76). The translated version was found to be highly correlated with the original scale (rho=0.938; p<.01). The study validated the scale's Urdu version as an effective tool for measuring daytime sleepiness in the Urdu-speaking population. Future studies assessing the validity of the scale in patients with sleep disorders need to be undertaken.
Validation of the Gratitude Questionnaire in Filipino Secondary School Students.
Valdez, Jana Patricia M; Yang, Weipeng; Datu, Jesus Alfonso D
2017-10-11
Most studies have assessed the psychometric properties of the Gratitude Questionnaire - Six-Item Form (GQ-6) in Western contexts, while very little research has explored the applicability of this scale in non-Western settings. To address this gap, the aim of the study was to examine the factorial validity and gender invariance of the Gratitude Questionnaire in the Philippines through a construct validation approach. There were 383 Filipino high school students who participated in the research. In terms of within-network construct validity, results of confirmatory factor analyses revealed that the five-item version of the questionnaire (GQ-5) had better fit compared to the original six-item version of the gratitude questionnaire. The scores from the GQ-5 also exhibited invariance across gender. Between-network construct validation showed that gratitude was associated with higher levels of academic achievement (β = .46, p <.001), autonomous motivation (β = .73, p <.001), and controlled motivation (β = .28, p <.01). Conversely, gratitude was linked to a lower degree of amotivation (β = -.51, p <.001). Theoretical and practical implications are discussed.
Sfendla, Anis; Zouini, Btissame; Lemrani, Dina; Berman, Anne H; Senhaji, Meftaha; Kerekes, Nóra
2017-04-01
The study aimed to validate the Arabic version of the Drug Use Disorders Identification Test (DUDIT) by (1) assessing its factor structure, (2) determining structural validity, (3) evaluating item-total and inter-item correlation, and (4) assessing its predictive validity. The study population included 169 prison inmates, 51 patients with a clinical diagnosis of substance use disorder, and 53 students (N = 273). All participants completed the self-report version of the Arabic DUDIT. After exploratory factor analysis, internal consistency of the Arabic DUDIT was determined and external validation was performed. Principal factor analysis showed that the Arabic DUDIT exhibited only one factor, which explained 66.9% of the variance. Reliability based on Cronbach's alpha was .95. When compared to the DSM-IV substance use disorder diagnosis in a clinical sample, DUDIT had an area under the curve (AUC) of .98, with a sensitivity of .98 and a specificity of .90. The Arabic version of DUDIT is a valid and reliable tool for screening for drug use in Arabic-speaking countries.
Seo, Hyun-Ju; Kim, Soo Young; Lee, Yoon Jae; Jang, Bo-Hyoung; Park, Ji-Eun; Sheen, Seung-Soo; Hahn, Seo Kyung
2016-02-01
To develop a study Design Algorithm for Medical Literature on Intervention (DAMI) and test its interrater reliability, construct validity, and ease of use. We developed and then revised the DAMI to include detailed instructions. To test the DAMI's reliability, we used a purposive sample of 134 primary, mainly nonrandomized studies. We then compared the study designs as classified by the original authors and through the DAMI. Unweighted kappa statistics were computed to test interrater reliability and construct validity based on the level of agreement between the original and DAMI classifications. Assessment time was also recorded to evaluate ease of use. The DAMI includes 13 study designs, including experimental and observational studies of interventions and exposure. Both the interrater reliability (unweighted kappa = 0.67; 95% CI [0.64-0.75]) and construct validity (unweighted kappa = 0.63, 95% CI [0.52-0.67]) were substantial. Mean classification time using the DAMI was 4.08 ± 2.44 minutes (range, 0.51-10.92). The DAMI showed substantial interrater reliability and construct validity. Furthermore, given its ease of use, it could be used to accurately classify medical literature for systematic reviews of interventions while minimizing disagreement between authors of such reviews.
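The unweighted kappa statistic used above measures agreement between two sets of categorical classifications after correcting for agreement expected by chance. The following is a minimal sketch of the standard Cohen's kappa formula; the function name `cohen_kappa` and the example labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from the marginal frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Counter returns 0 for categories that one rater never used.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical example: study designs as labelled by the original authors
# versus the designs assigned by a classification algorithm.
author_labels = ["RCT", "RCT", "cohort", "cohort"]
algorithm_labels = ["RCT", "cohort", "cohort", "cohort"]
kappa = cohen_kappa(author_labels, algorithm_labels)
```

Here three of four classifications agree (observed agreement 0.75) and chance agreement is 0.5, so kappa is 0.5.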
Gorgulho, B M; Pot, G K; Marchioni, D M
2017-05-01
The aim of this study was to evaluate the validity and reliability of the Main Meal Quality Index when applied to the UK population. The indicator was developed to assess meal quality in different populations, and is composed of 10 components: fruit, vegetables (excluding potatoes), ratio of animal protein to total protein, fiber, carbohydrate, total fat, saturated fat, processed meat, sugary beverages and desserts, and energy density, resulting in a score range of 0-100 points. The performance of the indicator was measured using strategies for assessing content validity, construct validity, discriminant validity and reliability, including principal component analysis, linear regression models and Cronbach's alpha. The indicator presented good reliability. The Main Meal Quality Index has been shown to be valid for use as an instrument to evaluate, monitor and compare the quality of meals consumed by adults in the United Kingdom.
Lonsdale, Chris; Hodge, Ken; Rose, Elaine A
2008-06-01
The purpose of the four studies described in this article was to develop and test a new measure of competitive sport participants' intrinsic motivation, extrinsic motivation, and amotivation (self-determination theory; Deci & Ryan, 1985). The items for the new measure, named the Behavioral Regulation in Sport Questionnaire (BRSQ), were constructed using interviews, expert review, and pilot testing. Analyses supported the internal consistency, test-retest reliability, and factorial validity of the BRSQ scores. Nomological validity evidence was also supportive, as BRSQ subscale scores were correlated in the expected pattern with scores derived from measures of motivational consequences. When directly compared with scores derived from the Sport Motivation Scale (SMS; Pelletier, Fortier, Vallerand, Tuson, & Blais, 1995) and a revised version of that questionnaire (SMS-6; Mallett, Kawabata, Newcombe, Otero-Forero, & Jackson, 2007), BRSQ scores demonstrated equal or superior reliability and factorial validity as well as better nomological validity.
PSI-Center Simulations of Validation Platform Experiments
NASA Astrophysics Data System (ADS)
Nelson, B. A.; Akcay, C.; Glasser, A. H.; Hansen, C. J.; Jarboe, T. R.; Marklin, G. J.; Milroy, R. D.; Morgan, K. D.; Norgaard, P. C.; Shumlak, U.; Victor, B. S.; Sovinec, C. R.; O'Bryan, J. B.; Held, E. D.; Ji, J.-Y.; Lukin, V. S.
2013-10-01
The Plasma Science and Innovation Center (PSI-Center - http://www.psicenter.org) supports collaborating validation platform experiments with extended MHD simulations. Collaborators include the Bellan Plasma Group (Caltech), CTH (Auburn U), FRX-L (Los Alamos National Laboratory), HIT-SI (U Wash - UW), LTX (PPPL), MAST (Culham), Pegasus (U Wisc-Madison), PHD/ELF (UW/MSNW), SSX (Swarthmore College), TCSU (UW), and ZaP/ZaP-HD (UW). Modifications have been made to the NIMROD, HiFi, and PSI-Tet codes to specifically model these experiments, including mesh generation/refinement, non-local closures, appropriate boundary conditions (external fields, insulating BCs, etc.), and kinetic and neutral particle interactions. The PSI-Center is exploring the application of validation metrics between experimental data and simulation results. Biorthogonal decomposition is proving to be a powerful method to compare global temporal and spatial structures for validation. Results from these simulation and validation studies, as well as an overview of the PSI-Center status, will be presented.
Baldo, Matías N; Angeli, Emmanuel; Gareis, Natalia C; Hunzicker, Gabriel A; Murguía, Marcelo C; Ortega, Hugo H; Hein, Gustavo J
2018-04-01
A relative bioavailability study (RBA) of two phenytoin (PHT) formulations was conducted in rabbits, in order to compare the results obtained from different matrices (plasma and blood from dried blood spot (DBS) sampling) and different experimental designs (classic and block). The method was developed by liquid chromatography tandem-mass spectrometry (LC-MS/MS) in plasma and blood samples. The different sample preparation techniques, plasma protein precipitation and DBS, were validated according to international requirements. The analytical method was validated with ranges 0.20-50.80 and 0.12-20.32 µg ml⁻¹, r > 0.999 for plasma and blood, respectively. Accuracy and precision were within acceptance criteria for bioanalytical assay validation (< 15 for bias and CV% and < 20 for limit of quantification (LOQ)). PHT showed long-term stability, both for plasma and blood, and under refrigerated and room temperature conditions. Haematocrit values were measured during the validation process and RBA study. Finally, the pharmacokinetic parameters (Cmax, Tmax and AUC0-t) obtained from the RBA study were tested. Results were highly comparable for matrices and experimental designs. A matrix correlation higher than 0.975 and a ratio of (PHT blood) = 1.158 (PHT plasma) were obtained. The results obtained herein show that the use of classic experimental design and DBS sampling for animal pharmacokinetic studies should be encouraged as they could help to prevent the use of a large number of animals and also animal euthanasia. Finally, the combination of DBS sampling with LC-MS/MS technology proved to be an excellent tool not only for therapeutic drug monitoring but also for RBA studies.
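The pharmacokinetic parameters named above (Cmax, Tmax and AUC0-t) are commonly derived from a concentration-time profile by non-compartmental analysis. The sketch below uses the linear trapezoidal rule for AUC0-t, which is an assumption: the abstract does not state how the authors computed these values. The function name `pk_summary` and the sample profile are hypothetical.

```python
def pk_summary(times, concentrations):
    """Return (Cmax, Tmax, AUC0-t) from a sampled concentration-time profile.

    times: sampling times in ascending order (e.g. hours)
    concentrations: measured concentrations at those times (e.g. µg/ml)
    AUC0-t is computed with the linear trapezoidal rule up to the last sample.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two paired time/concentration samples")
    cmax = max(concentrations)
    # Time of the first occurrence of the maximum concentration.
    tmax = times[concentrations.index(cmax)]
    auc = sum(
        (t2 - t1) * (c1 + c2) / 2
        for t1, t2, c1, c2 in zip(
            times, times[1:], concentrations, concentrations[1:]
        )
    )
    return cmax, tmax, auc


# Hypothetical profile: times in hours, concentrations in µg/ml.
cmax, tmax, auc = pk_summary([0, 1, 2, 4], [0.0, 4.0, 2.0, 1.0])
```

For this toy profile the three trapezoids contribute 2, 3 and 3, giving an AUC0-t of 8 µg·h/ml with Cmax 4 µg/ml at Tmax 1 h.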
Automated Essay Scoring versus Human Scoring: A Comparative Study
ERIC Educational Resources Information Center
Wang, Jinhao; Brown, Michelle Stallone
2007-01-01
The current research was conducted to investigate the validity of automated essay scoring (AES) by comparing group mean scores assigned by an AES tool, IntelliMetric [TM] and human raters. Data collection included administering the Texas version of the WriterPlacer "Plus" test and obtaining scores assigned by IntelliMetric [TM] and by…
Comparing Indicators of Sexual Sadism as Predictors of Recidivism among Adult Male Sexual Offenders
ERIC Educational Resources Information Center
Kingston, Drew A.; Seto, Michael C.; Firestone, Philip; Bradford, John M.
2010-01-01
Objective: In this longitudinal study, the predictive validity of a psychiatric diagnosis of sexual sadism was compared with three behavioral indicators of sadism: index sexual offense violence, sexual intrusiveness, and phallometrically assessed sexual arousal to depictions of sexual or nonsexual violence. Method: Five hundred and eighty six…
Comparability and Reliability Considerations of Adequate Yearly Progress
ERIC Educational Resources Information Center
Maier, Kimberly S.; Maiti, Tapabrata; Dass, Sarat C.; Lim, Chae Young
2012-01-01
The purpose of this study is to develop an estimate of Adequate Yearly Progress (AYP) that will allow for reliable and valid comparisons among student subgroups, schools, and districts. A shrinkage-type estimator of AYP using the Bayesian framework is described. Using simulated data, the performance of the Bayes estimator will be compared to…
Yen, Po-Yin; Sousa, Karen H; Bakken, Suzanne
2014-01-01
Background In a previous study, we developed the Health Information Technology Usability Evaluation Scale (Health-ITUES), which is designed to support customization at the item level. Such customization matches the specific tasks/expectations of a health IT system while retaining comparability at the construct level, and provides evidence of its factorial validity and internal consistency reliability through exploratory factor analysis. Objective In this study, we advanced the development of Health-ITUES to examine its construct validity and predictive validity. Methods The health IT system studied was a web-based communication system that supported nurse staffing and scheduling. Using Health-ITUES, we conducted a cross-sectional study to evaluate users’ perception toward the web-based communication system after system implementation. We examined Health-ITUES's construct validity through first and second order confirmatory factor analysis (CFA), and its predictive validity via structural equation modeling (SEM). Results The sample comprised 541 staff nurses in two healthcare organizations. The CFA (n=165) showed that a general usability factor accounted for 78.1%, 93.4%, 51.0%, and 39.9% of the explained variance in ‘Quality of Work Life’, ‘Perceived Usefulness’, ‘Perceived Ease of Use’, and ‘User Control’, respectively. The SEM (n=541) supported the predictive validity of Health-ITUES, explaining 64% of the variance in intention for system use. Conclusions The results of CFA and SEM provide additional evidence for the construct and predictive validity of Health-ITUES. The customizability of Health-ITUES has the potential to support comparisons at the construct level, while allowing variation at the item level. We also illustrate application of Health-ITUES across stages of system development. PMID:24567081
Self-management in chronic conditions: partners in health scale instrument validation.
Peñarrieta-de Córdova, Isabel; Barrios, Flores Florabel; Gutierrez-Gomes, Tranquilina; Piñonez-Martinez, Ma del Socorro; Quintero-Valle, Luz Maria; Castañeda-Hidalgo, Hortensia
2014-03-01
This article describes a study that aimed to validate the Self-care in Chronic Conditions Partners in Health Scale instrument in the Mexican population. The instrument has been validated in Australia for use as a screening tool by primary healthcare professionals to assess the self-care skills and abilities of people with a chronic illness. Validation was conducted using baseline data for 552 people with diabetes, hypertension and cancer aged 18 or older who were users of healthcare centres in Tampico, Tamaulipas, Mexico. Results show high reliability and validity of the instrument and three themes were identified: knowledge, adherence, and dealing with and managing side effects. The findings suggest the scale is useful as a generic self-rated clinical tool for assessing self-management in a range of chronic conditions, and provides an outcome measure for comparing populations and change in patient self-management knowledge and behaviour. The authors recommend validating the scale in other Latin-American settings with more research into the effect of gender on self-management.
A call for change: clinical evaluation of student registered nurse anesthetists.
Collins, Shawn; Callahan, Margaret Faut
2014-02-01
The ability to integrate theory with practice is integral to a student's success. A common reason for attrition from a nurse anesthesia program is clinical issues. To document clinical competence, students are evaluated using various tools. For use of a clinical evaluation tool as possible evidence for a student's dismissal, an important psychometric property to ensure is instrument validity. Clinical evaluation instruments of nurse anesthesia programs are not standardized among programs, which suggests a lack of instrument validity. The lack of established validity of the instruments used to evaluate students' clinical progress brings into question their ability to detect a student who is truly in jeopardy of attrition. Given this possibility, clinical instrument validity warrants research to be fair to students and improve attrition rates based on valid data. This ex post facto study evaluated a 17-item clinical instrument tool to demonstrate the need for validity of clinical evaluation tools. It also compared clinical scores with scores on the National Certification Examination.
Husbands, Adrian; Mathieson, Alistair; Dowell, Jonathan; Cleland, Jennifer; MacKenzie, Rhoda
2014-04-23
The UK Clinical Aptitude Test (UKCAT) was designed to address issues identified with traditional methods of selection. This study aims to examine the predictive validity of the UKCAT and compare this to traditional selection methods in the senior years of medical school. This was a follow-up study of two cohorts of students from two medical schools who had previously taken part in a study examining the predictive validity of the UKCAT in first year. The sample consisted of 4th and 5th Year students who commenced their studies at the University of Aberdeen or University of Dundee medical schools in 2007. Data collected were: demographics (gender and age group), UKCAT scores; Universities and Colleges Admissions Service (UCAS) form scores; admission interview scores; Year 4 and 5 degree examination scores. Pearson's correlations were used to examine the relationships between admissions variables, examination scores, gender and age group, and to select variables for multiple linear regression analysis to predict examination scores. Ninety-nine and 89 students at Aberdeen medical school from Years 4 and 5 respectively, and 51 Year 4 students in Dundee, were included in the analysis. Neither UCAS form nor interview scores were statistically significant predictors of examination performance. Conversely, the UKCAT yielded statistically significant validity coefficients between .24 and .36 in four of five assessments investigated. Multiple regression analysis showed the UKCAT made a statistically significant unique contribution to variance in examination performance in the senior years. Results suggest the UKCAT appears to predict performance better in the later years of medical school compared to earlier years and provides modest supportive evidence for the UKCAT's role in student selection within these institutions. Further research is needed to assess the predictive validity of the UKCAT against professional and behavioural outcomes as the cohort commences working life.
2014-01-01
Background The UK Clinical Aptitude Test (UKCAT) was designed to address issues identified with traditional methods of selection. This study aims to examine the predictive validity of the UKCAT and compare this to traditional selection methods in the senior years of medical school. This was a follow-up study of two cohorts of students from two medical schools who had previously taken part in a study examining the predictive validity of the UKCAT in first year. Methods The sample consisted of 4th and 5th Year students who commenced their studies at the University of Aberdeen or University of Dundee medical schools in 2007. Data collected were: demographics (gender and age group), UKCAT scores; Universities and Colleges Admissions Service (UCAS) form scores; admission interview scores; Year 4 and 5 degree examination scores. Pearson’s correlations were used to examine the relationships between admissions variables, examination scores, gender and age group, and to select variables for multiple linear regression analysis to predict examination scores. Results Ninety-nine and 89 students at Aberdeen medical school from Years 4 and 5 respectively, and 51 Year 4 students in Dundee, were included in the analysis. Neither UCAS form nor interview scores were statistically significant predictors of examination performance. Conversely, the UKCAT yielded statistically significant validity coefficients between .24 and .36 in four of five assessments investigated. Multiple regression analysis showed the UKCAT made a statistically significant unique contribution to variance in examination performance in the senior years. Conclusions Results suggest the UKCAT appears to predict performance better in the later years of medical school compared to earlier years and provides modest supportive evidence for the UKCAT’s role in student selection within these institutions. 
Further research is needed to assess the predictive validity of the UKCAT against professional and behavioural outcomes as the cohort commences working life. PMID:24762134