A statistical approach to selecting and confirming validation targets in -omics experiments
2012-01-01
Background Genomic technologies are, by their very nature, designed for hypothesis generation. In some cases, the hypotheses that are generated require that genome scientists confirm findings about specific genes or proteins. But one major advantage of high-throughput technology is that global genetic, genomic, transcriptomic, and proteomic behaviors can be observed. Manual confirmation of every statistically significant genomic result is prohibitively expensive. This has led researchers in genomics to adopt the strategy of confirming only a handful of the most statistically significant results, a small subset chosen for biological interest, or a small random subset. But there is no standard approach for selecting and quantitatively evaluating validation targets. Results Here we present a new statistical method and approach for statistically validating lists of significant results based on confirming only a small random sample. We apply our statistical method to show that the usual practice of confirming only the most statistically significant results does not statistically validate result lists. We analyze an extensively validated RNA-sequencing experiment to show that confirming a random subset can statistically validate entire lists of significant results. Finally, we analyze multiple publicly available microarray experiments to show that statistically validating random samples can both (i) provide evidence to confirm long gene lists and (ii) save thousands of dollars and hundreds of hours of labor over manual validation of each significant result. Conclusions For high-throughput -omics studies, statistical validation is a cost-effective and statistically valid approach to confirming lists of significant results. PMID:22738145
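A hedged sketch of the sample-size logic this abstract describes: if k randomly chosen significant results are all confirmed, one can reject, at the chosen confidence level, the hypothesis that the true-positive fraction of the list is below some threshold. The function name and defaults are illustrative, not from the paper:

```python
import math

def targets_to_confirm(p_min, conf=0.95):
    """Smallest random sample k of validation targets such that, if all k
    confirm, we reject 'true-positive fraction <= p_min' at level 1 - conf,
    since under that hypothesis P(all k confirm) <= p_min**k."""
    return math.ceil(math.log(1 - conf) / math.log(p_min))

# confirming 29 randomly chosen targets (all successes) supports the claim
# that more than 90% of the significant list is real, at 95% confidence
print(targets_to_confirm(0.90))  # -> 29
```

This is why a small random subset can validate a long list, whereas confirming only the top-ranked results says nothing about the rest of the list.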
Validation of Statistical Sampling Algorithms in Visual Sample Plan (VSP): Summary Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nuffer, Lisa L; Sego, Landon H.; Wilson, John E.
2009-02-18
The U.S. Department of Homeland Security, Office of Technology Development (OTD) contracted with a set of U.S. Department of Energy national laboratories, including the Pacific Northwest National Laboratory (PNNL), to write a Remediation Guidance for Major Airports After a Chemical Attack. The report identifies key activities and issues that should be considered by a typical major airport following an incident involving release of a toxic chemical agent. Four experimental tasks were identified that would require further research in order to supplement the Remediation Guidance. One of the tasks, Task 4, OTD Chemical Remediation Statistical Sampling Design Validation, dealt with statistical sampling algorithm validation. This report documents the results of the sampling design validation conducted for Task 4. In 2005, the Government Accountability Office (GAO) performed a review of the past U.S. responses to Anthrax terrorist cases. Part of the motivation for this PNNL report was a major GAO finding that there was a lack of validated sampling strategies in the U.S. response to Anthrax cases. The report (GAO 2005) recommended that probability-based methods be used for sampling design in order to address confidence in the results, particularly when all sample results showed no remaining contamination. The GAO also expressed a desire that the methods be validated, which is the main purpose of this PNNL report. The objective of this study was to validate probability-based statistical sampling designs and the algorithms pertinent to within-building sampling that allow the user to prescribe or evaluate confidence levels of conclusions based on data collected as guided by the statistical sampling designs. Specifically, the designs found in the Visual Sample Plan (VSP) software were evaluated. VSP was used to calculate the number of samples and the sample location for a variety of sampling plans applied to an actual release site.
Most of the sampling designs validated are probability based, meaning samples are located randomly (or on a randomly placed grid) so no bias enters into the placement of samples, and the number of samples is calculated such that IF the amount and spatial extent of contamination exceeds levels of concern, at least one of the samples would be taken from a contaminated area, at least X% of the time. Hence, "validation" of the statistical sampling algorithms is defined herein to mean ensuring that the "X%" (confidence) is actually met.
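The "at least one hit with X% confidence" calculation described above can be sketched in a few lines; this is an illustrative reading of the design logic, not the actual VSP algorithm:

```python
import math
import random

def n_random_samples(confidence, frac_contaminated):
    """Number of randomly placed samples so that, IF at least
    frac_contaminated of the site is contaminated, at least one sample
    lands in the contaminated area with probability >= confidence.
    Solves (1 - frac)^n <= 1 - confidence for the smallest integer n."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - frac_contaminated))

n = n_random_samples(0.95, 0.01)
print(n)  # -> 299

# "validation" in the report's sense: check empirically that the X% is met
random.seed(1)
trials = 20_000
hits = sum(any(random.random() < 0.01 for _ in range(n)) for _ in range(trials))
print(round(hits / trials, 3))
```

The Monte Carlo check plays the role the report assigns to validation: confirming that the prescribed confidence level is actually achieved by the design.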
On the analysis of very small samples of Gaussian repeated measurements: an alternative approach.
Westgate, Philip M; Burchett, Woodrow W
2017-03-15
The analysis of very small samples of Gaussian repeated measurements can be challenging. First, due to a very small number of independent subjects contributing outcomes over time, statistical power can be quite small. Second, nuisance covariance parameters must be appropriately accounted for in the analysis in order to maintain the nominal test size. However, available statistical strategies that ensure valid statistical inference may lack power, whereas more powerful methods may have the potential for inflated test sizes. Therefore, we explore an alternative approach to the analysis of very small samples of Gaussian repeated measurements, with the goal of maintaining valid inference while also improving statistical power relative to other valid methods. This approach uses generalized estimating equations with a bias-corrected empirical covariance matrix that accounts for all small-sample aspects of nuisance correlation parameter estimation in order to maintain valid inference. Furthermore, the approach utilizes correlation selection strategies with the goal of choosing the working structure that will result in the greatest power. In our study, we show that when accurate modeling of the nuisance correlation structure impacts the efficiency of regression parameter estimation, this method can improve power relative to existing methods that yield valid inference. Copyright © 2017 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Idris, Khairiani; Yang, Kai-Lin
2017-01-01
This article reports the results of a mixed-methods approach to develop and validate an instrument to measure Indonesian pre-service teachers' conceptions of statistics. First, a phenomenographic study involving a sample of 44 participants uncovered six categories of conceptions of statistics. Second, an instrument of conceptions of statistics was…
ERIC Educational Resources Information Center
O'Bryant, Monique J.
2017-01-01
The aim of this study was to validate an instrument that can be used by instructors or social scientist who are interested in evaluating statistics anxiety. The psychometric properties of the English version of the Statistical Anxiety Scale (SAS) was examined through a confirmatory factor analysis of scores from a sample of 323 undergraduate…
Pageler, Natalie M; Grazier G'Sell, Max Jacob; Chandler, Warren; Mailes, Emily; Yang, Christine; Longhurst, Christopher A
2016-09-01
The objective of this project was to use statistical techniques to determine the completeness and accuracy of data migrated during electronic health record conversion. Data validation during migration consists of mapped record testing and validation of a sample of the data for completeness and accuracy. We statistically determined a randomized sample size for each data type based on the desired confidence level and error limits. The only error identified in the post go-live period was a failure to migrate some clinical notes, which was unrelated to the validation process. No errors in the migrated data were found during the 12-month post-implementation period. Compared to the typical industry approach, we have demonstrated that a statistical approach to sampling size for data validation can ensure consistent confidence levels while maximizing efficiency of the validation process during a major electronic health record conversion. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
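The "sample size based on desired confidence level and error limits" step can be sketched with the standard normal-approximation formula for a proportion (worst case p = 0.5); this is a generic sketch, not the project's actual procedure:

```python
from math import ceil
from statistics import NormalDist

def validation_sample_size(conf=0.95, margin=0.05, p=0.5):
    """Records to sample per data type so the estimated error rate is
    within +/- margin of the truth at the given confidence level,
    via n = z^2 * p * (1 - p) / margin^2."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return ceil(z * z * p * (1 - p) / (margin * margin))

print(validation_sample_size())             # -> 385
print(validation_sample_size(margin=0.02))  # -> 2401
```

Tightening the error limit from 5% to 2% roughly six-folds the sample, which is why fixing the confidence level and margin per data type beats an arbitrary flat percentage of records.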
Austin, Peter C.; van Klaveren, David; Vergouwe, Yvonne; Nieboer, Daan; Lee, Douglas S.; Steyerberg, Ewout W.
2017-01-01
Objective Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods. Study Design and Setting We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random effects meta-analysis methods. I2 statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation. Results Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I2 statistics and prediction intervals for c-statistics. Conclusion This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods. PMID:27262237
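The pooling step in this abstract, combining hospital-specific c-statistics with random-effects meta-analysis and quantifying heterogeneity with I2, can be sketched with the DerSimonian-Laird estimator. The input values are hypothetical and this is not the authors' code:

```python
def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling.
    Returns (pooled estimate, between-study variance tau^2, I2 in percent)."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, tau2, i2

# hypothetical hospital-specific c-statistics and their sampling variances
pooled, tau2, i2 = pool_random_effects([0.70, 0.75, 0.80], [0.001, 0.001, 0.001])
print(round(pooled, 3), round(i2, 1))  # -> 0.75 60.0
```

In the study's design each hospital is left out once as a validation sample, and the spread of hospital-specific performance (tau2, I2, prediction interval width) is what quantifies geographic transportability.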
Multiple Versus Single Set Validation of Multivariate Models to Avoid Mistakes.
Harrington, Peter de Boves
2018-01-02
Validation of multivariate models is of current importance for a wide range of chemical applications. Although important, it is neglected. The common practice is to use a single external validation set for evaluation. This approach is deficient and may mislead investigators with results that are specific to the single validation set of data. In addition, no statistics are available regarding the precision of a derived figure of merit (FOM). A statistical approach using bootstrapped Latin partitions is advocated. This validation method makes an efficient use of the data because each object is used once for validation. The method was reviewed a decade earlier, but primarily for the optimization of chemometric models; this review presents the reasons it should be used for generalized statistical validation. Average FOMs with confidence intervals are reported and powerful, matched-sample statistics may be applied for comparing models and methods. Examples demonstrate the problems with single validation sets.
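The partition scheme can be sketched generically: each repetition splits the objects into disjoint folds so every object is validated exactly once, and the figure of merit is summarized over repetitions. This is one plausible reading of bootstrapped Latin partitions, with an invented toy FOM standing in for a real model evaluation:

```python
import random
from statistics import mean, stdev

def latin_partitions(n, k, rng):
    """One Latin partition set: a random permutation of 0..n-1 split into k
    disjoint validation folds, so every object is validated exactly once."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def repeated_partition_fom(evaluate, n, k=4, reps=50, seed=0):
    """Repeat the partitioning; return the mean FOM and its spread over
    repetitions, from which a confidence interval can be reported."""
    rng = random.Random(seed)
    foms = []
    for _ in range(reps):
        folds = latin_partitions(n, k, rng)
        foms.append(mean(evaluate(valid) for valid in folds))
    return mean(foms), stdev(foms)

# toy FOM: fraction of validation indices below n/2 (stands in for a model score)
m, s = repeated_partition_fom(lambda valid: mean(i < 50 for i in valid), n=100)
print(round(m, 2))  # -> 0.5
```

Because every object is used once for validation per repetition, the per-repetition FOMs are comparable, and matched-sample statistics across repetitions can compare two models, as the abstract advocates.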
Air Combat Training: Good Stick Index Validation. Final Report for Period 3 April 1978-1 April 1979.
ERIC Educational Resources Information Center
Moore, Samuel B.; And Others
A study was conducted to investigate and statistically validate a performance measuring system (the Good Stick Index) in the Tactical Air Command Combat Engagement Simulator I (TAC ACES I) Air Combat Maneuvering (ACM) training program. The study utilized a twelve-week sample of eighty-nine student pilots to statistically validate the Good Stick…
45 CFR 153.350 - Risk adjustment data validation standards.
Code of Federal Regulations, 2012 CFR
2012-10-01
... implementation of any risk adjustment software and ensure proper validation of a statistically valid sample of... respect to implementation of risk adjustment software or as a result of data validation conducted pursuant... implementation of risk adjustment software or data validation. ...
Determination of polarimetric parameters of honey by near-infrared transflectance spectroscopy.
García-Alvarez, M; Ceresuela, S; Huidobro, J F; Hermida, M; Rodríguez-Otero, J L
2002-01-30
NIR transflectance spectroscopy was used to determine polarimetric parameters (direct polarization, polarization after inversion, specific rotation in dry matter, and polarization due to nonmonosaccharides) and sucrose in honey. In total, 156 honey samples were collected during 1992 (45 samples), 1995 (56 samples), and 1996 (55 samples). Samples were analyzed by NIR spectroscopy and polarimetric methods. Calibration (118 samples) and validation (38 samples) sets were made up; honeys from the three years were included in both sets. Calibrations were performed by modified partial least-squares regression and scatter correction by standard normal variation and detrend methods. For direct polarization, polarization after inversion, specific rotation in dry matter, and polarization due to nonmonosaccharides, good statistics (bias, SEV, and R(2)) were obtained for the validation set, and no statistically (p = 0.05) significant differences were found between instrumental and polarimetric methods for these parameters. Statistical data for sucrose were not as good as those of the other parameters. Therefore, NIR spectroscopy is not an effective method for quantitative analysis of sucrose in these honey samples. However, NIR spectroscopy may be an acceptable method for semiquantitative evaluation of sucrose for honeys, such as those in our study, containing up to 3% of sucrose. Further work is necessary to validate the uncertainty at higher levels.
Measuring Microaggression and Organizational Climate Factors in Military Units
2011-04-01
i.e., items) to accurately assess what we intend for them to measure. To assess construct and convergent validity, the author assessed the statistical ... sample indicated both convergent and construct validity of the microaggression scale. Table 5 presents these statistics. Measuring Microaggressions ... models. As shown in Table 7, the measurement models had acceptable fit indices. That is, the Chi-square statistics were at their minimum; although the
Valid statistical inference methods for a case-control study with missing data.
Tian, Guo-Liang; Zhang, Chi; Jiang, Xuejun
2018-04-01
The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions exhibit a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion by the Wald test for testing independency under the two existing sampling distributions could be completely different (even contradictory) from the Wald test for testing the equality of the success probabilities in control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.
Beno, Sarah M; Stasiewicz, Matthew J; Andrus, Alexis D; Ralyea, Robert D; Kent, David J; Martin, Nicole H; Wiedmann, Martin; Boor, Kathryn J
2016-12-01
Pathogen environmental monitoring programs (EMPs) are essential for food processing facilities of all sizes that produce ready-to-eat food products exposed to the processing environment. We developed, implemented, and evaluated EMPs targeting Listeria spp. and Salmonella in nine small cheese processing facilities, including seven farmstead facilities. Individual EMPs with monthly sample collection protocols were designed specifically for each facility. Salmonella was detected in only one facility, with likely introduction from the adjacent farm indicated by pulsed-field gel electrophoresis data. Listeria spp. were isolated from all nine facilities during routine sampling. The overall Listeria spp. (other than Listeria monocytogenes) and L. monocytogenes prevalences in the 4,430 environmental samples collected were 6.03 and 1.35%, respectively. Molecular characterization and subtyping data suggested persistence of a given Listeria spp. strain in seven facilities and persistence of L. monocytogenes in four facilities. To assess routine sampling plans, validation sampling for Listeria spp. was performed in seven facilities after at least 6 months of routine sampling. This validation sampling was performed by independent individuals and included collection of 50 to 150 samples per facility, based on statistical sample size calculations. Two of the facilities had a significantly higher frequency of detection of Listeria spp. during the validation sampling than during routine sampling, whereas two other facilities had significantly lower frequencies of detection. This study provides a model for a science- and statistics-based approach to developing and validating pathogen EMPs.
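The comparison of detection frequencies between routine and validation sampling can be sketched with a two-proportion z-test; the counts below are hypothetical and the paper does not specify its exact test:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# hypothetical: 6% positives in 500 routine swabs vs 14% in 100 validation swabs
z, p = two_proportion_z_test(30, 500, 14, 100)
print(round(p, 4))
```

With small facility-level counts an exact test (e.g. Fisher's) would usually be preferred; the z-test is shown only because it is compact and self-contained.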
Assessing Discriminative Performance at External Validation of Clinical Prediction Models
Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.
2016-01-01
Introduction External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. 
To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
45 CFR 153.350 - Risk adjustment data validation standards.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 45 Public Welfare 1 2013-10-01 2013-10-01 false Risk adjustment data validation standards. 153.350... validation standards. (a) General requirement. The State, or HHS on behalf of the State, must ensure proper implementation of any risk adjustment software and ensure proper validation of a statistically valid sample of...
45 CFR 153.350 - Risk adjustment data validation standards.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 45 Public Welfare 1 2014-10-01 2014-10-01 false Risk adjustment data validation standards. 153.350... validation standards. (a) General requirement. The State, or HHS on behalf of the State, must ensure proper implementation of any risk adjustment software and ensure proper validation of a statistically valid sample of...
2013-01-01
Background Relative validity (RV), a ratio of ANOVA F-statistics, is often used to compare the validity of patient-reported outcome (PRO) measures. We used the bootstrap to establish the statistical significance of the RV and to identify key factors affecting its significance. Methods Based on responses from 453 chronic kidney disease (CKD) patients to 16 CKD-specific and generic PRO measures, RVs were computed to determine how well each measure discriminated across clinically-defined groups of patients compared to the most discriminating (reference) measure. Statistical significance of RV was quantified by the 95% bootstrap confidence interval. Simulations examined the effects of sample size, denominator F-statistic, correlation between comparator and reference measures, and number of bootstrap replicates. Results The statistical significance of the RV increased as the magnitude of denominator F-statistic increased or as the correlation between comparator and reference measures increased. A denominator F-statistic of 57 conveyed sufficient power (80%) to detect an RV of 0.6 for two measures correlated at r = 0.7. Larger denominator F-statistics or higher correlations provided greater power. Larger sample size with a fixed denominator F-statistic or more bootstrap replicates (beyond 500) had minimal impact. Conclusions The bootstrap is valuable for establishing the statistical significance of RV estimates. A reasonably large denominator F-statistic (F > 57) is required for adequate power when using the RV to compare the validity of measures with small or moderate correlations (r < 0.7). Substantially greater power can be achieved when comparing measures of a very high correlation (r > 0.9). PMID:23721463
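The RV and its percentile-bootstrap confidence interval can be sketched generically. The simulated scores, group structure, and measure names below are invented for illustration; this is not the authors' implementation:

```python
import random
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of scores."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = 0.0
    for g in groups:
        mg = mean(g)
        ss_within += sum((x - mg) ** 2 for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def relative_validity(data, comparator, reference):
    """RV = F(comparator) / F(reference) across the same clinical groups.
    data: list of (group_label, {measure: score}) per patient."""
    def f_for(measure):
        labels = sorted({g for g, _ in data})
        return anova_f([[d[measure] for g, d in data if g == lab] for lab in labels])
    return f_for(comparator) / f_for(reference)

# simulate patients in 3 severity groups; 'ref' discriminates more strongly
rng = random.Random(7)
data = [(g, {"ref": g + rng.gauss(0, 1), "comp": 0.5 * g + rng.gauss(0, 1)})
        for g in (0, 1, 2) for _ in range(60)]
rv = relative_validity(data, "comp", "ref")

# percentile bootstrap over patients gives the 95% interval for the RV
boots = []
for _ in range(200):
    sample = [data[rng.randrange(len(data))] for _ in data]
    boots.append(relative_validity(sample, "comp", "ref"))
boots.sort()
lo, hi = boots[4], boots[194]
print(0 < lo < hi)
```

Resampling patients (not group labels) preserves the pairing between comparator and reference scores, which matters because, as the abstract reports, the correlation between the two measures drives the power of the comparison.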
14 CFR Sec. 19-7 - Passenger origin-destination survey.
Code of Federal Regulations, 2011 CFR
2011-01-01
... Transportation Statistics' Director of Airline Information. (c) A statistically valid sample of flight coupons... LAX Salt Lake City Northwest Operating Carrier Northwest Ticketed Carrier Fare Code Phoenix American...
Mayer, B; Muche, R
2013-01-01
Animal studies are highly relevant for basic medical research, although their usage is discussed controversially in public. Thus, an optimal sample size for these projects should be aimed at from a biometrical point of view. Statistical sample size calculation is usually the appropriate methodology in planning medical research projects. However, required information is often not valid or only available during the course of an animal experiment. This article critically discusses the validity of formal sample size calculation for animal studies. Within the discussion, some requirements are formulated to fundamentally regulate the process of sample size determination for animal experiments.
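The formal calculation the article scrutinizes can be sketched for the most common case, a two-group comparison of means, using the normal approximation. Names and defaults are illustrative:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sample comparison of means,
    where effect_size = expected difference / SD (normal approximation):
    n = 2 * (z_{1-alpha/2} + z_{power})^2 / effect_size^2."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / effect_size ** 2)

print(n_per_group(1.0))  # -> 16
print(n_per_group(0.5))  # -> 63
```

The sensitivity of n to the assumed effect size and SD is exactly the article's point: if those inputs are only guesses before the experiment, the resulting sample size inherits their uncertainty.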
Debray, Thomas P A; Vergouwe, Yvonne; Koffijberg, Hendrik; Nieboer, Daan; Steyerberg, Ewout W; Moons, Karel G M
2015-03-01
It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. The proposed framework enhances the interpretation of findings at external validation of prediction models. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Willis, Brian H; Riley, Richard D
2017-09-20
An important question for clinicians appraising a meta-analysis is: are the findings likely to be valid in their own practice-does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity-where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how we may test meta-analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
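The leave-one-out idea can be sketched generically: pool all studies except one and ask whether the left-out study is consistent with that pooled estimate. This conveys the flavor of the approach only; it is not the authors' Vn statistic or its distribution:

```python
from math import sqrt

def fixed_effect(estimates, variances):
    """Inverse-variance fixed-effect pooling; returns (pooled, pooled variance)."""
    w = [1 / v for v in variances]
    return sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w), 1 / sum(w)

def loo_z_scores(estimates, variances):
    """For each study, pool the remaining studies and standardize the
    difference between the left-out estimate and that pooled value."""
    zs = []
    for i in range(len(estimates)):
        rest_y = estimates[:i] + estimates[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        pooled, pooled_var = fixed_effect(rest_y, rest_v)
        zs.append((estimates[i] - pooled) / sqrt(variances[i] + pooled_var))
    return zs

# homogeneous toy data: no study should look surprising when left out
zs = loo_z_scores([0.30, 0.32, 0.28, 0.31], [0.01, 0.01, 0.01, 0.01])
print(all(abs(z) < 2 for z in zs))  # -> True
```

Large standardized discrepancies signal that the pooled estimate would not be valid for a new study drawn from the same population, which is the link the paper draws between statistical validity and homogeneity.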
Derivation and Applicability of Asymptotic Results for Multiple Subtests Person-Fit Statistics
Albers, Casper J.; Meijer, Rob R.; Tendeiro, Jorge N.
2016-01-01
In high-stakes testing, it is important to check the validity of individual test scores. Although a test may, in general, result in valid test scores for most test takers, for some test takers, test scores may not provide a good description of a test taker’s proficiency level. Person-fit statistics have been proposed to check the validity of individual test scores. In this study, the theoretical asymptotic sampling distribution of two person-fit statistics that can be used for tests that consist of multiple subtests is first discussed. Second, a simulation study was conducted to investigate the applicability of this asymptotic theory for tests of finite length, in which the correlation between subtests and number of items in the subtests was varied. The authors showed that these distributions provide reasonable approximations, even for tests consisting of subtests of only 10 items each. These results have practical value because researchers do not have to rely on extensive simulation studies to simulate sampling distributions. PMID:29881053
Ganna, Andrea; Lee, Donghwan; Ingelsson, Erik; Pawitan, Yudi
2015-07-01
It is common and advised practice in biomedical research to validate experimental or observational findings in a population different from the one where the findings were initially assessed. This practice increases the generalizability of the results and decreases the likelihood of reporting false-positive findings. Validation becomes critical when dealing with high-throughput experiments, where the large number of tests increases the chance to observe false-positive results. In this article, we review common approaches to determine statistical thresholds for validation and describe the factors influencing the proportion of significant findings from a 'training' sample that are replicated in a 'validation' sample. We refer to this proportion as rediscovery rate (RDR). In high-throughput studies, the RDR is a function of false-positive rate and power in both the training and validation samples. We illustrate the application of the RDR using simulated data and real data examples from metabolomics experiments. We further describe an online tool to calculate the RDR using t-statistics. We foresee two main applications. First, if the validation study has not yet been collected, the RDR can be used to decide the optimal combination between the proportion of findings taken to validation and the size of the validation study. Secondly, if a validation study has already been done, the RDR estimated using the training data can be compared with the observed RDR from the validation data; hence, the success of the validation study can be assessed. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
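The RDR can be sketched from the ingredients the abstract lists: expected replications among true and false training discoveries, divided by all training discoveries. This is a simplified reading of the article's definition, and all parameter values are hypothetical:

```python
def rediscovery_rate(m, pi1, alpha_t, power_t, alpha_v, power_v):
    """Expected proportion of training-significant findings that replicate.
    m tests; pi1 = fraction of truly non-null hypotheses; alpha/power are
    the per-test false-positive rate and power in training (t) and
    validation (v) samples."""
    true_pos = m * pi1 * power_t          # real effects found in training
    false_pos = m * (1 - pi1) * alpha_t   # noise findings in training
    replicated = true_pos * power_v + false_pos * alpha_v
    return replicated / (true_pos + false_pos)

# well-powered validation of a sparse signal
rdr = rediscovery_rate(m=10_000, pi1=0.01, alpha_t=1e-4, power_t=0.5,
                       alpha_v=0.05, power_v=0.8)
print(round(rdr, 3))  # -> 0.785
```

As the authors note, comparing such an expected RDR with the observed fraction replicated is one way to judge whether a validation study behaved as designed.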
Majumdar, Subhabrata; Basak, Subhash C
2018-04-26
Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has high value of p but small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out, K-fold, external and multi-split validation, using statistical models built using the LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
ERIC Educational Resources Information Center
Mattern, Krista D.; Patterson, Brian F.
2011-01-01
The College Board formed a research consortium with four-year colleges and universities to build a national higher education database with the primary goal of validating the SAT® for use in college admission. The first sample included first-time, first-year students entering college in fall 2006, with 110 institutions providing students'…
ERIC Educational Resources Information Center
Patterson, Brian F.; Mattern, Krista D.
2013-01-01
The continued accumulation of validity evidence for the core uses of educational assessments is critical to ensure that proper inferences will be made for those core purposes. To that end, the College Board has continued to follow previous cohorts of college students and this report provides updated validity evidence for using the SAT to predict…
40 CFR 761.130 - Sampling requirements.
Code of Federal Regulations, 2010 CFR
2010-07-01
... sampling scheme and the guidance document are available on EPA's PCB Web site at http://www.epa.gov/pcb, or... § 761.125(c) (2) through (4). Using its best engineering judgment, EPA may sample a statistically valid random or grid sampling technique, or both. When using engineering judgment or random “grab” samples, EPA...
40 CFR 761.130 - Sampling requirements.
Code of Federal Regulations, 2011 CFR
2011-07-01
... sampling scheme and the guidance document are available on EPA's PCB Web site at http://www.epa.gov/pcb, or... § 761.125(c) (2) through (4). Using its best engineering judgment, EPA may sample a statistically valid random or grid sampling technique, or both. When using engineering judgment or random “grab” samples, EPA...
Correcting for Optimistic Prediction in Small Data Sets
Smith, Gordon C. S.; Seaman, Shaun R.; Wood, Angela M.; Royston, Patrick; White, Ian R.
2014-01-01
The C statistic is a commonly reported measure of screening test performance. Optimistic estimation of the C statistic is a frequent problem because of overfitting of statistical models in small data sets, and methods exist to correct for this issue. However, many studies do not use such methods, and those that do correct for optimism use diverse methods, some of which are known to be biased. We used clinical data sets (United Kingdom Down syndrome screening data from Glasgow (1991–2003), Edinburgh (1999–2003), and Cambridge (1990–2006), as well as Scottish national pregnancy discharge data (2004–2007)) to evaluate different approaches to adjustment for optimism. We found that sample splitting, cross-validation without replication, and leave-1-out cross-validation produced optimism-adjusted estimates of the C statistic that were biased and/or associated with greater absolute error than other available methods. Cross-validation with replication, bootstrapping, and a new method (leave-pair-out cross-validation) all generated unbiased optimism-adjusted estimates of the C statistic and had similar absolute errors in the clinical data set. Larger simulation studies confirmed that all 3 methods performed similarly with 10 or more events per variable, or when the C statistic was 0.9 or greater. However, with lower events per variable or lower C statistics, bootstrapping tended to be optimistic but with lower absolute and mean squared errors than both methods of cross-validation. PMID:24966219
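The bootstrap optimism correction evaluated in this study can be sketched in a few lines. The data, model, and resample count here are illustrative assumptions, not the Down syndrome screening data.

```python
# Hedged sketch: Harrell-style bootstrap optimism correction for the
# C statistic (AUC). Simulated small data set where overfitting is likely.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n, p = 80, 10
X = rng.normal(size=(n, p))
y = (X[:, 0] + rng.normal(size=n) > 0).astype(int)

def fitted_auc(model, X_fit, y_fit, X_eval, y_eval):
    model.fit(X_fit, y_fit)
    return roc_auc_score(y_eval, model.predict_proba(X_eval)[:, 1])

model = LogisticRegression(max_iter=1000)
apparent = fitted_auc(model, X, y, X, y)      # optimistic: tested on itself

optimism = []
for _ in range(50):
    idx = rng.integers(0, n, n)               # bootstrap resample
    boot_auc = fitted_auc(model, X[idx], y[idx], X[idx], y[idx])
    test_auc = fitted_auc(model, X[idx], y[idx], X, y)
    optimism.append(boot_auc - test_auc)      # how much resampling flatters us

corrected = apparent - np.mean(optimism)
```

The corrected value sits below the apparent C statistic, quantifying the optimism the abstract warns about in small data sets.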
Development and Validation of the Caring Loneliness Scale.
Karhe, Liisa; Kaunonen, Marja; Koivisto, Anna-Maija
2016-12-01
The Caring Loneliness Scale (CARLOS) includes 5 categories derived from earlier qualitative research. This article assesses the reliability and construct validity of a scale designed to measure patient experiences of loneliness in a professional caring relationship. Statistical analysis with 4 different sample sizes included Cronbach's alpha and exploratory factor analysis with principal axis factoring extraction. The sample size of 250 gave the most useful and comprehensible structure, but all 4 samples captured the underlying content of loneliness experiences. The initial 5 categories were reduced to 4 factors with 24 items and Cronbach's alpha ranging from .77 to .90. The findings support the reliability and validity of CARLOS for the assessment of Finnish breast cancer and heart surgery patients' experiences, but, as with all instruments, further validation is needed.
Dwivedi, Jaya; Namdev, Kuldeep K; Chilkoti, Deepak C; Verma, Surajpal; Sharma, Swapnil
2018-06-06
Therapeutic drug monitoring (TDM) of anti-epileptic drugs provides a valid clinical tool for optimization of overall therapy. However, TDM is challenging due to the high storage and shipment costs of biological samples (plasma/blood) and the limited availability of laboratories providing TDM services. Sampling in the form of dry plasma spots (DPS) or dry blood spots (DBS) is a suitable alternative to overcome these issues. An improved, simple, rapid, and stability-indicating method for quantification of pregabalin in human plasma and DPS has been developed and validated. Analyses were performed on a liquid chromatography tandem mass spectrometer under positive ionization mode of the electrospray interface. Pregabalin-d4 was used as internal standard, and the chromatographic separations were performed on a Poroshell 120 EC-C18 column using an isocratic mobile phase at a flow rate of 1 mL/min. Stability of pregabalin in DPS was evaluated under simulated real-time conditions. Extraction procedures from plasma and DPS samples were compared using statistical tests. The method was validated according to the FDA method validation guideline. The method was linear over the concentration ranges of 20-16000 ng/mL and 100-10000 ng/mL in plasma and DPS, respectively. DPS samples were found stable for only one week upon storage at room temperature and for at least four weeks at freezing temperature (-20 ± 5 °C). The method was applied for quantification of pregabalin in over 600 samples from a clinical study. Statistical analyses revealed that the two extraction procedures, from plasma and DPS samples, showed no statistically significant difference and can be used interchangeably without bias. The proposed method involves simple and rapid sample-processing steps that do not require a pre- or post-column derivatization procedure. The method is suitable for routine pharmacokinetic analysis and therapeutic monitoring of pregabalin.
Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal
2017-01-01
Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module, and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase in individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.
Statistically Controlling for Confounding Constructs Is Harder than You Think
Westfall, Jacob; Yarkoni, Tal
2016-01-01
Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity. PMID:27031707
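The phenomenon the authors describe can be reproduced with a short Monte Carlo: when the focal construct is measured with error, "controlling" for its measure does not fully remove it, so a second correlated measure looks incrementally valid even when the construct-level null is true. All settings (n, reliability, simulation count) are illustrative.

```python
# Hedged sketch: construct-level Type I error inflation under measurement
# unreliability, in the spirit of the study's simulations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, n_sims, rejections = 200, 500, 0
for _ in range(n_sims):
    t = rng.normal(size=n)                   # true construct
    x1 = t + rng.normal(scale=0.7, size=n)   # unreliable measure of t
    x2 = t + rng.normal(scale=0.7, size=n)   # second measure of the SAME construct
    y = t + rng.normal(size=n)               # y depends on t only: no incremental validity

    # OLS of y on [1, x1, x2]; test the x2 coefficient
    X = np.column_stack([np.ones(n), x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 3)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[2, 2])
    p = 2 * stats.t.sf(abs(beta[2] / se), df=n - 3)
    rejections += p < 0.05

type1_rate = rejections / n_sims   # far above the nominal 0.05
```

Because x1 only partially captures t, x2 carries residual information about t, and the regression duly flags it "significant" far more often than 5% of the time.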
Riley, Richard D.
2017-01-01
An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
45 CFR 308.1 - Self-assessment implementation methodology.
Code of Federal Regulations, 2010 CFR
2010-10-01
... selects statistically valid samples of cases from the IV-D program universe of cases; and (3) The State establishes a procedure for the design of samples and assures that no portions of the IV-D case universe are...
45 CFR 308.1 - Self-assessment implementation methodology.
Code of Federal Regulations, 2012 CFR
2012-10-01
... selects statistically valid samples of cases from the IV-D program universe of cases; and (3) The State establishes a procedure for the design of samples and assures that no portions of the IV-D case universe are...
45 CFR 308.1 - Self-assessment implementation methodology.
Code of Federal Regulations, 2011 CFR
2011-10-01
... selects statistically valid samples of cases from the IV-D program universe of cases; and (3) The State establishes a procedure for the design of samples and assures that no portions of the IV-D case universe are...
45 CFR 308.1 - Self-assessment implementation methodology.
Code of Federal Regulations, 2013 CFR
2013-10-01
... selects statistically valid samples of cases from the IV-D program universe of cases; and (3) The State establishes a procedure for the design of samples and assures that no portions of the IV-D case universe are...
45 CFR 308.1 - Self-assessment implementation methodology.
Code of Federal Regulations, 2014 CFR
2014-10-01
... selects statistically valid samples of cases from the IV-D program universe of cases; and (3) The State establishes a procedure for the design of samples and assures that no portions of the IV-D case universe are...
ERIC Educational Resources Information Center
Patterson, Brian F.; Mattern, Krista D.
2013-01-01
The continued accumulation of validity evidence for the intended uses of educational assessments is critical to ensure that proper inferences will be made for those purposes. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides…
ERIC Educational Resources Information Center
Beard, Jonathan; Marini, Jessica P.
2015-01-01
The continued accumulation of validity evidence for the intended uses of educational assessment scores is critical to ensure that inferences made using the scores are sound. To that end, the College Board has continued to collect college outcome data to evaluate the relationship between SAT® scores and college success. This report provides updated…
ERIC Educational Resources Information Center
Patterson, Brian F.; Mattern, Krista D.
2009-01-01
In an effort to continuously monitor the validity of the SAT for predicting first-year college grades, the College Board has continued its multi-year effort to recruit four-year colleges and universities (henceforth, "institutions") to provide data on the cohorts of first-time, first-year students entering in the fall semester beginning…
[Respondent-Driven Sampling: a new sampling method to study visible and hidden populations].
Mantecón, Alejandro; Juan, Montse; Calafat, Amador; Becoña, Elisardo; Román, Encarna
2008-01-01
The paper introduces a variant of chain-referral sampling: respondent-driven sampling (RDS). This sampling method shows that methods based on network analysis can be combined with the statistical validity of standard probability sampling methods. In this sense, RDS appears to be a mathematical improvement of snowball sampling oriented to the study of hidden populations. However, we try to prove its validity with populations that are not within a sampling frame but can nonetheless be contacted without difficulty. The basics of RDS are explained through our research on young people (aged 14 to 25) who go clubbing, consume alcohol and other drugs, and have sex. Fieldwork was carried out between May and July 2007 in three Spanish regions: Baleares, Galicia and Comunidad Valenciana. The presentation of the study shows the utility of this type of sampling when the population is accessible but there is a difficulty deriving from the lack of a sampling frame. However, the sample obtained is not a random representative one in statistical terms of the target population. It must be acknowledged that the final sample is representative of a 'pseudo-population' that approximates to the target population but is not identical to it.
Impact of syncope on quality of life: validation of a measure in patients undergoing tilt testing.
Nave-Leal, Elisabete; Oliveira, Mário; Pais-Ribeiro, José; Santos, Sofia; Oliveira, Eunice; Alves, Teresa; Cruz Ferreira, Rui
2015-03-01
Recurrent syncope has a significant impact on quality of life. The development of measurement scales to assess this impact that are easy to use in clinical settings is crucial. The objective of the present study is a preliminary validation of the Impact of Syncope on Quality of Life questionnaire for the Portuguese population. The instrument underwent a process of translation, validation, analysis of cultural appropriateness and cognitive debriefing. A population of 39 patients with a history of recurrent syncope (>1 year) who underwent tilt testing, aged 52.1 ± 16.4 years (21-83), 43.5% male, most in active employment (n=18) or retired (n=13), constituted a convenience sample. The resulting Portuguese version is similar to the original, with 12 items in a single aggregate score, and underwent statistical validation, with assessment of reliability, validity and stability over time. With regard to reliability, the internal consistency of the scale is 0.9. Assessment of convergent and discriminant validity showed statistically significant results (p<0.01). Regarding stability over time, a test-retest of this instrument at six months after tilt testing with 22 patients of the sample who had not undergone any clinical intervention found no statistically significant changes in quality of life. The results indicate that this instrument is of value for assessing quality of life in patients with recurrent syncope in Portugal. Copyright © 2014 Sociedade Portuguesa de Cardiologia. Published by Elsevier España. All rights reserved.
ERIC Educational Resources Information Center
Patterson, Brian F.; Mattern, Krista D.
2011-01-01
The findings for the 2008 sample are largely consistent with the previous reports. SAT scores were found to be correlated with FYGPA (r = 0.54), with a magnitude similar to HSGPA (r = 0.56). The best set of predictors of FYGPA remains SAT scores and HSGPA (r = 0.63), as the addition of the SAT sections to the correlation of HSGPA alone with FYGPA…
Currens, J.C.
1999-01-01
Analytical data for nitrate and triazines from 566 samples collected over a 3-year period at Pleasant Grove Spring, Logan County, KY, were statistically analyzed to determine the minimum data set needed to calculate meaningful yearly averages for a conduit-flow karst spring. Results indicate that a biweekly sampling schedule augmented with bihourly samples from high-flow events will provide meaningful suspended-constituent and dissolved-constituent statistics. Unless collected over an extensive period of time, daily samples may not be representative and may also be autocorrelated. All high-flow events resulting in a significant deflection of a constituent from base-line concentrations should be sampled. Either the geometric mean or the flow-weighted average of the suspended constituents should be used. If automatic samplers are used, then they may be programmed to collect storm samples as frequently as every few minutes to provide details on the arrival time of constituents of interest. However, only samples collected bihourly should be used to calculate averages. By adopting a biweekly sampling schedule augmented with high-flow samples, the need to continuously monitor discharge, or to search for and analyze existing data to develop a statistically valid monitoring plan, is lessened.
ERIC Educational Resources Information Center
Mattern, Krista D.; Patterson, Brian F.
2006-01-01
The College Board formed a research consortium with four-year colleges and universities to build a national higher education database with the primary goal of validating the SAT®, which is used in college admission and consists of three sections: critical reading (SAT-CR), mathematics (SAT-M) and writing (SAT-W). This report builds on a body of…
ERIC Educational Resources Information Center
Chromy, James R.
This study addressed statistical techniques that might ameliorate some of the sampling problems currently facing states with small populations participating in State National Assessment of Educational Progress (NAEP) assessments. The study explored how the application of finite population correction factors to the between-school component of…
John F. Caratti
2006-01-01
The FIREMON Point Intercept (PO) method is used to assess changes in plant species cover or ground cover for a macroplot. This method uses a narrow diameter sampling pole or sampling pins, placed at systematic intervals along line transects to sample within plot variation and quantify statistically valid changes in plant species cover and height over time. Plant...
De Spiegelaere, Ward; Malatinkova, Eva; Lynch, Lindsay; Van Nieuwerburgh, Filip; Messiaen, Peter; O'Doherty, Una; Vandekerckhove, Linos
2014-06-01
Quantification of integrated proviral HIV DNA by repetitive-sampling Alu-HIV PCR is a candidate virological tool to monitor the HIV reservoir in patients. However, the experimental procedures and data analysis of the assay are complex and hinder its widespread use. Here, we provide an improved and simplified data analysis method by adopting binomial and Poisson statistics. A modified analysis method on the basis of Poisson statistics was used to analyze the binomial data of positive and negative reactions from a 42-replicate Alu-HIV PCR by use of dilutions of an integration standard and on samples of 57 HIV-infected patients. Results were compared with the quantitative output of the previously described Alu-HIV PCR method. Poisson-based quantification of the Alu-HIV PCR was linearly correlated with the standard dilution series, indicating that absolute quantification with the Poisson method is a valid alternative for data analysis of repetitive-sampling Alu-HIV PCR data. Quantitative outputs of patient samples assessed by the Poisson method correlated with the previously described Alu-HIV PCR analysis, indicating that this method is a valid alternative for quantifying integrated HIV DNA. Poisson-based analysis of the Alu-HIV PCR data enables absolute quantification without the need of a standard dilution curve. Implementation of the CI estimation permits improved qualitative analysis of the data and provides a statistical basis for the required minimal number of technical replicates. © 2014 The American Association for Clinical Chemistry.
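The Poisson logic behind this analysis can be sketched directly: if template copies distribute at random across the replicate reactions, the fraction of negative reactions estimates the mean copies per reaction. The counts below are illustrative, not assay data.

```python
# Hedged sketch: Poisson-based absolute quantification from the fraction of
# negative reactions in a 42-replicate assay (illustrative numbers).
import math

replicates = 42
negatives = 21                           # reactions with no amplification
# P(negative) = exp(-lambda)  =>  lambda = -ln(negatives / replicates)
lam = -math.log(negatives / replicates)  # mean integrated copies per reaction

total = lam * replicates                 # estimated copies across all replicates
```

This is why no standard dilution curve is needed: the estimate comes from the binomial positive/negative pattern alone.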
A Decision Tree for Nonmetric Sex Assessment from the Skull.
Langley, Natalie R; Dudzik, Beatrix; Cloutier, Alesia
2018-01-01
This study uses five well-documented cranial nonmetric traits (glabella, mastoid process, mental eminence, supraorbital margin, and nuchal crest) and one additional trait (zygomatic extension) to develop a validated decision tree for sex assessment. The decision tree was built and cross-validated on a sample of 293 U.S. White individuals from the William M. Bass Donated Skeletal Collection. Ordinal scores from the six traits were analyzed using the partition modeling option in JMP Pro 12. A holdout sample of 50 skulls was used to test the model. The most accurate decision tree includes three variables: glabella, zygomatic extension, and mastoid process. This decision tree yielded 93.5% accuracy on the training sample, 94% on the cross-validated sample, and 96% on a holdout validation sample. Linear weighted kappa statistics indicate acceptable agreement among observers for these variables. Mental eminence should be avoided, and definitions and figures should be referenced carefully to score nonmetric traits. © 2017 American Academy of Forensic Sciences.
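A partition model of the kind described can be sketched with a standard decision-tree learner on ordinal trait scores. The data below are simulated, not the Bass Collection scores, and the tree settings are assumptions.

```python
# Hedged sketch: cross-validated decision tree for sex assessment from three
# ordinal trait scores (1-5), analogous to the partition modelling described.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 293
sex = rng.integers(0, 2, n)                   # 0 = female, 1 = male
# Simulated ordinal scores, shifted upward for males on the retained traits
traits = np.clip(
    rng.normal(loc=2 + sex[:, None] * 1.5, scale=0.8, size=(n, 3)).round(),
    1, 5)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)
cv_acc = cross_val_score(tree, traits, sex, cv=5).mean()  # cross-validated accuracy
tree.fit(traits, sex)
```

A shallow tree like this yields the kind of simple, field-usable decision rules the study reports, with cross-validation guarding against an overfit training accuracy.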
ERIC Educational Resources Information Center
James, David E.; Schraw, Gregory; Kuch, Fred
2015-01-01
We present an equation, derived from standard statistical theory, that can be used to estimate sampling margin of error for student evaluations of teaching (SETs). We use the equation to examine the effect of sample size, response rates and sample variability on the estimated sampling margin of error, and present results in four tables that allow…
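A margin-of-error equation of the kind described follows standard survey-sampling theory, with a finite population correction because the class roster bounds the population. The formula and numbers below are a generic sketch, not necessarily the authors' exact equation.

```python
# Hedged sketch: sampling margin of error for a mean SET rating with a
# finite population correction (standard theory; illustrative numbers).
import math

def set_margin_of_error(s, n, N, z=1.96):
    """Margin of error for a mean from n respondents of N enrolled students.

    s : sample standard deviation of the ratings
    n : number of respondents
    N : class size (finite population)
    z : normal critical value (1.96 for 95% confidence)
    """
    fpc = math.sqrt((N - n) / (N - 1))   # shrinks the MOE as n approaches N
    return z * (s / math.sqrt(n)) * fpc

moe = set_margin_of_error(s=1.0, n=20, N=40)   # half the class responded
```

Note how response rate matters: at full response (n = N) the correction drives the margin of error to zero.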
Pullin, A N; Pairis-Garcia, M D; Campbell, B J; Campler, M R; Proudfoot, K L
2017-11-01
When considering methodologies for collecting behavioral data, continuous sampling provides the most complete and accurate data set, whereas instantaneous sampling can provide similar results while increasing the efficiency of data collection. However, instantaneous time intervals require validation to ensure accurate estimation of the data. Therefore, the objective of this study was to validate scan sampling intervals for lambs housed in a feedlot environment. Feeding, lying, standing, drinking, locomotion, and oral manipulation were measured on 18 crossbred lambs housed in an indoor feedlot facility for 14 h (0600-2000 h). Data from continuous sampling were compared with data from instantaneous scan sampling intervals of 5, 10, 15, and 20 min using a linear regression analysis. Three criteria determined whether a time interval accurately estimated behaviors: 1) R2 ≥ 0.90, 2) slope not statistically different from 1 (P > 0.05), and 3) intercept not statistically different from 0 (P > 0.05). Estimations for lying behavior were accurate up to 20-min intervals, whereas feeding and standing behaviors were accurate only at 5-min intervals (i.e., met all 3 regression criteria). Drinking, locomotion, and oral manipulation demonstrated poor associations for all tested intervals. The results from this study suggest that a 5-min instantaneous sampling interval will accurately estimate lying, feeding, and standing behaviors for lambs housed in a feedlot, whereas continuous sampling is recommended for the remaining behaviors. This methodology will contribute toward the efficiency, accuracy, and transparency of future behavioral data collection in lamb behavior research.
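The three regression criteria can be applied as follows. The data are simulated scan-versus-continuous totals; note that the slope must be tested against 1, not the default test against 0.

```python
# Hedged sketch: applying the study's three validation criteria to scan
# estimates regressed against continuous observations (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
continuous = rng.uniform(100, 500, size=18)        # e.g. minutes lying per lamb
scan = continuous + rng.normal(scale=10, size=18)  # close scan-sampling estimate

res = stats.linregress(scan, continuous)
df = len(scan) - 2

# Criterion 2: slope not different from 1 (t-test against 1, not 0)
p_slope = 2 * stats.t.sf(abs((res.slope - 1) / res.stderr), df)
# Criterion 3: intercept not different from 0
p_int = 2 * stats.t.sf(abs(res.intercept / res.intercept_stderr), df)

valid = (res.rvalue ** 2 >= 0.90) and (p_slope > 0.05) and (p_int > 0.05)
```

An interval passes only when all three criteria hold, mirroring the study's rule for accepting a scan interval in place of continuous observation.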
The Development of Statistics Textbook Supported with ICT and Portfolio-Based Assessment
NASA Astrophysics Data System (ADS)
Hendikawati, Putriaji; Yuni Arini, Florentina
2016-02-01
This research was development research aimed at producing a Statistics textbook model supported with information and communication technology (ICT) and portfolio-based assessment. The book was designed for college mathematics students, to improve their ability in mathematical connection and communication. There were three stages in this research: define, design, and develop. The textbook consists of 10 chapters, each containing an introduction and core material with examples and exercises. The development phase began with an initial design of the book (draft 1), which was then validated by experts. Revision of draft 1 produced draft 2, which underwent a limited readability test. Revision of draft 2 then produced draft 3, which was trialed on a small sample to produce a valid textbook model. The data were analysed with descriptive statistics. The analysis showed that the Statistics textbook model supported with ICT and portfolio-based assessment was valid and met the criteria of practicality.
Statistical Methods and Tools for Uxo Characterization (SERDP Final Technical Report)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pulsipher, Brent A.; Gilbert, Richard O.; Wilson, John E.
2004-11-15
The Strategic Environmental Research and Development Program (SERDP) issued a statement of need for FY01 titled Statistical Sampling for Unexploded Ordnance (UXO) Site Characterization that solicited proposals to develop statistically valid sampling protocols for cost-effective, practical, and reliable investigation of sites contaminated with UXO; protocols that could be validated through subsequent field demonstrations. The SERDP goal was the development of a sampling strategy for which a fraction of the site is initially surveyed by geophysical detectors to confidently identify clean areas and subsections (target areas, TAs) that had elevated densities of anomalous geophysical detector readings that could indicate the presence of UXO. More detailed surveys could then be conducted to search the identified TAs for UXO. SERDP funded three projects: those proposed by the Pacific Northwest National Laboratory (PNNL) (SERDP Project No. UXO 1199), Sandia National Laboratory (SNL), and Oak Ridge National Laboratory (ORNL). The projects were closely coordinated to minimize duplication of effort and facilitate use of shared algorithms where feasible. This final report for PNNL Project 1199 describes the methods developed by PNNL to address SERDP's statement of need for the development of statistically based geophysical survey methods for sites where 100% surveys are unattainable or cost prohibitive.
Risk-based Methodology for Validation of Pharmaceutical Batch Processes.
Wiles, Frederick
2013-01-01
In January 2011, the U.S. Food and Drug Administration published new process validation guidance for pharmaceutical processes. The new guidance debunks the long-held industry notion that three consecutive validation batches or runs are all that are required to demonstrate that a process is operating in a validated state. Instead, the new guidance now emphasizes that the level of monitoring and testing performed during process performance qualification (PPQ) studies must be sufficient to demonstrate statistical confidence both within and between batches. In some cases, three qualification runs may not be enough. Nearly two years after the guidance was first published, little has been written defining a statistical methodology for determining the number of samples and qualification runs required to satisfy Stage 2 requirements of the new guidance. This article proposes using a combination of risk assessment, control charting, and capability statistics to define the monitoring and testing scheme required to show that a pharmaceutical batch process is operating in a validated state. In this methodology, an assessment of process risk is performed through application of a process failure mode, effects, and criticality analysis (PFMECA). The output of PFMECA is used to select appropriate levels of statistical confidence and coverage which, in turn, are used in capability calculations to determine when significant Stage 2 (PPQ) milestones have been met. The achievement of Stage 2 milestones signals the release of batches for commercial distribution and the reduction of monitoring and testing to commercial production levels. Individuals, moving range, and range/sigma charts are used in conjunction with capability statistics to demonstrate that the commercial process is operating in a state of statistical control. The new process validation guidance published by the U.S. 
Food and Drug Administration in January of 2011 indicates that the number of process validation batches or runs required to demonstrate that a pharmaceutical process is operating in a validated state should be based on sound statistical principles. The old rule of "three consecutive batches and you're done" is no longer sufficient. The guidance, however, does not provide any specific methodology for determining the number of runs required, and little has been published to augment this shortcoming. The paper titled "Risk-based Methodology for Validation of Pharmaceutical Batch Processes" describes a statistically sound methodology for determining when a statistically valid number of validation runs has been acquired based on risk assessment and calculation of process capability.
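One common way to turn risk-derived confidence and reliability targets into an attribute sampling plan is the "success-run" rule. This is a standard formula offered as an illustration of the idea, not necessarily the article's exact methodology.

```python
# Hedged sketch: success-run sample size n = ln(1 - C) / ln(R), i.e. the
# number of consecutive passing samples needed to claim reliability R at
# confidence C (zero failures allowed). Standard formula; illustrative targets.
import math

def success_run_n(confidence, reliability):
    """Samples needed, all passing, to claim `reliability` at `confidence`."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

# Higher-risk attributes (e.g. flagged by PFMECA) get stricter targets,
# hence more samples per qualification run:
n_low_risk = success_run_n(0.90, 0.90)
n_high_risk = success_run_n(0.95, 0.99)
```

This makes concrete the article's point that the monitoring burden should follow from statistical confidence and coverage requirements rather than a fixed "three batches" rule.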
The Application of FT-IR Spectroscopy for Quality Control of Flours Obtained from Polish Producers
Ceglińska, Alicja; Reder, Magdalena; Ciemniewska-Żytkiewicz, Hanna
2017-01-01
Samples of wheat, spelt, rye, and triticale flours produced by different Polish mills were studied by both classic chemical methods and FT-IR MIR spectroscopy. An attempt was made to statistically correlate FT-IR spectral data with reference data with regard to content of various components, for example, proteins, fats, ash, and fatty acids, as well as properties such as moisture, falling number, and energetic value. This correlation resulted in calibrated and validated statistical models for versatile evaluation of unknown flour samples. The calibration data set was used to construct calibration models using the CSR and PLS methods with the leave-one-out cross-validation technique. The calibrated models were validated with a validation data set. The results obtained confirmed that application of statistical models based on MIR spectral data is a robust, accurate, precise, rapid, inexpensive, and convenient methodology for determination of flour characteristics, as well as for detection of the content of selected flour ingredients. The obtained models' characteristics were as follows: R2 = 0.97, PRESS = 2.14; R2 = 0.96, PRESS = 0.69; R2 = 0.95, PRESS = 1.27; R2 = 0.94, PRESS = 0.76, for content of proteins, lipids, ash, and moisture level, respectively. The best results of the CSR models were obtained for protein, ash, and crude fat (R2 = 0.86, 0.82, and 0.78, respectively). PMID:28243483
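The PRESS statistic quoted above is the sum of squared leave-one-out prediction errors. A minimal univariate illustration (the paper's models are multivariate PLS/CSR fits to MIR spectra; this toy uses a single predictor and invented data):

```python
def press(xs, ys):
    """Leave-one-out PRESS for a univariate least-squares calibration:
    each sample is predicted from a line fitted to the other samples,
    and the squared prediction errors are summed."""
    total = 0.0
    for i in range(len(xs)):
        xt, yt = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        mx, my = sum(xt) / len(xt), sum(yt) / len(yt)
        b = (sum((x - mx) * (y - my) for x, y in zip(xt, yt))
             / sum((x - mx) ** 2 for x in xt))
        a = my - b * mx
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total

# A perfectly linear toy calibration predicts exactly, so PRESS is ~0.
perfect = press([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```

Lower PRESS on held-out points is what distinguishes a model that generalizes from one that merely fits the calibration set.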
A scoring system for ascertainment of incident stroke; the Risk Index Score (RISc).
Kass-Hout, T A; Moyé, L A; Smith, M A; Morgenstern, L B
2006-01-01
The main objective of this study was to develop and validate a computer-based statistical algorithm that could be translated into a simple scoring system in order to ascertain incident stroke cases using hospital admission medical records data. The Risk Index Score (RISc) algorithm was developed using data collected prospectively by the Brain Attack Surveillance in Corpus Christi (BASIC) project, 2000. The validity of RISc was evaluated by estimating the concordance of scoring system stroke ascertainment to stroke ascertainment by physician and/or abstractor review of hospital admission records. RISc was developed on 1718 randomly selected patients (training set) and then statistically validated on an independent sample of 858 patients (validation set). A multivariable logistic model was used to develop RISc and subsequently evaluated by goodness-of-fit and receiver operating characteristic (ROC) analyses. The higher the value of RISc, the higher the patient's risk of potential stroke. The study showed RISc was well calibrated and discriminated those who had potential stroke from those who did not on initial screening. In this study we developed and validated a rapid, easy, efficient, and accurate method to ascertain incident stroke cases from routine hospital admission records for epidemiologic investigations. Validation of this scoring system was achieved statistically; however, clinical validation in a community hospital setting is warranted.
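The ROC analysis used to evaluate such a scoring system reduces, for a binary outcome, to the Mann-Whitney form of the area under the curve. A small sketch with invented scores (not BASIC data):

```python
def roc_auc(case_scores, control_scores):
    """AUC as the Mann-Whitney probability that a randomly chosen case
    outscores a randomly chosen non-case; score ties count one half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Toy RISc-like values: stroke cases mostly outscore non-cases.
auc = roc_auc([7, 6, 5, 4], [3, 2, 5, 1])
```

An AUC of 0.5 is chance-level discrimination; 1.0 is perfect separation of cases from non-cases.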
ERIC Educational Resources Information Center
Veilleux, Jennifer C.; Chapman, Kate M.
2017-01-01
The current set of three studies further evaluates the validity and application of the Psychological Research Inventory of Concepts (PRIC). In Study 1, we administered the PRIC to a sample of introductory psychology students and online (Mechanical Turk) participants along with measures assessing theoretically related concepts. We found evidence of…
A Statistical Analysis of Data Used in Critical Decision Making by Secondary School Personnel.
ERIC Educational Resources Information Center
Dunn, Charleta J.; Kowitz, Gerald T.
Guidance decisions depend on the validity of standardized tests and teacher judgment records as measures of student achievement. To test this validity, a sample of 400 high school juniors, randomly selected from two large Gulf Coast area schools, was administered the Iowa Tests of Educational Development. The nine subtest scores and each…
DBS-LC-MS/MS assay for caffeine: validation and neonatal application.
Bruschettini, Matteo; Barco, Sebastiano; Romantsik, Olga; Risso, Francesco; Gennai, Iulian; Chinea, Benito; Ramenghi, Luca A; Tripodi, Gino; Cangemi, Giuliana
2016-09-01
DBS might be an appropriate microsampling technique for therapeutic drug monitoring of caffeine in infants. Nevertheless, its application presents several issues that still limit its use. This paper describes a validated DBS-LC-MS/MS method for caffeine. The results of the method validation showed a hematocrit dependence. In the analysis of 96 paired plasma and DBS clinical samples, caffeine levels measured in DBS were statistically significantly lower than in plasma, but the observed differences were independent of hematocrit. These results clearly showed the need for extensive validation with real-life samples for DBS-based methods. DBS-LC-MS/MS can be considered to be a good alternative to traditional methods for therapeutic drug monitoring or PK studies in preterm infants.
John F. Caratti
2006-01-01
The FIREMON Density (DE) method is used to assess changes in plant species density and height for a macroplot. This method uses multiple quadrats and belt transects (transects having a width) to sample within plot variation and quantify statistically valid changes in plant species density and height over time. Herbaceous plant species are sampled with quadrats while...
Pechorro, Pedro; Ribeiro da Silva, Diana; Andershed, Henrik; Rijo, Daniel; Abrunhosa Gonçalves, Rui
2016-01-01
The aim of the present study was to examine the psychometric properties of the Youth Psychopathic Traits Inventory (YPI) among a mixed-gender sample of 782 Portuguese youth (M = 15.87 years; SD = 1.72), in a school context. Confirmatory factor analysis revealed the expected three-factor first-order structure. Cross-gender measurement invariance and cross-sample measurement invariance using a forensic sample of institutionalized males were also confirmed. The Portuguese version of the YPI demonstrated generally adequate psychometric properties of internal consistency, mean inter-item correlation, convergent validity, discriminant validity, and criterion-related validity of statistically significant associations with conduct disorder symptoms, alcohol abuse, drug use, and unprotected sex. In terms of known-groups validity, males scored higher than females, and males from the school sample scored lower than institutionalized males. The use of the YPI among the Portuguese male and female youth population is psychometrically justified, and it can be a useful measure to identify adolescents with high levels of psychopathic traits. PMID:27571095
Developing a cosmic ray muon sampling capability for muon tomography and monitoring applications
NASA Astrophysics Data System (ADS)
Chatzidakis, S.; Chrysikopoulou, S.; Tsoukalas, L. H.
2015-12-01
In this study, a cosmic ray muon sampling capability using a phenomenological model that captures the main characteristics of the experimentally measured spectrum coupled with a set of statistical algorithms is developed. The "muon generator" produces muons with zenith angles in the range 0-90° and energies in the range 1-100 GeV and is suitable for Monte Carlo simulations with emphasis on muon tomographic and monitoring applications. The muon energy distribution is described by the Smith and Duller (1959) [35] phenomenological model. Statistical algorithms are then employed for generating random samples. The inverse transform provides a means to generate samples from the muon angular distribution, whereas the Acceptance-Rejection and Metropolis-Hastings algorithms are employed to provide the energy component. The predictions for muon energies 1-60 GeV and zenith angles 0-90° are validated with a series of actual spectrum measurements and with estimates from the software library CRY. The results confirm the validity of the phenomenological model and the applicability of the statistical algorithms to generate polyenergetic-polydirectional muons. The response of the algorithms and the impact of critical parameters on computation time and computed results were investigated. Final output from the proposed "muon generator" is a look-up table that contains the sampled muon angles and energies and can be easily integrated into Monte Carlo particle simulation codes such as Geant4 and MCNP.
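The acceptance-rejection step can be sketched as follows. The power-law target density here is a stand-in chosen for illustration, not the Smith-Duller spectrum the paper actually uses:

```python
import random

rng = random.Random(42)

def accept_reject(pdf, pdf_max, lo, hi):
    """Acceptance-rejection: propose x uniformly on [lo, hi] and accept
    it with probability pdf(x) / pdf_max."""
    while True:
        x = rng.uniform(lo, hi)
        if rng.uniform(0.0, pdf_max) <= pdf(x):
            return x

def spectrum(e):
    """Toy falling power-law energy spectrum (illustrative only)."""
    return e ** -2.0

# Muon-generator-style energy draws on 1-100 GeV; the spectrum peaks
# at the low end, so its maximum on the interval is spectrum(1.0).
energies = [accept_reject(spectrum, spectrum(1.0), 1.0, 100.0)
            for _ in range(2000)]
```

A steeply falling target makes plain acceptance-rejection inefficient (most proposals are rejected), which is one reason the authors also evaluate Metropolis-Hastings for the energy component.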
Sexual Harassment Retaliation Climate DEOCS 4.1 Construct Validity Summary
2017-08-01
exploratory factor analysis, and bivariate correlations (sample 1) 2) To determine the factor structure of the remaining (final) questions via...statistics, reliability analysis, exploratory factor analysis, and bivariate correlations of the prospective Sexual Harassment Retaliation Climate...reported by the survey requester). For information regarding the composition of sample, refer to Table 1. Table 1. Sample 1 Demographics n
Instrumental and statistical methods for the comparison of class evidence
NASA Astrophysics Data System (ADS)
Liszewski, Elisa Anne
Trace evidence is a major field within forensic science. Association of trace evidence samples can be problematic due to sample heterogeneity and a lack of quantitative criteria for comparing spectra or chromatograms. The aim of this study is to evaluate different types of instrumentation for their ability to discriminate among samples of various types of trace evidence. Chemometric analysis, including techniques such as Agglomerative Hierarchical Clustering, Principal Components Analysis, and Discriminant Analysis, was employed to evaluate instrumental data. First, automotive clear coats were analyzed by using microspectrophotometry to collect UV absorption data. In total, 71 samples were analyzed with classification accuracy of 91.61%. An external validation was performed, resulting in a prediction accuracy of 81.11%. Next, fiber dyes were analyzed using UV-Visible microspectrophotometry. While several physical characteristics of cotton fiber can be identified and compared, fiber color is considered to be an excellent source of variation, and thus was examined in this study. Twelve dyes were employed, some being visually indistinguishable. Several different analyses and comparisons were done, including an inter-laboratory comparison and external validations. Lastly, common plastic samples and other polymers were analyzed using pyrolysis-gas chromatography/mass spectrometry, and their pyrolysis products were then analyzed using multivariate statistics. The classification accuracy varied dependent upon the number of classes chosen, but the plastics were grouped based on composition. The polymers were used as an external validation and misclassifications occurred with chlorinated samples all being placed into the category containing PVC.
Parametric vs. non-parametric statistics of low resolution electromagnetic tomography (LORETA).
Thatcher, R W; North, D; Biver, C
2005-01-01
This study compared the relative statistical sensitivity of non-parametric and parametric statistics of 3-dimensional current sources as estimated by the EEG inverse solution Low Resolution Electromagnetic Tomography (LORETA). One would expect approximately 5% false positives (classification of a normal as abnormal) at the P < .025 level of probability (two tailed test) and approximately 1% false positives at the P < .005 level. EEG digital samples (2-second intervals sampled at 128 Hz, 1 to 2 minutes eyes closed) from 43 normal adult subjects were imported into the Key Institute's LORETA program. We then used the Key Institute's cross-spectrum and the Key Institute's LORETA output files (*.lor) as the 2,394 gray matter pixel representation of 3-dimensional currents at different frequencies. The mean and standard deviation *.lor files were computed for each of the 2,394 gray matter pixels for each of the 43 subjects. Tests of Gaussianity and different transforms were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of parametric vs. non-parametric statistics was compared using a "leave-one-out" cross validation method in which individual normal subjects were withdrawn and then statistically classified as being either normal or abnormal based on the remaining subjects. Log10 transforms approximated Gaussian distribution in the range of 95% to 99% accuracy. Parametric Z score tests at P < .05 cross-validation demonstrated an average misclassification rate of approximately 4.25%, and the range over the 2,394 gray matter pixels was 27.66% to 0.11%. At P < .01 parametric Z score cross-validation false positives were 0.26% and ranged from 6.65% to 0% false positives. The non-parametric Key Institute's t-max statistic at P < .05 had an average misclassification error rate of 7.64% and ranged from 43.37% to 0.04% false positives. 
The nonparametric t-max at P < .01 had an average misclassification rate of 6.67% and ranged from 41.34% to 0% false positives of the 2,394 gray matter pixels for any cross-validated normal subject. In conclusion, adequate approximation to Gaussian distribution and high cross-validation can be achieved by the Key Institute's LORETA programs by using a log10 transform and parametric statistics, and parametric normative comparisons had lower false positive rates than the non-parametric tests.
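The leave-one-out Z-score classification described above can be sketched as follows (toy values standing in for a single gray-matter pixel's log10-transformed currents; thresholds and data are illustrative):

```python
import statistics

def loo_z_flags(values, z_crit=1.96):
    """Leave-one-out Z test: score each subject against the mean and SD
    of the remaining subjects; flag subjects whose |Z| exceeds z_crit."""
    flags = []
    for i, v in enumerate(values):
        rest = values[:i] + values[i + 1:]
        z = (v - statistics.mean(rest)) / statistics.stdev(rest)
        flags.append(abs(z) > z_crit)
    return flags

# Five unremarkable subjects and one clear outlier (invented numbers).
flags = loo_z_flags([10.0, 10.1, 9.9, 10.05, 9.95, 30.0])
```

With normal subjects only, the fraction of flagged subjects estimates the false-positive rate, which is the quantity the study compares between the parametric Z and non-parametric t-max approaches.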
Saraf, Sanatan; Mathew, Thomas; Roy, Anindya
2015-01-01
For the statistical validation of surrogate endpoints, an alternative formulation is proposed for testing Prentice's fourth criterion, under a bivariate normal model. In such a setup, the criterion involves inference concerning an appropriate regression parameter, and the criterion holds if the regression parameter is zero. Testing such a null hypothesis has been criticized in the literature since it can only be used to reject a poor surrogate, and not to validate a good surrogate. In order to circumvent this, an equivalence hypothesis is formulated for the regression parameter, namely the hypothesis that the parameter is equivalent to zero. Such an equivalence hypothesis is formulated as an alternative hypothesis, so that the surrogate endpoint is statistically validated when the null hypothesis is rejected. Confidence intervals for the regression parameter and tests for the equivalence hypothesis are proposed using bootstrap methods and small sample asymptotics, and their performances are numerically evaluated and recommendations are made. The choice of the equivalence margin is a regulatory issue that needs to be addressed. The proposed equivalence testing formulation is also adopted for other parameters that have been proposed in the literature on surrogate endpoint validation, namely, the relative effect and proportion explained.
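The proposed formulation, with a bootstrap confidence interval for the regression slope, might be sketched as follows. The 0.1 margin below is arbitrary (as the authors note, choosing it is a regulatory issue), and the data are invented:

```python
import random
import statistics

def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)

def equivalent_to_zero(xs, ys, margin, n_boot=2000, alpha=0.05):
    """TOST-style check: declare the slope equivalent to zero when the
    bootstrap percentile (1 - 2*alpha) CI lies inside (-margin, margin)."""
    rng = random.Random(1)
    idx = list(range(len(xs)))
    boots = sorted(
        slope([xs[i] for i in pick], [ys[i] for i in pick])
        for pick in ([rng.choice(idx) for _ in idx]
                     for _ in range(n_boot)))
    lo = boots[int(alpha * n_boot)]
    hi = boots[int((1 - alpha) * n_boot) - 1]
    return -margin < lo and hi < margin

# Toy data whose regression slope is essentially zero.
xs = [float(i) for i in range(20)]
ys = [0.01 if i % 2 == 0 else -0.01 for i in range(20)]
ok = equivalent_to_zero(xs, ys, margin=0.1)
```

Because equivalence is the alternative hypothesis, rejecting the null here (CI inside the margin) is what statistically validates the surrogate, which is the reversal of the usual Prentice-criterion test that the paper advocates.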
Are atmospheric surface layer flows ergodic?
NASA Astrophysics Data System (ADS)
Higgins, Chad W.; Katul, Gabriel G.; Froidevaux, Martin; Simeonov, Valentin; Parlange, Marc B.
2013-06-01
The transposition of atmospheric turbulence statistics from the time domain, as conventionally sampled in field experiments, is explained by the so-called ergodic hypothesis. In micrometeorology, this hypothesis assumes that the time average of a measured flow variable represents an ensemble of independent realizations from similar meteorological states and boundary conditions. That is, the averaging duration must be sufficiently long to include a large number of independent realizations of the sampled flow variable so as to represent the ensemble. While the validity of the ergodic hypothesis for turbulence has been confirmed in laboratory experiments, and numerical simulations for idealized conditions, evidence for its validity in the atmospheric surface layer (ASL), especially for nonideal conditions, continues to defy experimental efforts. There is some urgency to make progress on this problem given the proliferation of tall tower scalar concentration networks aimed at constraining climate models yet are impacted by nonideal conditions at the land surface. Recent advancements in water vapor concentration lidar measurements that simultaneously sample spatial and temporal series in the ASL are used to investigate the validity of the ergodic hypothesis for the first time. It is shown that ergodicity is valid in a strict sense above uniform surfaces away from abrupt surface transitions. Surprisingly, ergodicity may be used to infer the ensemble concentration statistics of a composite grass-lake system using only water vapor concentration measurements collected above the sharp transition delineating the lake from the grass surface.
Churcher, Frances P; Mills, Jeremy F; Forth, Adelle E
2016-08-01
Over the past few decades, many structured risk appraisal measures have been created to meet the need for accurate assessment of violence risk. The Two-Tiered Violence Risk Estimates Scale (TTV) is a measure designed to integrate both an actuarial estimate of violence risk with critical risk management indicators. The current study examined interrater reliability and the predictive validity of the TTV in a sample of violent offenders (n = 120) over an average follow-up period of 17.75 years. The TTV was retrospectively scored and compared with the Violence Risk Appraisal Guide (VRAG), the Statistical Information of Recidivism Scale-Revised (SIR-R1), and the Psychopathy Checklist-Revised (PCL-R). Approximately 53% of the sample reoffended violently, with an overall recidivism rate of 74%. Although the VRAG was the strongest predictor of violent recidivism in the sample, the Actuarial Risk Estimates (ARE) scale of the TTV produced a small, significant effect. The Risk Management Indicators (RMI) produced nonsignificant area under the curve (AUC) values for all recidivism outcomes. Comparisons between measures using AUC values and Cox regression showed that there were no statistical differences in predictive validity. The results of this research will be used to inform the validation and reliability literature on the TTV, and will contribute to the overall risk assessment literature. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
John F. Caratti
2006-01-01
The FIREMON Line Intercept (LI) method is used to assess changes in plant species cover for a macroplot. This method uses multiple line transects to sample within plot variation and quantify statistically valid changes in plant species cover and height over time. This method is suited for most forest and rangeland communities, but is especially useful for sampling...
ERIC Educational Resources Information Center
Awang-Hashim, Rosa; O'Neil, Harold F., Jr.; Hocevar, Dennis
2002-01-01
The relations between motivational constructs, effort, self-efficacy and worry, and statistics achievement were investigated in a sample of 360 undergraduates in Malaysia. Both trait (cross-situational) and state (task-specific) measures of each construct were used to test a mediational trait (r) state (r) performance (TSP) model. As hypothesized,…
Identification of abnormal accident patterns at intersections
DOT National Transportation Integrated Search
1999-08-01
This report presents the findings and recommendations based on the Identification of Abnormal Accident Patterns at Intersections. This project used a statistically valid sampling method to determine whether a specific intersection has an abnormally h...
Coble, M D; Buckleton, J; Butler, J M; Egeland, T; Fimmers, R; Gill, P; Gusmão, L; Guttman, B; Krawczak, M; Morling, N; Parson, W; Pinto, N; Schneider, P M; Sherry, S T; Willuweit, S; Prinz, M
2016-11-01
The use of biostatistical software programs to assist in data interpretation and calculate likelihood ratios is essential to forensic geneticists and part of the daily case work flow for both kinship and DNA identification laboratories. Previous recommendations issued by the DNA Commission of the International Society for Forensic Genetics (ISFG) covered the application of biostatistical evaluations for STR typing results in identification and kinship cases, and this is now being expanded to provide best practices regarding validation and verification of the software required for these calculations. With larger multiplexes, more complex mixtures, and increasing requests for extended family testing, laboratories are relying more than ever on specific software solutions, and sufficient validation, training, and extensive documentation are of the utmost importance. Here, we present recommendations for the minimum requirements to validate biostatistical software to be used in forensic genetics. We distinguish between developmental validation and the responsibilities of the software developer or provider, and the internal validation studies to be performed by the end user. Recommendations for the software provider address, for example, the documentation of the underlying models used by the software, validation data expectations, version control, implementation and training support, as well as continuity and user notifications. For the internal validations the recommendations include: creating a validation plan, requirements for the range of samples to be tested, Standard Operating Procedure development, and internal laboratory training and education. To ensure that all laboratories have access to a wide range of samples for validation and training purposes the ISFG DNA commission encourages collaborative studies and public repositories of STR typing results. Published by Elsevier Ireland Ltd.
External validation of a Cox prognostic model: principles and methods
2013-01-01
Background A prognostic model should not enter clinical practice unless it has been demonstrated that it performs a useful role. External validation denotes evaluation of model performance in a sample independent of that used to develop the model. Unlike for logistic regression models, external validation of Cox models is sparsely treated in the literature. Successful validation of a model means achieving satisfactory discrimination and calibration (prediction accuracy) in the validation sample. Validating Cox models is not straightforward because event probabilities are estimated relative to an unspecified baseline function. Methods We describe statistical approaches to external validation of a published Cox model according to the level of published information, specifically (1) the prognostic index only, (2) the prognostic index together with Kaplan-Meier curves for risk groups, and (3) the first two plus the baseline survival curve (the estimated survival function at the mean prognostic index across the sample). The most challenging task, requiring level 3 information, is assessing calibration, for which we suggest a method of approximating the baseline survival function. Results We apply the methods to two comparable datasets in primary breast cancer, treating one as derivation and the other as validation sample. Results are presented for discrimination and calibration. We demonstrate plots of survival probabilities that can assist model evaluation. Conclusions Our validation methods are applicable to a wide range of prognostic studies and provide researchers with a toolkit for external validation of a published Cox model. PMID:23496923
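Discrimination in a validation sample is commonly summarized by Harrell's concordance index, which needs only level 1 information (the prognostic index). A minimal sketch with invented survival data (this simple double loop ignores refinements such as censoring-weighted variants):

```python
def c_index(times, events, risk_scores):
    """Harrell's concordance: over pairs where the earlier time is an
    observed event, the fraction in which the earlier-failing subject
    has the higher risk score (score ties count one half)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# A perfectly discriminating toy prognostic index (higher = worse),
# with the last subject censored.
c = c_index([1.0, 2.0, 3.0, 4.0], [1, 1, 1, 0], [4, 3, 2, 1])
```

Calibration, by contrast, requires the level 3 baseline survival information discussed above, which is why it is the most demanding part of external validation.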
Mindful attention and awareness: relationships with psychopathology and emotion regulation.
Gregório, Sónia; Pinto-Gouveia, José
2013-01-01
The growing interest in mindfulness from the scientific community has originated several self-report measures of this psychological construct. The Mindful Attention and Awareness Scale (MAAS) is a self-report measure of mindfulness at a trait-level. This paper aims at exploring MAAS psychometric characteristics and validating it for the Portuguese population. The first two studies replicate some of the original author's statistical procedures in two different samples from the Portuguese general community population, in particular confirmatory factor analyses. Results from both analyses confirmed the scale single-factor structure and indicated a very good reliability. Moreover, cross-validation statistics showed that this single-factor structure is valid for different respondents from the general community population. In the third study the Portuguese version of the MAAS was found to have good convergent and discriminant validities. Overall the findings support the psychometric validity of the Portuguese version of MAAS and suggest this is a reliable self-report measure of trait-mindfulness, a central construct in Clinical Psychology research and intervention fields.
ERIC Educational Resources Information Center
Mattern, Krista D.; Patterson, Brian F.
2012-01-01
The College Board formed a research consortium with four-year colleges and universities to build a national higher education database with the primary goal of validating the revised SAT®, which consists of three sections: critical reading (SAT-CR), mathematics (SAT-M), and writing (SAT-W), for use in college admission. A study by Mattern and…
Hippisley-Cox, Julia; Coupland, Carol; Brindle, Peter
2014-01-01
Objectives To validate the performance of a set of risk prediction algorithms developed using the QResearch database, in an independent sample from general practices contributing to the Clinical Practice Research Datalink (CPRD). Setting Prospective open cohort study using practices contributing to the CPRD database and practices contributing to the QResearch database. Participants The CPRD validation cohort consisted of 3.3 million patients, aged 25–99 years, registered at 357 general practices between 1 Jan 1998 and 31 July 2012. The validation statistics for QResearch were obtained from the original published papers which used a one-third sample of practices separate from those used to derive the score. A cohort from QResearch was used to compare incidence rates and baseline characteristics and consisted of 6.8 million patients from 753 practices registered between 1 Jan 1998 and 31 July 2013. Outcome measures Incident events relating to seven different risk prediction scores: QRISK2 (cardiovascular disease); QStroke (ischaemic stroke); QDiabetes (type 2 diabetes); QFracture (osteoporotic fracture and hip fracture); QKidney (moderate and severe kidney failure); QThrombosis (venous thromboembolism); QBleed (intracranial bleed and upper gastrointestinal haemorrhage). Measures of discrimination and calibration were calculated. Results Overall, the baseline characteristics of the CPRD and QResearch cohorts were similar, though QResearch had higher recording levels for ethnicity and family history. The validation statistics for each of the risk prediction scores were very similar in the CPRD cohort compared with the published results from QResearch validation cohorts. For example, in women, the QDiabetes algorithm explained 50% of the variation within CPRD compared with 51% on QResearch, and the receiver operator curve value was 0.85 on both databases. The scores were well calibrated in CPRD. 
Conclusions Each of the algorithms performed practically as well in the external independent CPRD validation cohorts as they had in the original published QResearch validation cohorts. PMID:25168040
Guo, Ying; Little, Roderick J; McConnell, Daniel S
2012-01-01
Covariate measurement error is common in epidemiologic studies. Current methods for correcting measurement error with information from external calibration samples are insufficient to provide valid adjusted inferences. We consider the problem of estimating the regression of an outcome Y on covariates X and Z, where Y and Z are observed, X is unobserved, but a variable W that measures X with error is observed. Information about measurement error is provided in an external calibration sample where data on X and W (but not Y and Z) are recorded. We describe a method that uses summary statistics from the calibration sample to create multiple imputations of the missing values of X in the regression sample, so that the regression coefficients of Y on X and Z and associated standard errors can be estimated using simple multiple imputation combining rules, yielding valid statistical inferences under the assumption of a multivariate normal distribution. The proposed method is shown by simulation to provide better inferences than existing methods, namely the naive method, classical calibration, and regression calibration, particularly for correction for bias and achieving nominal confidence levels. We also illustrate our method with an example using linear regression to examine the relation between serum reproductive hormone concentrations and bone mineral density loss in midlife women in the Michigan Bone Health and Metabolism Study. Existing methods fail to adjust appropriately for bias due to measurement error in the regression setting, particularly when measurement error is substantial. The proposed method corrects this deficiency.
ERIC Educational Resources Information Center
Mandy, William; Charman, Tony; Puura, Kaija; Skuse, David
2014-01-01
The recent "Diagnostic and Statistical Manual of Mental Disorders-Fifth Edition" ("DSM-5") reformulation of autism spectrum disorder has received empirical support from North American and UK samples. Autism spectrum disorder is an increasingly global diagnosis, and research is needed to discover how well it generalises beyond…
Validity, Reliability and Difficulty Indices for Instructor-Built Exam Questions
ERIC Educational Resources Information Center
Jandaghi, Gholamreza; Shaterian, Fatemeh
2008-01-01
The purpose of the research is to determine college Instructor's skill rate in designing exam questions in chemistry subject. The statistical population was all of chemistry exam sheets for two semesters in one academic year from which a sample of 364 exam sheets was drawn using multistage cluster sampling. Two experts assessed the sheets and by…
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-25
... required by subparagraph (A); or (ii) determine industry support using a statistically valid sampling... provided for convenience and customs purposes. The written description of the scope of the investigation is...
Lin, Yu-Pin; Chu, Hone-Jay; Huang, Yu-Long; Tang, Chia-Hsi; Rouhani, Shahrokh
2011-06-01
This study develops a stratified conditional Latin hypercube sampling (scLHS) approach for multiple, remotely sensed, normalized difference vegetation index (NDVI) images. The objective is to sample, monitor, and delineate spatiotemporal landscape changes, including spatial heterogeneity and variability, in a given area. The scLHS approach, which is based on the variance quadtree technique (VQT) and the conditional Latin hypercube sampling (cLHS) method, selects samples in order to delineate landscape changes from multiple NDVI images. The images are then mapped for calibration and validation by using sequential Gaussian simulation (SGS) with the scLHS selected samples. Spatial statistical results indicate that in terms of their statistical distribution, spatial distribution, and spatial variation, the statistics and variograms of the scLHS samples resemble those of multiple NDVI images more closely than those of cLHS and VQT samples. Moreover, the accuracy of simulated NDVI images based on SGS with scLHS samples is significantly better than that of simulated NDVI images based on SGS with cLHS samples and VQT samples, respectively. Thus, the proposed approach efficiently monitors the spatial characteristics of landscape changes, including the statistics, spatial variability, and heterogeneity of NDVI images. In addition, SGS with the scLHS samples effectively reproduces spatial patterns and landscape changes in multiple NDVI images.
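Plain Latin hypercube sampling, the core idea that cLHS and scLHS build on, can be sketched as follows (the conditioning on covariate distributions and the variance-quadtree stratification that define scLHS are not shown):

```python
import random

def latin_hypercube(n, dims, rng=random.Random(0)):
    """Plain (unconditioned) Latin hypercube on the unit cube: each
    axis is cut into n equal strata and every stratum is sampled
    exactly once, then the per-axis strata are randomly paired."""
    columns = []
    for _ in range(dims):
        strata = [(k + rng.random()) / n for k in range(n)]
        rng.shuffle(strata)
        columns.append(strata)
    return list(zip(*columns))

points = latin_hypercube(10, 2)
```

The one-sample-per-stratum guarantee is what lets a small sample reproduce the marginal distribution of each variable, which is the property the study's variogram comparisons exploit.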
Experimental Design in Clinical 'Omics Biomarker Discovery.
Forshed, Jenny
2017-11-03
This tutorial highlights some issues in the experimental design of clinical 'omics biomarker discovery: how to avoid bias and obtain quantities from biochemical analyses that are as close to the true values as possible, and how to select samples to improve the chance of answering the clinical question at issue. This includes the importance of defining the clinical aim and end point, knowing the variability in the results, randomization of samples, sample size, statistical power, and how to avoid confounding factors by including clinical data in the sample selection; that is, how to avoid unpleasant surprises at the point of statistical analysis. The aim of this tutorial is to support translational clinical and preclinical biomarker candidate research and to improve the validity and potential of future biomarker candidate findings.
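The sample-size and power considerations mentioned above can be illustrated with the standard normal-approximation formula for comparing two group means (a sketch only; real designs often add a small-sample t-correction or adjust for multiple testing):

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for a two-sided,
    two-sample comparison of means; effect_size is the difference
    between group means in standard-deviation units."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided type I error
    z_beta = z.inv_cdf(power)            # power = 1 - type II error
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)
```

For a moderate effect of half a standard deviation at 80% power, this gives the familiar answer of roughly 63 samples per group, which is why knowing the variability of the assay before fixing the sample size matters so much.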
Amini, Mehdi; Pourshahbaz, Abbas; Mohammadkhani, Parvaneh; Ardakani, Mohammad-Reza Khodaie; Lotfi, Mozhgan
2014-12-01
The goal of this study was to examine the construct validity of the Diagnostic and Statistical Manual of Mental Disorders-5 (DSM-5) conceptual model of antisocial and borderline personality disorders (PDs). More specifically, the aim was to determine whether the DSM-5 five-factor structure of pathological personality trait domains replicated in an independently collected sample that differs culturally from the derivation sample. The study used a sample of 346 individuals: subjects with antisocial PD (n = 122), subjects with borderline PD (n = 130), and nonclinical subjects (n = 94). Participants were randomly selected from prisoners and out-patient and in-patient clients, recruited from Tehran prisons and from the clinical psychology and psychiatry clinics of Razi and Taleghani Hospitals, Tehran, Iran. The SCID-II-PQ, SCID-II, and DSM-5 Personality Trait Rating Form (Clinician's PTRF) were used to diagnose PDs and to assess pathological traits. The data were analyzed by exploratory factor analysis, which revealed a five-factor solution for the DSM-5 personality traits. Results showed that the DSM-5 has adequate construct validity in an Iranian sample with antisocial and borderline PDs. The factors were similar in number to those of other studies, but differed in content. Exploratory factor analysis revealed five homogeneous components of antisocial and borderline PDs that may represent personality, behavioral, and affective features central to the disorders. Furthermore, the present study helps clarify the adequacy of the DSM-5 dimensional approach to the evaluation of personality pathology, specifically in an Iranian sample.
Smith, Ashlee L.; Sun, Mai; Bhargava, Rohit; Stewart, Nicolas A.; Flint, Melanie S.; Bigbee, William L.; Krivak, Thomas C.; Strange, Mary A.; Cooper, Kristine L.; Zorn, Kristin K.
2013-01-01
Objective: The biology of high grade serous ovarian carcinoma (HGSOC) is poorly understood. Little has been reported on intratumoral homogeneity or heterogeneity of primary HGSOC tumors and their metastases. We evaluated the global protein expression profiles of paired primary and metastatic HGSOC from formalin-fixed, paraffin-embedded (FFPE) tissue samples. Methods: After IRB approval, six patients with advanced HGSOC were identified with tumor in both ovaries at initial surgery. Laser capture microdissection (LCM) was used to extract tumor for protein digestion. Peptides were extracted and analyzed by reversed-phase liquid chromatography coupled to a linear ion trap mass spectrometer. Tandem mass spectra were searched against the UniProt human protein database. Differences in protein abundance between samples were assessed and analyzed by Ingenuity Pathway Analysis software. Immunohistochemistry (IHC) for select proteins from the original and an additional validation set of five patients was performed. Results: Unsupervised clustering of the abundance profiles placed the paired specimens adjacent to each other. IHC H-score analysis of the validation set revealed a strong correlation between paired samples for all proteins. For the similarly expressed proteins, the estimated correlation coefficients in two of three experimental samples and all validation samples were statistically significant (p < 0.05). The estimated correlation coefficients in the experimental sample proteins classified as differentially expressed were not statistically significant. Conclusion: A global proteomic screen of primary HGSOC tumors and their metastatic lesions identifies tumoral homogeneity and heterogeneity and provides preliminary insight into these protein profiles and the cellular pathways they constitute. PMID:28250404
Smith, Otto R F; Alves, Daniele E; Knapstad, Marit; Haug, Ellen; Aarø, Leif E
2017-05-12
Mental well-being is an important, yet understudied, area of research, partly due to the lack of appropriate population-based measures. The Warwick-Edinburgh Mental Well-being Scale (WEMWBS) was developed to meet the need for such a measure. This article assesses the psychometric properties of the Norwegian version of the WEMWBS and its short version (SWEMWBS) among a sample of primary health care patients who participated in the evaluation of Prompt Mental Health Care (PMHC), a novel Norwegian mental health care program aimed at increasing access to treatment for anxiety and depression. Forward and back-translations were conducted, and 1168 patients filled out an electronic survey including the WEMWBS and other mental health scales. The original dataset was randomly divided into a training sample (≈70%) and a validation sample (≈30%). Parallel analysis and confirmatory factor analysis were carried out to assess construct validity and precision. The final models were cross-validated in the validation sample by specifying a model with fixed parameters based on the estimates from the training set. Criterion validity and measurement invariance of the (S)WEMWBS were examined as well. Support was found for the single-factor hypothesis in both scales, but similar to previous studies, only after a number of residuals were allowed to correlate (WEMWBS: CFI = 0.99, RMSEA = 0.06; SWEMWBS: CFI = 0.99, RMSEA = 0.06). Further analyses showed that the correlated residuals did not alter the meaning of the underlying construct and did not substantially affect the associations with other variables. Precision was high for both versions of the WEMWBS (>.80), and scalar measurement invariance was obtained for gender and age group. The final measurement models displayed adequate fit statistics in the validation sample as well. Correlations with other mental health scales were largely in line with expectations. 
No statistically significant differences were found in mean latent (S)WEMWBS scores for age and gender. Both WEMWBS scales appear to be valid and precise instruments to measure mental well-being in primary health care patients. The results encourage the use of mental well-being as an outcome in future epidemiological, clinical, and evaluation studies, and may as such be valuable for both research and public health practice.
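The ≈70/30 training/validation partition described above can be sketched as a seeded random split. The function and its names are illustrative; the paper does not publish its splitting code.

```python
import random

def split_sample(ids, train_frac=0.7, seed=42):
    """Random ~70/30 partition of respondent ids into training and
    validation sets, mirroring the split described above."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)           # seeded shuffle makes the split reproducible
    cut = round(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

train, valid = split_sample(range(1168))
```

Fixing the parameters estimated on the training set and re-fitting them unchanged in the validation set, as the study does, guards against the correlated-residual modifications merely capitalizing on chance in one half of the data.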
Dobecki, Marek
2012-01-01
This paper reviews the requirements for measurement methods of chemical agents in the air at workstations. European standards, which have the status of Polish standards, comprise requirements and information on sampling strategy, measuring techniques, types of samplers, sampling pumps, and methods of occupational exposure evaluation for a given technological process. Measurement methods, including air sampling and the analytical procedure in a laboratory, should be appropriately validated before intended use. In the validation process, selected methods are tested and an uncertainty budget is established. This paper presents the validation procedure that should be implemented in the laboratory, together with suitable statistical tools and the major components of uncertainty to be taken into consideration. Methods of quality control, including sampling and laboratory analyses, are discussed. The relative expanded uncertainty of each measurement, expressed as a percentage, should not exceed the limit values set depending on the type of occupational exposure (short-term or long-term) and the magnitude of exposure to chemical agents in the work environment.
Xu, Stanley; Clarke, Christina L; Newcomer, Sophia R; Daley, Matthew F; Glanz, Jason M
2018-05-16
Vaccine safety studies are often electronic health record (EHR)-based observational studies. These studies often face significant methodological challenges, including confounding and misclassification of adverse events. Vaccine safety researchers use the self-controlled case series (SCCS) study design to handle confounding and employ medical chart review to ascertain cases that are identified using EHR data. However, for common adverse events, limited resources often make it impossible to adjudicate all adverse events observed in electronic data. In this paper, we considered four approaches for analyzing SCCS data with confirmation rates estimated from an internal validation sample: (1) observed cases, (2) confirmed cases only, (3) known confirmation rate, and (4) multiple imputation (MI). We conducted a simulation study to evaluate these four approaches using type I error rates, percent bias, and empirical power. Our simulation results suggest that when misclassification of adverse events is present, the observed-cases, confirmed-cases-only, and known-confirmation-rate approaches may inflate the type I error, yield biased point estimates, and reduce statistical power. The multiple imputation approach accounts for the uncertainty of confirmation rates estimated from an internal validation sample and yields a proper type I error rate, a largely unbiased point estimate, a proper variance estimate, and adequate statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
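The multiple-imputation idea, estimating a confirmation rate from a chart-reviewed subsample and propagating its uncertainty rather than treating it as known, can be sketched roughly as follows. This is a simplified illustration of imputing true-case counts, not the authors' SCCS implementation; the function and parameter names are invented.

```python
import random

def mi_true_case_counts(n_observed, n_validated, n_confirmed, m=50, seed=1):
    """Multiple-imputation sketch: the confirmation rate is estimated from
    an internal validation sample (n_confirmed of n_validated chart-reviewed
    cases confirmed) and its uncertainty is propagated by redrawing the rate
    for each of the m imputations."""
    rng = random.Random(seed)
    n_unvalidated = n_observed - n_validated
    imputed = []
    for _ in range(m):
        # draw a confirmation rate from a Beta(c + 1, v - c + 1) posterior
        rate = rng.betavariate(n_confirmed + 1, n_validated - n_confirmed + 1)
        # impute how many unreviewed events are true cases at that rate
        true_rest = sum(rng.random() < rate for _ in range(n_unvalidated))
        imputed.append(n_confirmed + true_rest)
    return imputed

counts = mi_true_case_counts(n_observed=500, n_validated=100, n_confirmed=80)
```

A "known confirmation rate" analysis would use only the point estimate 80/100 for every imputation; redrawing the rate is what yields the proper variance estimate the abstract credits to MI.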
Naeem, Naghma; Muijtjens, Arno
2015-04-01
The psychological construct of emotional intelligence (EI), its theoretical models, measurement instruments, and applications have been the subject of several research studies in health professions education. The objective of the current study was to investigate the factorial validity and reliability of a bilingual version of the Schutte Self Report Emotional Intelligence Scale (SSREIS) in an undergraduate Arab medical student population. The study was conducted during April-May 2012 using a cross-sectional survey design. A sample (n = 467) was obtained from undergraduate medical students of the male and female medical colleges of King Saud University, Riyadh, Saudi Arabia. Exploratory and confirmatory factor analyses were performed using SPSS 16.0 and AMOS 4.0 statistical software to determine the factor structure. Reliability was determined using Cronbach's alpha statistics. The results obtained using an undergraduate Arab medical student sample supported a multidimensional, three-factor structure of the SSREIS. The three factors are Optimism, Awareness-of-Emotions, and Use-of-Emotions. The reliability (Cronbach's alpha) of the three subscales was 0.76, 0.72, and 0.55, respectively. Emotional intelligence is thus a multifactorial (three-factor) construct, and the bilingual version of the SSREIS is a valid and reliable measure of trait emotional intelligence in an undergraduate Arab medical student population.
El Khattabi, Laïla Allach; Rouillac-Le Sciellour, Christelle; Le Tessier, Dominique; Luscan, Armelle; Coustier, Audrey; Porcher, Raphael; Bhouri, Rakia; Nectoux, Juliette; Sérazin, Valérie; Quibel, Thibaut; Mandelbrot, Laurent; Tsatsaris, Vassilis; Vialard, François; Dupont, Jean-Michel
2016-01-01
NIPT for fetal aneuploidy by digital PCR has been hampered by the large number of PCR reactions needed to meet statistical requirements, preventing clinical application. Here, we designed an octoplex droplet digital PCR (ddPCR) assay which increases the number of available targets and thus overcomes these statistical obstacles. After technical optimization of the multiplex PCR on mixtures of trisomic and euploid DNA, we performed a validation study on plasma DNA samples from 213 pregnant women. Molecular counting of circulating cell-free DNA was performed using a mix of hydrolysis probes targeting chromosome 21 and a reference chromosome. In this validation study, ddPCR detected trisomy 21 even when the sample's trisomic DNA content was as low as 5%, and it discriminated clearly between the trisomy 21 and euploidy groups. Our results demonstrate that digital PCR can meet the requirements for non-invasive prenatal testing for trisomy 21. This approach is technically simple, relatively cheap, easy to implement in a diagnostic setting, and compatible with ethical concerns regarding access to nucleotide sequence information. These advantages make it a potential technique of choice for population-wide screening for trisomy 21 in pregnant women.
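The statistical requirement behind molecular counting can be illustrated with the underlying test: under euploidy, chr21 and reference targets drawn from the same cell-free DNA should be counted equally often, so a trisomic excess is tested against a binomial null of p = 0.5. A hedged sketch follows; the counts are invented round numbers, and real assays must additionally model fetal fraction and droplet partitioning.

```python
import math

def trisomy21_z(n_chr21, n_ref):
    """Normal-approximation z-statistic for the chr21 molecule share
    against the euploid expectation p = 0.5. Counting sketch only:
    fetal fraction and droplet partitioning are deliberately omitted."""
    n = n_chr21 + n_ref
    p_hat = n_chr21 / n
    return (p_hat - 0.5) / math.sqrt(0.25 / n)

# invented example: a small chr21 excess over ~100,000 counted molecules
z = trisomy21_z(n_chr21=51250, n_ref=48750)
```

With roughly 10^5 counted molecules, even a 1.25-percentage-point excess of chr21 molecules yields z ≈ 7.9; the z-statistic scales with the square root of the total count, which is why multiplexing to raise the number of countable targets per reaction matters.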
Detection of overreported psychopathology with the MMPI-2-RF [corrected] validity scales.
Sellbom, Martin; Bagby, R Michael
2010-12-01
We examined the utility of the validity scales on the recently released Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2 RF; Ben-Porath & Tellegen, 2008) to detect overreported psychopathology. This set of validity scales includes a newly developed scale and revised versions of the original MMPI-2 validity scales. We used an analogue, experimental simulation in which MMPI-2 RF responses (derived from archived MMPI-2 protocols) of undergraduate students instructed to overreport psychopathology (in either a coached or noncoached condition) were compared with those of psychiatric inpatients who completed the MMPI-2 under standardized instructions. The MMPI-2 RF validity scale Infrequent Psychopathology Responses best differentiated the simulation groups from the sample of patients, regardless of experimental condition. No other validity scale added consistent incremental predictive utility to Infrequent Psychopathology Responses in distinguishing the simulation groups from the sample of patients. Classification accuracy statistics confirmed the recommended cut scores in the MMPI-2 RF manual (Ben-Porath & Tellegen, 2008).
75 FR 80563 - Agency Information Collection Activities: Proposed Request and Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2010-12-22
.... The respondents are a statistically valid sample of all RSI/DI beneficiaries in current pay status or... SSN. We send this form to the employer to identify the employees involved, to resolve the discrepancy...
Skates, Steven J.; Gillette, Michael A.; LaBaer, Joshua; Carr, Steven A.; Anderson, N. Leigh; Liebler, Daniel C.; Ransohoff, David; Rifai, Nader; Kondratovich, Marina; Težak, Živana; Mansfield, Elizabeth; Oberg, Ann L.; Wright, Ian; Barnes, Grady; Gail, Mitchell; Mesri, Mehdi; Kinsinger, Christopher R.; Rodriguez, Henry; Boja, Emily S.
2014-01-01
Protein biomarkers are needed to deepen our understanding of cancer biology and to improve our ability to diagnose, monitor and treat cancers. Important analytical and clinical hurdles must be overcome to allow the most promising protein biomarker candidates to advance into clinical validation studies. Although contemporary proteomics technologies support the measurement of large numbers of proteins in individual clinical specimens, sample throughput remains comparatively low. This problem is amplified in typical clinical proteomics research studies, which routinely suffer from a lack of proper experimental design, resulting in analysis of too few biospecimens to achieve adequate statistical power at each stage of a biomarker pipeline. To address this critical shortcoming, a joint workshop was held by the National Cancer Institute (NCI), National Heart, Lung and Blood Institute (NHLBI), and American Association for Clinical Chemistry (AACC), with participation from the U.S. Food and Drug Administration (FDA). An important output from the workshop was a statistical framework for the design of biomarker discovery and verification studies. Herein, we describe the use of quantitative clinical judgments to set statistical criteria for clinical relevance, and the development of an approach to calculate biospecimen sample size for proteomic studies in discovery and verification stages prior to clinical validation stage. This represents a first step towards building a consensus on quantitative criteria for statistical design of proteomics biomarker discovery and verification research. PMID:24063748
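As one concrete instance of this kind of biospecimen sample-size reasoning, the textbook two-group normal-approximation formula can be sketched as below. This is a generic illustration of power-based sample sizing, not the workshop's actual statistical framework.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for comparing two means
    at a standardized effect size (Cohen's d), two-sided test."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    return math.ceil(2 * (z_a + z_b) ** 2 / effect_size ** 2)

n = n_per_group(0.5)  # medium effect
```

The inverse-square dependence on effect size is the crux of the pipeline problem the workshop addressed: halving the clinically relevant effect quadruples the biospecimen requirement at each verification stage.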
ERIC Educational Resources Information Center
Mattern, Krista D.; Patterson, Brian F.
2013-01-01
The College Board formed a research consortium with four-year colleges and universities to build a national higher education database with the primary goal of validating the revised SAT for use in college admission. A study by Mattern and Patterson (2009) examined the relationship between SAT scores and retention to the second year. The sample…
Szyda, Joanna; Liu, Zengting; Zatoń-Dobrowolska, Magdalena; Wierzbicki, Heliodor; Rzasa, Anna
2008-01-01
We analysed data from a selective DNA pooling experiment with 130 individuals of the arctic fox (Alopex lagopus) originating from 2 types differing in body size. The association between alleles of 6 selected unlinked molecular markers and body size was tested using univariate and multinomial logistic regression models, applying odds ratios and test statistics from the power divergence family. Due to the small sample size and the resulting sparseness of the data table, in hypothesis testing we could not rely on the asymptotic distributions of the tests. Instead, we tried to account for data sparseness by (i) modifying the confidence intervals of the odds ratios; (ii) using a normal approximation of the asymptotic distribution of the power divergence tests with different approaches for calculating moments of the statistics; and (iii) assessing P values empirically, based on bootstrap samples. As a result, a significant association was observed for 3 markers. Furthermore, we used simulations to assess the validity of the normal approximation of the asymptotic distribution of the test statistics under the conditions of small and sparse samples.
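Approach (iii), empirical P values when asymptotic distributions are unreliable, can be illustrated with a close cousin of the bootstrap: a label-permutation test, chosen here because it is self-contained. The authors used bootstrap samples rather than permutations, and all names and data below are illustrative.

```python
import random

def permutation_pvalue(groups, values, stat_fn, n_perm=2000, seed=7):
    """Empirical p-value for an association statistic obtained by shuffling
    group labels instead of relying on the statistic's asymptotic null
    distribution. The add-one correction keeps p strictly positive."""
    obs = stat_fn(groups, values)
    rng = random.Random(seed)
    shuffled = groups[:]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat_fn(shuffled, values) >= obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# toy statistic: absolute difference in group means
def mean_diff(g, v):
    a = [x for gi, x in zip(g, v) if gi == 1]
    b = [x for gi, x in zip(g, v) if gi == 0]
    return abs(sum(a) / len(a) - sum(b) / len(b))

p = permutation_pvalue([1] * 8 + [0] * 8,
                       [5, 6, 7, 5, 6, 7, 6, 6, 1, 2, 1, 2, 1, 2, 2, 1],
                       mean_diff)
```

Resampling-based P values of this kind stay valid at exactly the small, sparse sample sizes where the normal approximation in approach (ii) needs checking by simulation.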
Selecting the "Best" Factor Structure and Moving Measurement Validation Forward: An Illustration.
Schmitt, Thomas A; Sass, Daniel A; Chappelle, Wayne; Thompson, William
2018-04-09
Despite the broad literature base on factor analysis best practices, research seeking to evaluate a measure's psychometric properties frequently fails to consider or follow these recommendations. This leads to incorrect factor structures, numerous and often overly complex competing factor models and, perhaps most harmful, biased model results. Our goal is to demonstrate a practical and actionable process for factor analysis through (a) an overview of six statistical and psychometric issues and approaches to be aware of, investigate, and report when engaging in factor structure validation, along with a flowchart for recommended procedures to understand latent factor structures; (b) demonstrating these issues to provide a summary of the updated Posttraumatic Stress Disorder Checklist (PCL-5) factor models and a rationale for validation; and (c) conducting a comprehensive statistical and psychometric validation of the PCL-5 factor structure to demonstrate all the issues we described earlier. Considering previous research, the PCL-5 was evaluated using a sample of 1,403 U.S. Air Force remotely piloted aircraft operators with high levels of battlefield exposure. Previously proposed PCL-5 factor structures were not supported by the data, but instead a bifactor model is arguably more statistically appropriate.
Li, Jie; Stroebe, Margaret; Chan, Cecilia L W; Chow, Amy Y M
2017-06-01
The rationale, development, and validation of the Bereavement Guilt Scale (BGS) are described in this article. The BGS was based on a theoretically developed, multidimensional conceptualization of guilt. Part 1 describes the generation of the item pool, derived from in-depth interviews, and review of the scientific literature. Part 2 details statistical analyses for further item selection (Sample 1, N = 273). Part 3 covers the psychometric properties of the emergent-BGS (Sample 2, N = 600, and Sample 3, N = 479). Confirmatory factor analysis indicated that a five-factor model fit the data best. Correlations of BGS scores with depression, anxiety, self-esteem, self-forgiveness, and mode of death were consistent with theoretical predictions, supporting the construct validity of the measure. The internal consistency and test-retest reliability were also supported. Thus, initial testing or examination suggests that the BGS is a valid tool to assess multiple components of bereavement guilt. Further psychometric testing across cultures is recommended.
Confidence crisis of results in biomechanics research.
Knudson, Duane
2017-11-01
Many biomechanics studies have small sample sizes and incorrect statistical analyses, so reporting of inaccurate inferences and inflated magnitude of effects are common in the field. This review examines these issues in biomechanics research and summarises potential solutions from research in other fields to increase the confidence in the experimental effects reported in biomechanics. Authors, reviewers and editors of biomechanics research reports are encouraged to improve sample sizes and the resulting statistical power, improve reporting transparency, improve the rigour of statistical analyses used, and increase the acceptance of replication studies to improve the validity of inferences from data in biomechanics research. The application of sports biomechanics research results would also improve if a larger percentage of unbiased effects and their uncertainty were reported in the literature.
Safety belt and motorcycle helmet use in Virginia : the December 2003 update.
DOT National Transportation Integrated Search
2004-01-01
The Virginia Transportation Research Council has been collecting safety belt use data in Virginia since 1974. Beginning in 1992, the data gathering methodology was changed to a statistically valid probability-based sampling plan in accordance with fe...
A content validated questionnaire for assessment of self reported venous blood sampling practices
2012-01-01
Background Venous blood sampling is a common procedure in health care. It is strictly regulated by national and international guidelines. Deviations from guidelines due to human mistakes can cause patient harm. Validated questionnaires for health care personnel can be used to assess preventable "near misses"--i.e. potential errors and nonconformities during venous blood sampling practices that could transform into adverse events. However, no validated questionnaire that assesses nonconformities in venous blood sampling has previously been presented. The aim was to test a recently developed questionnaire on self reported venous blood sampling practices for validity and reliability. Findings We developed a questionnaire to assess deviations from best practices during venous blood sampling. The questionnaire contained questions about patient identification, test request management, test tube labeling, test tube handling, information search procedures and frequencies of error reporting. For content validity, the questionnaire was confirmed by experts on questionnaires and venous blood sampling. For reliability, test-retest statistics were used on the questionnaire answered twice. The final venous blood sampling questionnaire included 19 questions, out of which 9 had in total 34 underlying items. It was found to have content validity. The test-retest analysis demonstrated that the items were generally stable. In total, 82% of the items fulfilled the reliability acceptance criteria. Conclusions The questionnaire could be used for assessment of "near miss" practices that could jeopardize patient safety, and offers several benefits over assessing only rare adverse events. The higher frequencies of "near miss" practices allow quantitative analysis of the effect of corrective interventions and benchmarking of preanalytical quality not only at the laboratory/hospital level but also at the health care unit/hospital ward level. PMID:22260505
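Test-retest stability of a categorical questionnaire item is often summarized with chance-corrected agreement. The following is a hedged sketch using Cohen's kappa on invented toy answers; the paper does not state which stability statistic it applied to its items.

```python
def cohens_kappa(test, retest):
    """Chance-corrected agreement between two administrations of one
    categorical item: observed agreement minus the agreement expected
    from the marginal answer frequencies alone."""
    n = len(test)
    cats = set(test) | set(retest)
    p_obs = sum(a == b for a, b in zip(test, retest)) / n
    p_exp = sum((test.count(c) / n) * (retest.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

# invented toy data: 10 respondents answering the same item twice
kappa = cohens_kappa(["yes"] * 8 + ["no"] * 2,
                     ["yes"] * 7 + ["no"] * 3)
```

Raw percent agreement (here 90%) overstates stability for items with a dominant answer; the chance correction is what makes a fixed acceptance criterion, such as the paper's 82% pass rate, meaningful across items with different answer distributions.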
Adams, James; Kruger, Uwe; Geis, Elizabeth; Gehn, Eva; Fimbres, Valeria; Pollard, Elena; Mitchell, Jessica; Ingram, Julie; Hellmers, Robert; Quig, David; Hahn, Juergen
2017-01-01
Introduction A number of previous studies examined a possible association of toxic metals and autism, and over half of those studies suggest that toxic metal levels are different in individuals with Autism Spectrum Disorders (ASD). Additionally, several studies found that those levels correlate with the severity of ASD. Methods In order to further investigate these points, this paper performs the most detailed statistical analysis to date of a data set in this field. First morning urine samples were collected from 67 children and adults with ASD and 50 neurotypical controls of similar age and gender. The samples were analyzed to determine the levels of 10 urinary toxic metals (UTM). Autism-related symptoms were assessed with eleven behavioral measures. Statistical analysis was used to distinguish participants on the ASD spectrum and neurotypical participants based upon the UTM data alone. The analysis also included examining the association of autism severity with toxic metal excretion data using linear and nonlinear analysis. “Leave-one-out” cross-validation was used to ensure statistical independence of results. Results and Discussion Average excretion levels of several toxic metals (lead, tin, thallium, antimony) were significantly higher in the ASD group. However, ASD classification using univariate statistics proved difficult due to large variability, but nonlinear multivariate statistical analysis significantly improved ASD classification with Type I/II errors of 15% and 18%, respectively. These results clearly indicate that the urinary toxic metal excretion profiles of participants in the ASD group were significantly different from those of the neurotypical participants. Similarly, nonlinear methods determined a significantly stronger association between the behavioral measures and toxic metal excretion. 
The association was strongest for the Aberrant Behavior Checklist (including subscales on Irritability, Stereotypy, Hyperactivity, and Inappropriate Speech), but significant associations were found for UTM with all eleven autism-related assessments, with cross-validation R² values ranging from 0.12 to 0.48. PMID:28068407
Raman spectroscopy-based screening of IgM positive and negative sera for dengue virus infection
NASA Astrophysics Data System (ADS)
Bilal, M.; Saleem, M.; Bilal, Maria; Ijaz, T.; Khan, Saranjam; Ullah, Rahat; Raza, A.; Khurram, M.; Akram, W.; Ahmed, M.
2016-11-01
A statistical method based on Raman spectroscopy for the screening of immunoglobulin M (IgM) in dengue virus (DENV) infected human sera is presented. In total, 108 sera samples were collected and their antibody indexes (AI) for IgM were determined through enzyme-linked immunosorbent assay (ELISA). Raman spectra of these samples were acquired using a 785 nm wavelength excitation laser. Seventy-eight Raman spectra were selected randomly and without bias for the development of a statistical model using partial least square (PLS) regression, while the remaining 30 were used for testing the developed model. An R-squared (r²) value of 0.929 was determined using the leave-one-sample-out (LOO) cross validation method, showing the validity of this model. The model considers all molecular changes related to IgM concentration and describes their role in infection. A graphical user interface (GUI) platform has been developed to run the developed multivariate model for the prediction of the AI of IgM for blindly tested samples, and excellent agreement has been found between model-predicted and clinically determined values. Parameters like sensitivity, specificity, accuracy, and area under the receiver operator characteristic (ROC) curve for these tested samples are also reported to visualize model performance.
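The leave-one-sample-out validation scheme can be sketched as follows, with a deliberately simplified one-predictor least-squares calibration standing in for the paper's multivariate PLS model (a full spectrum would enter as many predictors); the data points are invented.

```python
def loo_r2(x, y):
    """Leave-one-sample-out cross-validated R² for a one-variable
    least-squares calibration. Each sample is predicted from a model
    fitted on the remaining n - 1 samples, so the R² reflects
    out-of-sample rather than in-sample fit."""
    n = len(x)
    preds = []
    for i in range(n):
        xs = [x[j] for j in range(n) if j != i]
        ys = [y[j] for j in range(n) if j != i]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxx = sum((u - mx) ** 2 for u in xs)
        sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
        b = sxy / sxx
        a = my - b * mx
        preds.append(a + b * x[i])  # predict the held-out sample
    mean_y = sum(y) / n
    ss_res = sum((p - v) ** 2 for p, v in zip(preds, y))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1 - ss_res / ss_tot

# invented toy calibration data (e.g. band intensity vs. antibody index)
r2 = loo_r2([1, 2, 3, 4, 5, 6], [1.1, 2.0, 2.9, 4.2, 5.1, 5.9])
```

Because each prediction is made with the sample excluded from fitting, a high LOO r² such as the paper's 0.929 indicates genuine predictive ability rather than overfitting to the 78 training spectra.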
A Science and Risk-Based Pragmatic Methodology for Blend and Content Uniformity Assessment.
Sayeed-Desta, Naheed; Pazhayattil, Ajay Babu; Collins, Jordan; Doshi, Chetan
2018-04-01
This paper describes a pragmatic approach that can be applied in assessing powder blend and unit dosage uniformity of solid dose products at the Process Design, Process Performance Qualification, and Continued/Ongoing Process Verification stages of the Process Validation lifecycle. The statistically based sampling, testing, and assessment plan was developed in response to the withdrawal of the FDA draft guidance for industry "Powder Blends and Finished Dosage Units-Stratified In-Process Dosage Unit Sampling and Assessment." This paper compares the proposed Grouped Area Variance Estimate (GAVE) method with an alternate approach, outlining its practicality and statistical rationale relative to traditional sampling and analytical methods. The approach is designed to fit solid dose processes, assuring high statistical confidence in both powder blend uniformity and dosage unit uniformity during all three stages of the lifecycle, in compliance with ASTM standards as recommended by the US FDA.
Zhang, Xin; Wu, Yuxia; Ren, Pengwei; Liu, Xueting; Kang, Deying
2015-10-30
To explore the relationship between the external validity and the internal validity of hypertension RCTs conducted in China, comprehensive literature searches were performed in Medline, Embase, the Cochrane Central Register of Controlled Trials (CCTR), CBMdisc (Chinese biomedical literature database), CNKI (China National Knowledge Infrastructure/China Academic Journals Full-text Database) and VIP (Chinese scientific journals database), and advanced search strategies were used to locate hypertension RCTs. The risk of bias in RCTs was assessed with a modified Jadad scale, and studies scoring 3 or more were included for the evaluation of external validity. A data extraction form comprising 4 domains and 25 items was used to explore the relationship between external and internal validity. Statistical analyses were performed using SPSS software, version 21.0 (SPSS, Chicago, IL). 226 hypertension RCTs were included in the final analysis. RCTs conducted in university-affiliated hospitals (P < 0.001) or secondary/tertiary hospitals (P < 0.001) scored higher on internal validity. Multi-center studies (median = 4.0, IQR = 2.0) had higher internal validity scores than single-center studies (median = 3.0, IQR = 1.0) (P < 0.001). Funding-supported trials had better methodological quality (P < 0.001). In addition, the reporting of inclusion criteria was also associated with better internal validity (P = 0.004). Multivariate regression indicated that sample size, industry funding, use of quality of life (QOL) as an outcome measure, and a university-affiliated hospital as the trial setting were statistically significant (P < 0.001, P < 0.001, P = 0.001, P = 0.006, respectively). Several components related to the external validity of RCTs are associated with internal validity, although the two do not stand in a simple relationship to each other.
Given the generally poor reporting, other possible links between the two should be traced in future methodological research.
Précis of statistical significance: rationale, validity, and utility.
Chow, S L
1998-04-01
The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. 
At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.
Validation of a novel saliva-based ELISA test for diagnosing tapeworm burden in horses.
Lightbody, Kirsty L; Davis, Paul J; Austin, Corrine J
2016-06-01
Tapeworm infections pose a significant threat to equine health as they are associated with clinical cases of colic. Diagnosis of tapeworm burden using fecal egg counts (FECs) is unreliable, and, although a commercial serologic ELISA for anti-tapeworm antibodies is available, it requires a veterinarian to collect the blood sample. A reliable diagnostic test using an owner-accessible sample such as saliva could provide a cost-effective alternative for tapeworm testing in horses, and allow targeted deworming strategies. The purpose of the study was to statistically validate a saliva tapeworm ELISA test and compare it to a tapeworm-specific IgG(T) serologic ELISA. Serum samples (139) and matched saliva samples (104) were collected from horses at a UK abattoir. The ileocecal junction and cecum were visually examined for tapeworms and any present were counted. Samples were analyzed using the serologic ELISA and the saliva tapeworm test, and the test results were compared to tapeworm numbers and statistically analyzed. Saliva scores had strong positive correlations with both infection intensity and serologic results (Spearman's rank coefficients of 0.74 and 0.86, respectively). The saliva tapeworm test identified the presence of one or more tapeworms with 83% sensitivity and 85% specificity, similar to the serologic ELISA (85% and 78%, respectively). Importantly, no high-burden (more than 20 tapeworms) horses were misdiagnosed. © 2016 American Society for Veterinary Clinical Pathology.
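The sensitivity and specificity figures above come from a standard two-by-two comparison of test outcomes against the post-mortem counts. A minimal sketch; the counts in the usage example are hypothetical, chosen only to reproduce 83%/85%, not taken from the study:

```python
def sensitivity_specificity(results):
    """Sensitivity and specificity from (test_positive, truly_infected) pairs."""
    tp = sum(t and d for t, d in results)          # infected, test positive
    fn = sum(not t and d for t, d in results)      # infected, test negative
    tn = sum(not t and not d for t, d in results)  # clear, test negative
    fp = sum(t and not d for t, d in results)      # clear, test positive
    return tp / (tp + fn), tn / (tn + fp)
```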
Validation of verbal autopsy: determination of cause of deaths in Malaysia 2013.
Ganapathy, Shubash Shander; Yi Yi, Khoo; Omar, Mohd Azahadi; Anuar, Mohamad Fuad Mohamad; Jeevananthan, Chandrika; Rao, Chalapati
2017-08-11
Mortality statistics by age, sex and cause are the foundation of the basic health data required for health status assessment, epidemiological research and the formulation of health policy. Close to half the deaths in Malaysia occur outside a health facility, are not attended by medical personnel, and are given a lay opinion as to the cause of death, leading to poor quality of data from vital registration. Verbal autopsy (VA) is a very useful tool for diagnosing broad causes of death for events that occur outside health facilities. This article reports the development of the VA methods and our principal findings from a validation study. A cross-sectional study of a nationally representative sample of deaths that occurred in Malaysia during 2013 was used. A VA questionnaire suitable for local use was developed. Trained field interviewers visited the family members of the deceased at their homes and conducted face-to-face interviews with the next of kin. Completed questionnaires were reviewed by trained physicians who assigned multiple and underlying causes. Reference diagnoses for validation were obtained from a review of the medical records (MR) available for a sample of the overall study deaths. Corresponding MR diagnoses matched to VA diagnoses were available for 2172 cases in the validation study. Sensitivity scores were good (>75%) for transport accidents and certain cancers. Moderate sensitivity (50%-75%) was obtained for ischaemic heart disease (64%) and cerebrovascular disease (72%). The validation sample showed low cause-specific mortality fraction (CSMF) changes for deaths due to major causes such as ischaemic heart disease, pneumonia, breast cancer and transport accidents. The scores obtained for the top 10 leading site-specific cancers ranged from average to good. We conclude that VA is suitable for implementation for deaths outside health facilities in Malaysia.
This would reduce ill-defined mortality causes in vital registration data, and yield more accurate national mortality statistics.
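The per-cause sensitivity and CSMF-change figures reported above follow from standard definitions over (VA, MR) diagnosis pairs. The toy data in the usage example are constructed to echo the 64% ischaemic heart disease sensitivity, not drawn from the study:

```python
def va_metrics(pairs, cause):
    """Sensitivity and CSMF change for one cause, given (va, mr) diagnosis pairs.

    Sensitivity: share of medical-record (MR) cases of `cause` that verbal
    autopsy (VA) also assigned to `cause`. CSMF change: VA minus MR
    cause-specific mortality fraction."""
    tp = sum(v == cause and m == cause for v, m in pairs)
    mr_n = sum(m == cause for _, m in pairs)
    va_n = sum(v == cause for v, _ in pairs)
    n = len(pairs)
    return tp / mr_n, va_n / n - mr_n / n
```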
Validating a biometric authentication system: sample size requirements.
Dass, Sarat C; Zhu, Yongfang; Jain, Anil K
2006-12-01
Authentication systems based on biometric features (e.g., fingerprint impressions, iris scans, human face images, etc.) are increasingly gaining widespread use and popularity. Often, vendors and owners of these commercial biometric systems claim impressive performance that is estimated based on some proprietary data. In such situations, there is a need to independently validate the claimed performance levels. System performance is typically evaluated by collecting biometric templates from n different subjects, and for convenience, acquiring multiple instances of the biometric for each of the n subjects. Very little work has been done in 1) constructing confidence regions based on the ROC curve for validating the claimed performance levels and 2) determining the required number of biometric samples needed to establish confidence regions of prespecified width for the ROC curve. To simplify the analyses that address these two problems, several previous studies have assumed that multiple acquisitions of the biometric entity are statistically independent. This assumption is too restrictive and is generally not valid. We have developed a validation technique based on multivariate copula models for correlated biometric acquisitions. Based on the same model, we also determine the minimum number of samples required to achieve confidence bands of desired width for the ROC curve. We illustrate the estimation of the confidence bands as well as the required number of biometric samples using a fingerprint matching system that is applied on samples collected from a small population.
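The copula-based confidence bands are beyond a short sketch, but a naive percentile bootstrap for the ROC area illustrates the kind of interval being validated. Note that resampling scores independently is precisely the restrictive assumption the paper is designed to relax, so this is a contrast case, not the authors' method:

```python
import random

def auc(genuine, impostor):
    """Empirical ROC area: P(genuine score > impostor score), ties count half."""
    wins = sum((g > i) + 0.5 * (g == i) for g in genuine for i in impostor)
    return wins / (len(genuine) * len(impostor))

def bootstrap_auc_ci(genuine, impostor, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC, assuming
    statistically independent acquisitions (too restrictive for real
    multi-acquisition biometric data, per the paper)."""
    rng = random.Random(seed)
    stats = sorted(
        auc([rng.choice(genuine) for _ in genuine],
            [rng.choice(impostor) for _ in impostor])
        for _ in range(n_boot))
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]
```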
40 CFR 403.7 - Removal credits.
Code of Federal Regulations, 2011 CFR
2011-07-01
... representative of the actual operation of the POTW Treatment Plant, an alternative sampling schedule will be... statistically valid description of daily, weekly and seasonal sewage treatment plant loadings and performance... the intentional or unintentional diversion of flow from the POTW before the POTW Treatment Plant...
ERIC Educational Resources Information Center
Mattern, Krista D.; Patterson, Brian F.
2011-01-01
This report presents the findings from a replication of the analyses from the report, "Is Performance on the SAT Related to College Retention?" (Mattern & Patterson, 2009). The tables presented herein are based on the 2007 sample and the findings are largely the same as those presented in the original report, and show SAT scores are…
Pamies-Aubalat, Lidia; Quiles-Marcos, Yolanda; Núñez-Núñez, Rosa M
2013-12-01
This study examined the Dieting Peer Competitiveness Scale, an instrument for evaluating dieting-related social comparison in young people. This instrumental study had two aims. The first was to present preliminary psychometric data for the Spanish version of the Dieting Peer Competitiveness Scale, including statistical item analysis, investigation of the instrument's internal structure, and a reliability analysis, based on a sample of 1067 secondary school adolescents. The second was to conduct a confirmatory factor analysis of the scale's internal structure, as well as an analysis of evidence of validity, based on a sample of 1075 adolescents.
Miller, Joshua D; Few, Lauren R; Wilson, Lauren; Gentile, Brittany; Widiger, Thomas A; Mackillop, James; Keith Campbell, W
2013-09-01
The five-factor narcissism inventory (FFNI) is a new self-report measure that was developed to assess traits associated with narcissistic personality disorder (NPD), as well as grandiose and vulnerable narcissism from a five-factor model (FFM) perspective. In the current study, the FFNI was examined in relation to Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV; American Psychiatric Association, 2000) NPD, DSM-5 (http://www.dsm5.org) NPD traits, grandiose narcissism, and vulnerable narcissism in both community (N = 287) and clinical samples (N = 98). Across the samples, the FFNI scales manifested good convergent and discriminant validity such that FFNI scales derived from FFM neuroticism were primarily related to vulnerable narcissism scores, scales derived from FFM extraversion were primarily related to grandiose scores, and FFNI scales derived from FFM agreeableness were related to both narcissism dimensions, as well as the DSM-IV and DSM-5 NPD scores. The FFNI grandiose and vulnerable narcissism composites also demonstrated incremental validity in the statistical prediction of these scores, above and beyond existing measures of DSM NPD, grandiose narcissism, and vulnerable narcissism, respectively. The FFNI is a promising measure that provides a comprehensive assessment of narcissistic pathology while maintaining ties to the significant general personality literature on the FFM.
Sample size determination for disease prevalence studies with partially validated data.
Qiu, Shi-Fang; Poon, Wai-Yin; Tang, Man-Lai
2016-02-01
Disease prevalence is an important topic in medical research, and its study is based on data that are obtained by classifying subjects according to whether a disease has been contracted. Classification can be conducted with high-cost gold standard tests or low-cost screening tests, but the latter are subject to the misclassification of subjects. As a compromise between the two, many research studies use partially validated datasets in which all data points are classified by fallible tests, and some of the data points are validated in the sense that they are also classified by the completely accurate gold-standard test. In this article, we investigate the determination of sample sizes for disease prevalence studies with partially validated data. We use two approaches. The first is to find sample sizes that can achieve a pre-specified power of a statistical test at a chosen significance level, and the second is to find sample sizes that can control the width of a confidence interval with a pre-specified confidence level. Empirical studies have been conducted to demonstrate the performance of various testing procedures with the proposed sample sizes. The applicability of the proposed methods are illustrated by a real-data example. © The Author(s) 2012.
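In the special case where every subject gets the gold-standard test, the paper's second approach (controlling confidence-interval width) reduces to a textbook calculation; the partially validated case it actually solves is harder and is not attempted in this sketch:

```python
import math
from statistics import NormalDist

def n_for_ci_width(p, width, conf=0.95):
    """Smallest n for which the normal-approximation confidence interval
    for a prevalence near p has total width at most `width` (gold-standard
    classification only, no misclassification)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # two-sided critical value
    # width = 2 * z * sqrt(p(1-p)/n)  =>  n >= 4 z^2 p(1-p) / width^2
    return math.ceil(4 * z * z * p * (1 - p) / width ** 2)
```

At p = 0.5 (the worst case) a 95% interval of total width 0.1 needs 385 subjects; a rarer disease at p = 0.1 needs 139.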
Bastiaens, Tim; Claes, Laurence; Smits, Dirk; De Clercq, Barbara; De Fruyt, Filip; Rossi, Gina; Vanwalleghem, Dominique; Vermote, Rudi; Lowyck, Benedicte; Claes, Stephan; De Hert, Marc
2016-02-01
The factor structure and the convergent validity of the Personality Inventory for DSM-5 (PID-5), a self-report questionnaire designed to measure personality pathology as advocated in Section III of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), have already been demonstrated in general population samples, but need replication in clinical samples. In 240 Flemish inpatients, we examined the factor structure of the PID-5 by means of exploratory structural equation modeling. Additionally, we investigated differences in PID-5 higher order domain scores according to gender, age and educational level, and explored convergent and discriminant validity by relating the PID-5 with the Dimensional Assessment of Personality Pathology-Basic Questionnaire and by comparing PID-5 scores of inpatients with and without a DSM-IV categorical personality disorder diagnosis. Our results confirmed the original five-factor structure of the PID-5. The reliability and the convergent and discriminant validity of the PID-5 proved to be adequate. Implications for future research are discussed. © The Author(s) 2015.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-09
... using a statistically valid sampling method to poll the industry. Section 771(4)(A) of the Act defines... subheading 2903.39.2020. Although the HTSUS subheading and CAS registry number are provided for convenience...
Tipton, John; Hooten, Mevin B.; Goring, Simon
2017-01-01
Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. 
The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.
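Classical principal component regression, the baseline the authors extend, fits in a few lines; none of their Bayesian extensions (probabilistic components, outlier robustness, regularized truncation) are attempted in this sketch:

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_new, k=2):
    """Principal component regression: project predictors onto the top-k
    principal components of X_train, regress y on those scores with an
    intercept, then predict at X_new."""
    mu = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    W = Vt[:k].T                                   # loadings of the top-k PCs
    Z = (X_train - mu) @ W                         # training-set scores
    coef, *_ = np.linalg.lstsq(np.c_[Z, np.ones(len(Z))], y_train, rcond=None)
    Z_new = (X_new - mu) @ W
    return np.c_[Z_new, np.ones(len(Z_new))] @ coef
```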
Eisenberg, Dan T A; Kuzawa, Christopher W; Hayes, M Geoffrey
2015-01-01
Telomere length (TL) is commonly measured using quantitative PCR (qPCR). Although easier than the Southern blot of terminal restriction fragments (TRF) TL measurement method, one drawback of qPCR is that it introduces greater measurement error and thus reduces the statistical power of analyses. To address a potential source of measurement error, we consider the effect of well position on qPCR TL measurements. qPCR TL data from 3,638 people run on a Bio-Rad iCycler iQ are reanalyzed here. To evaluate measurement validity, correspondence with TRF, age, and between mother and offspring are examined. First, we present evidence for systematic variation in qPCR TL measurements in relation to thermocycler well position. Controlling for these well-position effects consistently improves measurement validity and yields estimated improvements in statistical power equivalent to increasing sample sizes by 16%. We additionally evaluated the linearity of the relationships between telomere and single-copy gene control amplicons and between qPCR and TRF measures. We find that, unlike some previous reports, our data exhibit linear relationships. We introduce the standard error in percent, a superior method for quantifying measurement error as compared to the commonly used coefficient of variation. Using this measure, we find that excluding samples with high measurement error does not improve measurement validity in our study. Future studies using block-based thermocyclers should consider well-position effects. Since additional information can be gleaned from well-position corrections, rerunning analyses of previous results with well-position correction could serve as an independent test of the validity of these results. © 2015 Wiley Periodicals, Inc.
Bourke-Taylor, Helen; Lalor, Aislinn; Farnworth, Louise; Pallant, Julie F
2014-10-01
The Health Promoting Activities Scale (HPAS) measures the frequency with which mothers participate in self-selected leisure activities that promote health and wellbeing. The scale was originally validated on mothers of school-aged children with disabilities, and the current article extends this research using a comparative sample of mothers of typically developing school-aged children. Australian mothers (N = 263) completed a questionnaire containing the HPAS, a measure of depression, anxiety and stress (DASS-21) and questions concerning their weight, height, sleep quality and demographics. Statistical analysis assessed the underlying structure, internal consistency and construct validity of the HPAS. Inferential statistics were utilised to investigate the construct validity. Exploratory factor analysis supported the unidimensionality of the HPAS. It showed good internal consistency (Cronbach's alpha = 0.78). Significantly lower HPAS scores were recorded for women who were obese; had elevated levels of depression, anxiety and stress; had poor quality sleep or had heavy caring commitments. The mean HPAS score in this sample (M = 32.2) was significantly higher than that previously reported for mothers of children with a disability (M = 21.6; P < 0.001). Further psychometric evaluation continues to support the HPAS as a sound instrument that measures the frequency with which women participate in meaningful occupation that is associated with differences in mental health and wellbeing and other health indicators. © 2014 Occupational Therapy Australia.
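The Cronbach's alpha reported above is the standard internal-consistency statistic; a minimal implementation (the scores in the usage example are hypothetical, not the HPAS data):

```python
def cronbach_alpha(responses):
    """Cronbach's alpha from a list of per-respondent item-score lists."""
    k = len(responses[0])                     # number of items

    def var(xs):                              # sample variance (n - 1 divisor)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var(col) for col in zip(*responses)]
    total_var = var([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```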
Matulis, Simone; Loos, Laura; Langguth, Nadine; Schreiber, Franziska; Gutermann, Jana; Gawrilow, Caterina; Steil, Regina
2015-01-01
Background The Trauma Symptom Checklist for Children (TSC-C) is the most widely used self-report scale to assess trauma-related symptoms in children and adolescents on six clinical scales. The purpose of the present study was to develop a German version of the TSC-C and to investigate its psychometric properties, such as factor structure, reliability, and validity, in a sample of German adolescents. Method A normative sample of N=583 and a clinical sample of N=41 adolescents with a history of physical or sexual abuse aged between 13 and 21 years participated in the study. Results The Confirmatory Factor Analysis on the six-factor model (anger, anxiety, depression, dissociation, posttraumatic stress, and sexual concerns with the subdimensions preoccupation and distress) revealed acceptable to good fit statistics in the normative sample. One item had to be excluded from the German version of the TSC-C because the factor loading was too low. All clinical scales presented acceptable to good reliability, with Cronbach's α's ranging from .80 to .86 in the normative sample and from .72 to .87 in the clinical sample. Concurrent validity was also demonstrated by the high correlations between the TSC-C scales and instruments measuring similar psychopathology. TSC-C scores reliably differentiated between adolescents with trauma history and those without trauma history, indicating discriminative validity. Conclusions In conclusion, the German version of the TSC-C is a reliable and valid instrument for assessing trauma-related symptoms on six different scales in adolescents aged between 13 and 21 years. PMID:26498182
Olives, Casey; Valadez, Joseph J; Brooker, Simon J; Pagano, Marcello
2012-01-01
Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n=15 and n=25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n=15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalance of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools.
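The operating-characteristic curves underpinning LQAS come from the binomial distribution. The sketch below is the classic binary version, for intuition only; the paper's MC-LQAS extends this to three prevalence classes with (semi-)curtailed sampling:

```python
from math import comb

def oc_curve_point(n, d, p):
    """P(classify as 'low prevalence') for a binary LQAS rule: sample n
    children and accept if at most d are positive, given true prevalence p.
    Plotting this over p traces the rule's operating characteristic curve."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(d + 1))
```

With n = 15 and decision value d = 3 (values chosen for illustration), a school at 5% prevalence is almost always accepted while one at 50% almost never is, which is why such small samples can classify reliably.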
Pechorro, Pedro; Barroso, Ricardo; Maroco, João; Vieira, Rui Xavier; Gonçalves, Rui Abrunhosa
2015-11-01
The main aim of the present study was to examine some psychometric properties of the Psychopathy Checklist: Youth Version (PCL:YV) among Portuguese juvenile delinquents. In a forensic sample of 192 incarcerated male participants, the Portuguese version of the PCL:YV demonstrated promising psychometric properties in terms of the three-factor model of youth psychopathy, internal consistency, convergent validity, concurrent validity, and retrospective validity, generally justifying its use among Portuguese youths. Statistically significant associations were found with age of criminal onset, frequency of crimes, number of victims, and use of physical violence. © The Author(s) 2014.
NASA Astrophysics Data System (ADS)
Fisher, B. L.; Wolff, D. B.; Silberstein, D. S.; Marks, D. M.; Pippitt, J. L.
2007-12-01
The Tropical Rainfall Measuring Mission's (TRMM) Ground Validation (GV) Program was originally established with the principal long-term goal of determining the random errors and systematic biases stemming from the application of the TRMM rainfall algorithms. The GV Program has been structured around two validation strategies: 1) determining the quantitative accuracy of the integrated monthly rainfall products at GV regional sites over large areas of about 500 km2 using integrated ground measurements and 2) evaluating the instantaneous satellite and GV rain rate statistics at spatio-temporal scales compatible with the satellite sensor resolution (Simpson et al. 1988, Thiele 1988). The GV Program has continued to evolve since the launch of the TRMM satellite on November 27, 1997. This presentation will discuss current GV methods of validating TRMM operational rain products in conjunction with ongoing research. The challenge facing TRMM GV has been how to best utilize rain information from the GV system to infer the random and systematic error characteristics of the satellite rain estimates. A fundamental problem of validating space-borne rain estimates is that the true mean areal rainfall is an ideal, scale-dependent parameter that cannot be directly measured. Empirical validation uses ground-based rain estimates to determine the error characteristics of the satellite-inferred rain estimates, but ground estimates also incur measurement errors and contribute to the error covariance. Furthermore, sampling errors, associated with the discrete, discontinuous temporal sampling by the rain sensors aboard the TRMM satellite, become statistically entangled in the monthly estimates. Sampling errors complicate the task of linking biases in the rain retrievals to the physics of the satellite algorithms. The TRMM Satellite Validation Office (TSVO) has made key progress towards effective satellite validation. 
For disentangling the sampling and retrieval errors, TSVO has developed and applied a methodology that statistically separates the two error sources. Using TRMM monthly estimates and high-resolution radar and gauge data, this method has been used to estimate sampling and retrieval error budgets over GV sites. More recently, a multi-year data set of instantaneous rain rates from the TRMM microwave imager (TMI), the precipitation radar (PR), and the combined algorithm was spatio-temporally matched and inter-compared to GV radar rain rates collected during satellite overpasses of select GV sites at the scale of the TMI footprint. The analysis provided a more direct probe of the satellite rain algorithms using ground data as an empirical reference. TSVO has also made significant advances in radar quality control through the development of the Relative Calibration Adjustment (RCA) technique. The RCA is currently being used to provide a long-term record of radar calibration for the radar at Kwajalein, a strategically important GV site in the tropical Pacific. The RCA technique has revealed previously undetected alterations in the radar sensitivity due to engineering changes (e.g., system modifications, antenna offsets, alterations of the receiver, or the data processor), making possible the correction of the radar rainfall measurements and ensuring the integrity of nearly a decade of TRMM GV observations and resources.
Houssaini, Allal; Assoumou, Lambert; Miller, Veronica; Calvez, Vincent; Marcelin, Anne-Geneviève; Flandre, Philippe
2013-01-01
Background Several attempts have been made to determine HIV-1 resistance from genotype resistance testing. We compare scoring methods for building weighted genotyping scores and commonly used systems to determine whether the virus of an HIV-infected patient is resistant. Methods and Principal Findings Three statistical methods (linear discriminant analysis, support vector machine and logistic regression) are used to determine the weight of mutations involved in HIV resistance. We compared these weighted scores with known interpretation systems (ANRS, REGA and Stanford HIV-db) to classify patients as resistant or not. Our methodology is illustrated on the Forum for Collaborative HIV Research didanosine database (N = 1453). The database was divided into four samples according to the country of enrolment (France, USA/Canada, Italy and Spain/UK/Switzerland). The total sample and the four country-based samples allow external validation (one sample is used to estimate a score and the other samples are used to validate it). We used the observed precision to compare the performance of newly derived scores with other interpretation systems. Our results show that the newly derived scores performed better than or similarly to existing interpretation systems, even with external validation sets. No difference was found between the three methods investigated. Our analysis identified four new mutations associated with didanosine resistance: D123S, Q207K, H208Y and K223Q. Conclusions We explored the potential of three statistical methods to construct weighted scores for didanosine resistance. Our proposed scores performed at least as well as already existing interpretation systems, and previously unrecognized didanosine-resistance associated mutations were identified. This approach could be used for building scores of genotypic resistance to other antiretroviral drugs. PMID:23555613
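The weighted-score idea described above can be sketched in a few lines. This is a minimal illustration only: the mutation weights and intercept below are invented for the example, not the coefficients fitted in the study, and a real score would be estimated from genotype-outcome data with one of the three methods named.

```python
import math

# Hypothetical weights for the didanosine-resistance mutations named in the
# abstract (illustrative values, NOT the coefficients estimated in the study).
WEIGHTS = {"D123S": 0.9, "Q207K": 0.7, "H208Y": 1.1, "K223Q": 0.5}
INTERCEPT = -1.2  # hypothetical baseline log-odds

def resistance_probability(mutations):
    """Logistic model: P(resistant) = 1 / (1 + exp(-(b0 + sum of mutation weights)))."""
    score = INTERCEPT + sum(WEIGHTS.get(m, 0.0) for m in mutations)
    return 1.0 / (1.0 + math.exp(-score))

def classify(mutations, threshold=0.5):
    """Dichotomize the weighted score, as rule-based interpretation systems do."""
    return "resistant" if resistance_probability(mutations) >= threshold else "susceptible"
```

In the study itself the weights come from linear discriminant analysis, a support vector machine, or logistic regression; the dichotomized output is what gets compared against ANRS, REGA and Stanford HIV-db calls.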
ERIC Educational Resources Information Center
Jandaghi, Gholamreza
2010-01-01
The purpose of the research is to determine high school teachers' skill rate in designing exam questions in the subject of physics. The statistical population comprised all physics exam sheets for two semesters in one school year, from which a sample of 364 exam sheets was drawn using multistage cluster sampling. Two experts assessed the sheets and by using…
21 CFR 314.50 - Content and format of an application.
Code of Federal Regulations, 2013 CFR
2013-04-01
... the protocol and a description of the statistical analyses used to evaluate the study. If the study... application: (i) Three copies of the analytical procedures and related descriptive information contained in... the samples and to validate the applicant's analytical procedures. The related descriptive information...
21 CFR 314.50 - Content and format of an application.
Code of Federal Regulations, 2012 CFR
2012-04-01
... the protocol and a description of the statistical analyses used to evaluate the study. If the study... application: (i) Three copies of the analytical procedures and related descriptive information contained in... the samples and to validate the applicant's analytical procedures. The related descriptive information...
21 CFR 314.50 - Content and format of an application.
Code of Federal Regulations, 2014 CFR
2014-04-01
... the protocol and a description of the statistical analyses used to evaluate the study. If the study... application: (i) Three copies of the analytical procedures and related descriptive information contained in... the samples and to validate the applicant's analytical procedures. The related descriptive information...
21 CFR 314.50 - Content and format of an application.
Code of Federal Regulations, 2011 CFR
2011-04-01
... the protocol and a description of the statistical analyses used to evaluate the study. If the study... application: (i) Three copies of the analytical procedures and related descriptive information contained in... the samples and to validate the applicant's analytical procedures. The related descriptive information...
21 CFR 314.50 - Content and format of an application.
Code of Federal Regulations, 2010 CFR
2010-04-01
... the protocol and a description of the statistical analyses used to evaluate the study. If the study... application: (i) Three copies of the analytical procedures and related descriptive information contained in... the samples and to validate the applicant's analytical procedures. The related descriptive information...
29 CFR Section 1607.16 - Definitions.
Code of Federal Regulations, 2010 CFR
2010-07-01
... action are open to users. T. Skill. A present, observable competence to perform a learned psychomoter act... criterion-related validity studies. These conditions include: (1) An adequate sample of persons available for the study to achieve findings of statistical significance; (2) having or being able to obtain a...
Spectral signature verification using statistical analysis and text mining
NASA Astrophysics Data System (ADS)
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment: the textual meta-data and the numerical spectral data. Results associated with the spectral data stored in the Signature Database1 (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to provide local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security2 (text encryption/decryption), biomedical3, and marketing4 applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high- and low-quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user.
This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is present for comparison. The spectral validation method proposed is described from a practical application and analytical perspective.
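The spectral angle mapper (SAM) comparison at the core of the numerical validation step is simple to state: it is the angle between a test spectrum and the population-mean spectrum, each treated as a vector. A minimal sketch (the inputs are toy spectra, not SigDB data):

```python
import math

def spectral_angle(test, reference):
    """Spectral angle mapper: the angle (radians) between two spectra viewed as
    vectors. Insensitive to overall amplitude scaling; smaller is more similar."""
    dot = sum(t * r for t, r in zip(test, reference))
    norm_t = math.sqrt(sum(t * t for t in test))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # clamp to guard against floating-point values just outside [-1, 1]
    return math.acos(max(-1.0, min(1.0, dot / (norm_t * norm_r))))
```

A test spectrum would then be ranked by its angle to the mean spectrum of the ideal population set; a brighter but identically shaped spectrum scores an angle of zero.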
Just add water: Accuracy of analysis of diluted human milk samples using mid-infrared spectroscopy.
Smith, R W; Adamkin, D H; Farris, A; Radmacher, P G
2017-01-01
To determine the maximum dilution of human milk (HM) that yields reliable results for protein, fat and lactose when analyzed by mid-infrared spectroscopy. De-identified samples of frozen HM were obtained. Milk was thawed and warmed (40°C) prior to analysis. Undiluted (native) HM was analyzed by mid-infrared spectroscopy for macronutrient composition: total protein (P), fat (F), carbohydrate (C); Energy (E) was calculated from the macronutrient results. Subsequent analyses were done with 1 : 2, 1 : 3, 1 : 5 and 1 : 10 dilutions of each sample with distilled water. Additional samples were sent to a certified lab for external validation. Quantitatively, F and P showed statistically significant but clinically non-critical differences in 1 : 2 and 1 : 3 dilutions. Differences at higher dilutions were statistically significant and deviated from native values enough to render those dilutions unreliable. External validation studies also showed statistically significant but clinically unimportant differences at 1 : 2 and 1 : 3 dilutions. The Calais Human Milk Analyzer can be used with HM samples diluted 1 : 2 and 1 : 3 and return results within 5% of values from undiluted HM. At a 1 : 5 or 1 : 10 dilution, however, results vary as much as 10%, especially with P and F. At the 1 : 2 and 1 : 3 dilutions these differences appear to be insignificant in the context of nutritional management. However, the accuracy and reliability of the 1 : 5 and 1 : 10 dilutions are questionable.
Pat, Lucio; Ali, Bassam; Guerrero, Armando; Córdova, Atl V.; Garduza, José P.
2016-01-01
Attenuated total reflectance-Fourier transform infrared spectrometry combined with a chemometrics model was used for the determination of physicochemical properties (pH, redox potential, free acidity, electrical conductivity, moisture, total soluble solids (TSS), ash, and HMF) in honey samples. The reference values of 189 honey samples of different botanical origin were determined using the methods of the Association of Official Analytical Chemists (AOAC, 1990), Codex Alimentarius (2001), and the International Honey Commission (2002). Multivariate calibration models were built using partial least squares (PLS) for the measurands studied. The developed models were validated using cross-validation and external validation; several statistical parameters were obtained to determine the robustness of the calibration models: the optimum number of principal components (PCs), the standard error of cross-validation (SECV), the coefficient of determination of cross-validation (R²cal), the standard error of validation (SEP), the coefficient of determination for external validation (R²val), and the coefficient of variation (CV). The prediction accuracy for pH, redox potential, electrical conductivity, moisture, TSS, and ash was good, while for free acidity and HMF it was poor. The results demonstrate that attenuated total reflectance-Fourier transform infrared spectrometry is a valuable, rapid, and nondestructive tool for the quantification of physicochemical properties of honey. PMID:28070445
Community-Based Validation of the Social Phobia Screener (SOPHS).
Batterham, Philip J; Mackinnon, Andrew J; Christensen, Helen
2017-10-01
There is a need for brief, accurate screening scales for social anxiety disorder to enable better identification of the disorder in research and clinical settings. A five-item social anxiety screener, the Social Phobia Screener (SOPHS), was developed to address this need. The screener was validated in two samples: (a) 12,292 Australian young adults screened for a clinical trial, including 1,687 participants who completed a phone-based clinical interview and (b) 4,214 population-based Australian adults recruited online. The SOPHS (78% sensitivity, 72% specificity) was found to have comparable screening performance to the Social Phobia Inventory (77% sensitivity, 71% specificity) and Mini-Social Phobia Inventory (74% sensitivity, 73% specificity) relative to clinical criteria in the trial sample. In the population-based sample, the SOPHS was also accurate (95% sensitivity, 73% specificity) in identifying Diagnostic and Statistical Manual of Mental Disorders-Fifth edition social anxiety disorder. The SOPHS is a valid and reliable screener for social anxiety that is freely available for use in research and clinical settings.
Ielpo, Pierina; Leardi, Riccardo; Pappagallo, Giuseppe; Uricchio, Vito Felice
2017-06-01
In this paper, the results obtained from multivariate statistical techniques such as PCA (principal component analysis) and LDA (linear discriminant analysis) applied to a wide soil data set are presented. The results have been compared with those obtained on a groundwater data set, whose samples were collected together with the soil ones, within the project "Improvement of the Regional Agro-meteorological Monitoring Network (2004-2007)". LDA, applied to the soil data, made it possible to assign the geographical origin of each sample to one of two macroareas (Bari and Foggia provinces vs Brindisi, Lecce and Taranto provinces), with a percentage of correct predictions in cross-validation of 87%. In the case of the groundwater data set, the best classification was obtained when the samples were grouped into three macroareas (Foggia province, Bari province, and Brindisi, Lecce and Taranto provinces), reaching a percentage of correct predictions in cross-validation of 84%. The obtained information can be very useful in supporting soil and water resource management, for instance in reducing water consumption and reducing energy and chemical (nutrient and pesticide) inputs in agriculture.
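The percentage-of-correct-predictions-in-cross-validation figure reported above can be reproduced with any classifier. The sketch below uses leave-one-out cross-validation with a nearest-centroid classifier as a simplified stand-in for LDA (it ignores within-class covariance); the data in the test are toy values, not the soil measurements.

```python
def nearest_centroid_loocv(X, y):
    """Leave-one-out cross-validation accuracy for a nearest-centroid
    classifier (a simplified stand-in for LDA: class covariance is ignored)."""
    correct = 0
    for i in range(len(X)):
        # hold out sample i, fit centroids on the rest
        train = [(x, c) for j, (x, c) in enumerate(zip(X, y)) if j != i]
        centroids = {}
        for c in {c for _, c in train}:
            pts = [x for x, lab in train if lab == c]
            centroids[c] = [sum(col) / len(pts) for col in zip(*pts)]
        # predict the held-out sample by squared Euclidean distance to centroids
        pred = min(centroids, key=lambda c: sum((a - b) ** 2
                                                for a, b in zip(X[i], centroids[c])))
        correct += (pred == y[i])
    return correct / len(X)
```

Reporting this accuracy over all held-out samples gives exactly the "percentage of correct predictions in cross-validation" quoted in the abstract.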
Psychometric evaluation of the Swedish version of Rosenberg's self-esteem scale.
Eklund, Mona; Bäckström, Martin; Hansson, Lars
2018-04-01
The widely used Rosenberg's self-esteem scale (RSES) has not been evaluated for psychometric properties in Sweden. This study aimed at analyzing its factor structure, internal consistency, criterion, convergent and discriminant validity, sensitivity to change, and whether a four-graded Likert-type response scale increased its reliability and validity compared to a yes/no response scale. People with mental illness participating in intervention studies to (1) promote everyday life balance (N = 223) or (2) remedy self-stigma (N = 103) were included. Both samples completed the RSES and questionnaires addressing quality of life and sociodemographic data. Sample 1 also completed instruments chosen to assess convergent and discriminant validity: self-mastery (convergent validity), level of functioning and occupational engagement (discriminant validity). Confirmatory factor analysis (CFA), structural equation modeling, and conventional inferential statistics were used. Based on both samples, the Swedish RSES formed one factor and exhibited high internal consistency (>0.90). The two response scales were equivalent. Criterion validity in relation to quality of life was demonstrated. RSES could distinguish between women and men (women scoring lower) and between diagnostic groups (people with depression scoring lower). Correlations >0.5 with variables chosen to reflect convergent validity and around 0.2 with variables used to address discriminant validity further highlighted the construct validity of RSES. The instrument also showed sensitivity to change. The Swedish RSES exhibited a one-component factor structure and showed good psychometric properties in terms of good internal consistency, criterion, convergent and discriminant validity, and sensitivity to change. The yes/no and the four-graded Likert-type response scales worked equivalently.
Classical Statistics and Statistical Learning in Imaging Neuroscience
Bzdok, Danilo
2017-01-01
Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using the t-test and ANOVA. In recent years, statistical learning methods have enjoyed increasing popularity, especially for applications to rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data-analysis scenarios in neuroimaging. It retraces how classical statistics and statistical learning originated in different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896
Parametric study of statistical bias in laser Doppler velocimetry
NASA Technical Reports Server (NTRS)
Gould, Richard D.; Stevenson, Warren H.; Thompson, H. Doyle
1989-01-01
Analytical studies have often assumed that LDV velocity bias depends on turbulence intensity in conjunction with one or more characteristic time scales, such as the time between validated signals, the time between data samples, and the integral turbulence time-scale. These parameters are presently varied independently, in an effort to quantify the biasing effect. Neither of the post facto correction methods employed is entirely accurate. The mean velocity bias error is found to be nearly independent of data validation rate.
Haugan, Gørill; Drageset, Jorunn
2014-08-01
Depression and anxiety are particularly common among individuals living in long-term care facilities. Therefore, access to a valid and reliable measure of anxiety and depression among nursing home patients is highly warranted. To investigate the dimensionality, reliability and construct validity of the Hospital Anxiety and Depression Scale (HADS) in a cognitively intact nursing home population. Cross-sectional data were collected from two samples; 429 cognitively intact nursing home patients participated, representing 74 different Norwegian nursing homes. Confirmatory factor analyses and correlations with selected constructs were used. The two-factor model provided a good fit in Sample 1 but a poorer fit in Sample 2. Good to acceptable measurement reliability was demonstrated, and construct validity was supported. Using listwise deletion, the sample sizes were 227 and 187 for Sample 1 and Sample 2, respectively. Greater sample sizes would have strengthened the statistical power of the tests. The researchers visited the participants to help fill in the questionnaires; this might have introduced some bias into the respondents' reporting. The 14 HADS items were part of larger questionnaires; thus, frail, older NH patients might have tired during the interview, causing a possible bias. Low reliability for depression was disclosed, mainly resulting from three items appearing to be inappropriate indicators for depression in this population. Further research is needed exploring which items might perform as more reliable indicators for depression among nursing home patients. Copyright © 2014 Elsevier B.V. All rights reserved.
Random sampling and validation of covariance matrices of resonance parameters
NASA Astrophysics Data System (ADS)
Plevnik, Lucijan; Zerovnik, Gašper
2017-09-01
Analytically exact methods for random sampling of arbitrarily correlated parameters are presented. Emphasis is placed, on the one hand, on possible inconsistencies in the covariance data, concentrating on positive semi-definiteness and consistent sampling of correlated inherently positive parameters, and, on the other hand, on optimization of the implementation of the methods themselves. The methods have been applied in the program ENDSAM, written in Fortran, which, starting from a nuclear data library file of a chosen isotope in ENDF-6 format, produces an arbitrary number of new ENDF-6 files containing random samples of resonance parameter values (in accordance with the corresponding covariance matrices) in place of the original values. The source code for the program ENDSAM is available from the OECD/NEA Data Bank. The program works in the following steps: it reads resonance parameters and their covariance data from the nuclear data library, checks whether the covariance data are consistent, and produces random samples of the resonance parameters. The code has been validated with both realistic and artificial data to show that the produced samples are statistically consistent. Additionally, the code was used to validate covariance data in existing nuclear data libraries. A list of inconsistencies observed in the covariance data of resonance parameters in ENDF-VII.1, JEFF-3.2 and JENDL-4.0 is presented. For now, the work has been limited to resonance parameters; however, the methods presented are general and can in principle be extended to the sampling and validation of any nuclear data.
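Consistent sampling from a covariance matrix of the kind ENDSAM checks can be sketched with a Cholesky factorization: the factorization succeeds only if the matrix is positive definite, and correlated Gaussian samples follow as mean + L·z with z a vector of independent standard normals. This is a generic Python illustration, not the Fortran implementation in ENDSAM.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular L with L·Lᵀ = cov.
    Raises ValueError if cov is not positive definite, which doubles as a
    basic consistency check on the covariance data."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = cov[i][i] - s
                if d <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def sample(mean, cov, rng=random):
    """One correlated Gaussian sample: mean + L·z, z ~ N(0, I)."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(len(mean)))
            for i, m in enumerate(mean)]
```

Inherently positive parameters (e.g. widths) would additionally need a transformed (e.g. lognormal) scheme, which is part of what the paper's "consistent sampling" discussion addresses.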
Turkish Version of Kolcaba's Immobilization Comfort Questionnaire: A Validity and Reliability Study.
Tosun, Betül; Aslan, Özlem; Tunay, Servet; Akyüz, Aygül; Özkan, Hüseyin; Bek, Doğan; Açıksöz, Semra
2015-12-01
The purpose of this study was to determine the validity and reliability of the Turkish version of the Immobilization Comfort Questionnaire (ICQ). The sample used in this methodological study consisted of 121 patients undergoing lower extremity arthroscopy in a training and research hospital. The validity study of the questionnaire assessed language validity, structural validity and criterion validity. Structural validity was evaluated via exploratory factor analysis. Criterion validity was evaluated by assessing the correlation between the visual analog scale (VAS) scores (i.e., the comfort and pain VAS scores) and the ICQ scores using Spearman's correlation test. The Kaiser-Meyer-Olkin coefficient and Bartlett's test of sphericity were used to determine the suitability of the data for factor analysis. Internal consistency was evaluated to determine reliability. The data were analyzed with SPSS version 15.00 for Windows. Descriptive statistics were presented as frequencies, percentages, means and standard deviations. A p value ≤ .05 was considered statistically significant. A moderate positive correlation was found between the ICQ scores and the VAS comfort scores; a moderate negative correlation was found between the ICQ and the VAS pain measures in the criterion validity analysis. Cronbach α values of .75 and .82 were found for the first and second measurements, respectively. The findings of this study reveal that the ICQ is a valid and reliable tool for assessing the comfort of patients in Turkey who are immobilized because of lower extremity orthopedic problems. Copyright © 2015. Published by Elsevier B.V.
Experimental design, power and sample size for animal reproduction experiments.
Chapman, Phillip L; Seidel, George E
2008-01-01
The present paper concerns statistical issues in the design of animal reproduction experiments, with emphasis on the problems of sample size determination and power calculations. We include examples and non-technical discussions aimed at helping researchers avoid serious errors that may invalidate or seriously impair the validity of conclusions from experiments. Screen shots from interactive power calculation programs and basic SAS power calculation programs are presented to aid in understanding statistical power and computing power in some common experimental situations. Practical issues that are common to most statistical design problems are briefly discussed. These include one-sided hypothesis tests, power level criteria, equality of within-group variances, transformations of response variables to achieve variance equality, optimal specification of treatment group sizes, 'post hoc' power analysis and arguments for the increased use of confidence intervals in place of hypothesis tests.
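As a concrete instance of the kind of calculation the paper's interactive programs perform, the standard normal-approximation formula gives the per-group sample size for a two-sided, two-sample comparison of means as n = 2((z₁₋α/₂ + z_power)·σ/δ)². A minimal sketch (SAS PROC POWER or the paper's programs would refine this with the noncentral t distribution, so the exact answer is slightly larger):

```python
import math
from statistics import NormalDist

def per_group_n(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-sample comparison of means,
    using the normal approximation (slightly underestimates the exact t-based n).
    delta: smallest difference in means worth detecting; sigma: within-group SD."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)
```

For a one-standard-deviation difference (δ = σ) at α = 0.05 and 80% power this gives 16 animals per group, illustrating why halving the detectable difference quadruples the required group size.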
24 CFR 902.52 - Distribution of survey to residents.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 24 Housing and Urban Development 4 2010-04-01 2010-04-01 false Distribution of survey to residents... § 902.52 Distribution of survey to residents. (a) Sampling. A statistically valid number of units will be chosen to receive the Resident Service and Satisfaction Survey. These units will be randomly...
A Comparison of Conjoint Analysis Response Formats
Kevin J. Boyle; Thomas P. Holmes; Mario F. Teisl; Brian Roe
2001-01-01
A split-sample design is used to evaluate the convergent validity of three response formats used in conjoint analysis experiments. We investigate whether recoding rating data to rankings and choose-one formats, and recoding ranking data to choose-one, result in structural models and welfare estimates that are statistically indistinguishable from…
From the Knowledge of Understanding to Military Deception
2008-05-21
and decision-making. Experimental research could lead to a validation of the theory. In the experiment, I tried to cause a decrease in ambiguity and...strong evidence that indicate a relationship. Further research with a larger sample size might show a significant statistical relationship. In order...2 Research question
78 FR 51133 - Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-20
... a currently valid OMB control number. National Agricultural Statistics Service Title: Wheat and... surveys. This project is conducted as a cooperative effort with the U.S. Wheat and Barley Scab Initiative... of the Information: The survey will use a sampling universe defined as producers that harvest wheat...
Ouyang, Liwen; Apley, Daniel W; Mehrotra, Sanjay
2016-04-01
Electronic medical record (EMR) databases offer significant potential for developing clinical hypotheses and identifying disease risk associations by fitting statistical models that capture the relationship between a binary response variable and a set of predictor variables that represent clinical, phenotypical, and demographic data for the patient. However, EMR response data may be error prone for a variety of reasons. Performing a manual chart review to validate data accuracy is time consuming, which limits the number of chart reviews in a large database. The authors' objective is to develop a new design-of-experiments-based systematic chart validation and review (DSCVR) approach that is more powerful than the random validation sampling used in existing approaches. The DSCVR approach judiciously and efficiently selects the cases to validate (i.e., validate whether the response values are correct for those cases) for maximum information content, based only on their predictor variable values. The final predictive model will be fit using only the validation sample, ignoring the remainder of the unvalidated and unreliable error-prone data. A Fisher information based D-optimality criterion is used, and an algorithm for optimizing it is developed. The authors' method is tested in a simulation comparison that is based on a sudden cardiac arrest case study with 23 041 patients' records. This DSCVR approach, using the Fisher information based D-optimality criterion, results in a fitted model with much better predictive performance, as measured by the receiver operating characteristic curve and the accuracy in predicting whether a patient will experience the event, than a model fitted using a random validation sample. The simulation comparisons demonstrate that this DSCVR approach can produce predictive models that are significantly better than those produced from random validation sampling, especially when the event rate is low. © The Author 2015. 
Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Validity and reliability of a method for assessment of cervical vertebral maturation.
Zhao, Xiao-Guang; Lin, Jiuxiang; Jiang, Jiu-Hui; Wang, Qingzhu; Ng, Sut Hong
2012-03-01
To evaluate the validity and reliability of the cervical vertebral maturation (CVM) method with a longitudinal sample. Eighty-six cephalograms from 18 subjects (5 males and 13 females) were selected from the longitudinal database. Total mandibular length was measured on each film; its rate of increase served as the gold standard in examining the validity of the CVM method. Eleven orthodontists, after receiving intensive training in the CVM method, evaluated all films twice. Kendall's W and the weighted kappa statistic were employed. Kendall's W values were higher than 0.8 at both times, indicating strong interobserver reproducibility, but interobserver agreement was documented twice at less than 50%. A wide range of intraobserver agreement was noted (40.7%-79.1%), and substantial intraobserver reproducibility was indicated by kappa values (0.53-0.86). With regard to validity, moderate agreement was reported between the gold standard and observer staging at the initial time (kappa values 0.44-0.61). However, agreement seemed to be unacceptable for clinical use, especially in cervical stage 3 (26.8%). Even though the validity and reliability of the CVM method proved statistically acceptable, we suggest that many other growth indicators be taken into consideration in evaluating adolescent skeletal maturation.
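The weighted kappa statistic used above penalizes disagreements in proportion to how far apart the assigned cervical stages are, rather than treating every disagreement equally. A self-contained sketch with linear weights (the ratings in the test are toy values, not the study's data):

```python
def weighted_kappa(rater1, rater2, categories):
    """Linearly weighted kappa for two raters over ordered categories.
    1 = perfect agreement; 0 = agreement expected by chance alone."""
    n, k = len(rater1), len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    obs = [[0.0] * k for _ in range(k)]  # observed joint proportion matrix
    for a, b in zip(rater1, rater2):
        obs[idx[a]][idx[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]                        # rater 1 marginals
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]   # rater 2 marginals
    w = lambda i, j: abs(i - j) / (k - 1)  # linear disagreement weight
    d_obs = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(w(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - d_obs / d_exp
```

For the six ordered CVM stages, staging a film one stage apart thus costs a fifth of the penalty of staging it five stages apart, which is why weighted kappa is the conventional choice for ordinal maturation ratings.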
Cortes, Aneg L; Montiel, Enrique R; Gimeno, Isabel M
2009-12-01
The use of Flinders Technology Associates (FTA) filter cards to quantify Marek's disease virus (MDV) DNA for the diagnosis of Marek's disease (MD) and to monitor MD vaccines was evaluated. Samples of blood (43), solid tumors (14), and feather pulp (FP; 36) collected fresh and in FTA cards were analyzed. MDV DNA load was quantified by real-time PCR. Threshold cycle (Ct) ratios were calculated for each sample by dividing the Ct value of the internal control gene (glyceraldehyde-3-phosphate dehydrogenase) by the Ct value of the MDV gene. Statistically significant correlation (P < 0.05) within Ct ratios was detected between samples collected fresh and in FTA cards by using Pearson's correlation test. Load of serotype 1 MDV DNA was quantified in 24 FP, 14 solid tumor, and 43 blood samples. There was a statistically significant correlation between FP (r = 0.95), solid tumor (r = 0.94), and blood (r = 0.9) samples collected fresh and in FTA cards. Load of serotype 2 MDV DNA was quantified in 17 FP samples, and the correlation between samples collected fresh and in FTA cards was also statistically significant (Pearson's coefficient, r = 0.96); load of serotype 3 MDV DNA was quantified in 36 FP samples, and the correlation between samples taken fresh and in FTA cards was also statistically significant (r = 0.84). MDV DNA samples extracted 3 days (t0) and 8 months after collection (t1) were used to evaluate the stability of MDV DNA in archived samples collected in FTA cards. A statistically significant correlation was found for serotype 1 (r = 0.96), serotype 2 (r = 1), and serotype 3 (r = 0.9). The results show that FTA cards are an excellent medium to collect, transport, and archive samples for MD diagnosis and to monitor MD vaccines. In addition, FTA cards are widely available, inexpensive, and adequate for the shipment of samples nationally and internationally.
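The Ct-ratio and correlation computations described above are simple to reproduce. A sketch (the Ct values in the test are toy numbers, not the study's measurements):

```python
import math

def ct_ratio(ct_internal_control, ct_mdv):
    """Threshold-cycle ratio as defined in the study: Ct of the internal
    control gene (GAPDH) divided by Ct of the MDV gene."""
    return ct_internal_control / ct_mdv

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Fresh versus FTA-card Ct ratios for the same birds would be passed as x and y; an r close to 1, as reported (0.84-0.96), indicates the cards preserve the quantitative signal.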
Yalin Sapmaz, Şermin; Özek Erkuran, Handan; Yalin, Nefize; Önen, Özlem; Öztekin, Siğnem; Kavurma, Canem; Köroğlu, Ertuğrul; Aydemir, Ömer
2017-12-01
This study aimed to assess the validity and reliability of the Turkish version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) Level 2 Anger Scale. The scale was prepared by translation and back translation of the DSM-5 Level 2 Anger Scale. Study groups consisted of a clinical sample of cases diagnosed with depressive disorder and treated in a child and adolescent psychiatry unit and a community sample. The study was continued with 218 children and 160 parents. In the assessment process, the child and parent forms of the DSM-5 Level 2 Anger Scale, the Children's Depression Inventory, and the Strengths and Difficulties Questionnaire-Parent Form were used. In the reliability analyses, the Cronbach alpha internal consistency coefficient values were found to be very high for both child and parent forms. Item-total score correlation coefficients were high and very high, respectively, for child and parent forms, indicating statistical significance. As for construct validity, one factor was retained for each form and was found to be consistent with the original form of the scale. As for concurrent validity, the child form of the scale showed significant correlation with the Children's Depression Inventory, while the parent form showed significant correlation with the Strengths and Difficulties Questionnaire-Parent Form. It was found that the Turkish version of the DSM-5 Level 2 Anger Scale can be utilized as a valid and reliable tool both in clinical practice and for research purposes.
OCT Amplitude and Speckle Statistics of Discrete Random Media.
Almasian, Mitra; van Leeuwen, Ton G; Faber, Dirk J
2017-11-01
Speckle, the amplitude fluctuations in optical coherence tomography (OCT) images, contains information on sub-resolution structural properties of the imaged sample. Speckle statistics could therefore be utilized in the characterization of biological tissues. However, a rigorous theoretical framework relating OCT speckle statistics to structural tissue properties has yet to be developed. As a first step, we present a theoretical description of OCT speckle, relating the OCT amplitude variance to the size and organization of samples of discrete random media (DRM). Starting the calculations from the size and organization of the scattering particles, we analytically find expressions for the OCT amplitude mean, the amplitude variance, the backscattering coefficient, and the scattering coefficient. We assume fully developed speckle and verify the validity of this assumption by experiments on controlled samples of silica microspheres suspended in water. We show that the OCT amplitude variance is sensitive to sub-resolution changes in the size and organization of the scattering particles. Experimentally determined and theoretically calculated optical properties are compared and found to be in good agreement.
Bonetti, Jennifer; Quarino, Lawrence
2014-05-01
This study has shown that the combination of simple techniques with the use of multivariate statistics offers the potential for the comparative analysis of soil samples. Five samples were obtained from each of twelve state parks across New Jersey in both the summer and fall seasons. Each sample was examined using particle-size distribution, pH analysis in both water and 1 M CaCl2 , and a loss on ignition technique. Data from each of the techniques were combined, and principal component analysis (PCA) and canonical discriminant analysis (CDA) were used for multivariate data transformation. Samples from different locations could be visually differentiated from one another using these multivariate plots. Hold-one-out cross-validation analysis showed error rates as low as 3.33%. Ten blind study samples were analyzed resulting in no misclassifications using Mahalanobis distance calculations and visual examinations of multivariate plots. Seasonal variation was minimal between corresponding samples, suggesting potential success in forensic applications. © 2014 American Academy of Forensic Sciences.
ERIC Educational Resources Information Center
Zwick, Rebecca
2012-01-01
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Shih, Weichung Joe; Li, Gang; Wang, Yining
2016-03-01
Sample size plays a crucial role in clinical trials. Flexible sample-size designs, as part of the more general category of adaptive designs that utilize interim data, have been a popular topic in recent years. In this paper, we give a comparative review of four related methods for such a design. The likelihood method uses the likelihood ratio test with an adjusted critical value. The weighted method adjusts the test statistic with given weights rather than the critical value. The dual test method requires both the likelihood ratio statistic and the weighted statistic to be greater than the unadjusted critical value. The promising zone approach uses the likelihood ratio statistic with the unadjusted critical value and additional constraints. All four methods preserve the type-I error rate. In this paper we explore their properties and compare their relationships and merits. We show that the sample size rules for the dual test are in conflict with the rules of the promising zone approach. We delineate what is necessary to specify in the study protocol to ensure the validity of the statistical procedure and what can be kept implicit in the protocol so that more flexibility can be attained for confirmatory phase III trials in meeting regulatory requirements. We also prove that under mild conditions, the likelihood ratio test still preserves the type-I error rate when the actual sample size is larger than the re-calculated one. Copyright © 2015 Elsevier Inc. All rights reserved.
Psychometrics of the MHSIP Adult Consumer Survey.
Jerrell, Jeanette M
2006-10-01
The reliability and validity of the Mental Health Statistics Improvement Program (MHSIP) Adult Consumer Survey were assessed in a statewide convenience sample of 459 persons with severe mental illness served through a public mental health system. Consistent with previous findings and the intent of its developers, three factors were identified that demonstrate good internal consistency, moderate test-retest reliability, and good convergent validity with consumer perceptions of other aspects of their care. The reliability and validity of the MHSIP Adult Consumer Survey documented in this study underscore its scientific and practical utility as an abbreviated tool for assessing access, quality and appropriateness, and outcome in mental health service systems.
Optical diagnosis of malaria infection in human plasma using Raman spectroscopy
NASA Astrophysics Data System (ADS)
Bilal, Muhammad; Saleem, Muhammad; Amanat, Samina Tufail; Shakoor, Huma Abdul; Rashid, Rashad; Mahmood, Arshad; Ahmed, Mushtaq
2015-01-01
We present the prediction of malaria infection in human plasma using Raman spectroscopy. Raman spectra of malaria-infected samples are compared with those of healthy and dengue-virus-infected ones for disease recognition. Raman spectra were acquired using a 532 nm laser as the excitation source, and 10 distinct spectral signatures that statistically differentiated malaria from healthy and dengue-infected cases were found. A multivariate regression model was developed that utilized Raman spectra of 20 malaria-infected, 10 non-malarial febrile, 10 healthy, and 6 dengue-infected samples to optically predict malaria infection. The model yields a correlation coefficient r2 value of 0.981 between the predicted values and the clinically known results of the training samples, and the root mean square error in cross-validation was found to be 0.09; both these parameters validated the model. The model was further blindly tested on 30 unknown suspected samples and found to be 86% accurate compared with the clinical results, the inaccuracy being due to three samples that were predicted in the gray region. The standard deviation and root mean square error in prediction for the unknown samples were found to be 0.150 and 0.149, which is acceptable for the clinical validation of the model.
Fatehi, Zahra; Baradaran, Hamid Reza; Asadpour, Mohamad; Rezaeian, Mohsen
2017-01-01
Background: Individuals' listening styles differ based on their characters, professions, and situations. This study aimed to assess the validity and reliability of the Listening Styles Profile-Revised (LSP-R) in Iranian students. Methods: After translation into Persian, the LSP-R was administered to a sample of 240 medical and nursing Persian-speaking students in Iran. Statistical analysis was performed to test the reliability and validity of the LSP-R. Results: The study revealed high internal consistency and good test-retest reliability for the Persian version of the questionnaire. The Cronbach's alpha coefficient was 0.72 and the intra-class correlation coefficient 0.87. The means for the content validity index and the content validity ratio (CVR) were 0.90 and 0.83, respectively. Exploratory factor analysis (EFA) yielded a four-factor solution that accounted for 60.8% of the observed variance. The majority of medical students (73%) as well as of nursing students (70%) stated that their listening styles were task-oriented. Conclusion: In general, the study findings suggest that the Persian version of the LSP-R is a valid and reliable instrument for assessing listening styles in the studied sample.
Risk score to predict the outcome of patients with cerebral vein and dural sinus thrombosis.
Ferro, José M; Bacelar-Nicolau, Helena; Rodrigues, Teresa; Bacelar-Nicolau, Leonor; Canhão, Patrícia; Crassard, Isabelle; Bousser, Marie-Germaine; Dutra, Aurélio Pimenta; Massaro, Ayrton; Mackowiack-Cordiolani, Marie-Anne; Leys, Didier; Fontes, João; Stam, Jan; Barinagarrementeria, Fernando
2009-01-01
Around 15% of patients die or become dependent after cerebral vein and dural sinus thrombosis (CVT). We used the International Study on Cerebral Vein and Dural Sinus Thrombosis (ISCVT) sample (624 patients, with a median follow-up time of 478 days) to develop a Cox proportional hazards regression model to predict outcome, dichotomised by a modified Rankin Scale score >2. From the model hazard ratios, a risk score was derived and a cut-off point selected. The model and the score were tested in 2 validation samples: (1) the prospective Cerebral Venous Thrombosis Portuguese Collaborative Study Group (VENOPORT) sample with 91 patients; (2) a sample of 169 consecutive CVT patients admitted to 5 ISCVT centres after the end of the ISCVT recruitment period. Sensitivity, specificity, c statistics and overall efficiency to predict outcome at 6 months were calculated. The model (hazard ratios: malignancy 4.53; coma 4.19; thrombosis of the deep venous system 3.03; mental status disturbance 2.18; male gender 1.60; intracranial haemorrhage 1.42) had overall efficiencies of 85.1, 84.4 and 90.0%, in the derivation sample and validation samples 1 and 2, respectively. Using the risk score (range from 0 to 9) with a cut-off of >or=3 points, overall efficiency was 85.4, 84.4 and 90.1% in the derivation sample and validation samples 1 and 2, respectively. Sensitivity and specificity in the combined samples were 96.1 and 13.6%, respectively. The CVT risk score has a good estimated overall rate of correct classifications in both validation samples, but its specificity is low. It can be used to avoid unnecessary or dangerous interventions in low-risk patients, and may help to identify high-risk CVT patients. (c) 2009 S. Karger AG, Basel.
Lightfoot, Emma; O’Connell, Tamsin C.
2016-01-01
Oxygen isotope analysis of archaeological skeletal remains is an increasingly popular tool to study past human migrations. It is based on the assumption that human body chemistry preserves the δ18O of precipitation in such a way as to be a useful technique for identifying migrants and, potentially, their homelands. In this study, the first such global survey, we draw on published human tooth enamel and bone bioapatite data to explore the validity of using oxygen isotope analyses to identify migrants in the archaeological record. We use human δ18O results to show that there are large variations in human oxygen isotope values within a population sample. This may relate to physiological factors influencing the preservation of the primary isotope signal, or to human activities (such as brewing, boiling, stewing, and differential access to water sources) causing variation in ingested water and food isotope values. We compare the number of outliers identified using various statistical methods. We determine that the most appropriate method for identifying migrants depends on the data but is likely to be the IQR or the median absolute deviation from the median under most archaeological circumstances. Finally, through a spatial assessment of the dataset, we show that the degree of overlap in human isotope values from different locations across Europe is such that identifying individuals' homelands on the basis of oxygen isotope analysis alone is not possible for the regions analysed to date. Oxygen isotope analysis is a valid method for identifying first-generation migrants from an archaeological site when used appropriately; however, it is difficult to identify migrants using statistical methods for sample sizes of less than c. 25 individuals.
In the absence of local previous analyses, each sample should be treated as an individual dataset and statistical techniques can be used to identify migrants, but in most cases pinpointing a specific homeland should not be attempted. PMID:27124001
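The two outlier rules recommended above, the IQR rule and the median absolute deviation (MAD) from the median, can be sketched as follows. The δ18O values are invented for illustration:

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

def mad_outliers(values, threshold=3.0):
    """Flag values whose robust z-score (based on the MAD) exceeds a threshold."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    # 1.4826 scales the MAD to the standard deviation under normality
    return [v for v in values if mad > 0 and abs(v - med) / (1.4826 * mad) > threshold]

# Hypothetical δ18O values for a burial population, with one likely migrant
delta18O = [26.1, 26.4, 25.9, 26.3, 26.0, 26.2, 29.8]
```

Both rules flag 29.8 here; on real bioapatite data the two rules can disagree, which is why the choice is said above to depend on the data.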
Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S
2007-01-01
Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. 
We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC) instrument to researchers and practitioners interested in conducting healthy weight intervention in child care. However, a more robust, less subjective measure would be more appropriate for researchers seeking an outcome measure to assess intervention impact. PMID:17615078
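A weighted kappa of the kind used above to measure agreement between raters can be computed directly from the two rating vectors. This is a generic linearly weighted sketch, not the NAP SACC scoring code:

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted Cohen's kappa for two raters over ordered categories."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    # Disagreement weight grows linearly with the distance between categories
    w = [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    # Observed joint rating counts
    obs = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        obs[index[a]][index[b]] += 1
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    # Observed vs. chance-expected weighted disagreement
    d_obs = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k)) / n
    d_exp = sum(w[i][j] * row[i] * col[j] for i in range(k) for j in range(k)) / n ** 2
    return 1 - d_obs / d_exp
```

Perfect agreement yields kappa = 1, chance-level agreement yields 0, and systematic disagreement yields negative values.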
Lociciro, S; Esseiva, P; Hayoz, P; Dujourdy, L; Besacier, F; Margot, P
2008-05-20
Harmonisation and optimization of analytical and statistical methodologies were carried out between two forensic laboratories (Lausanne, Switzerland and Lyon, France) in order to provide drug intelligence for cross-border cocaine seizures. Part I dealt with the optimization of the analytical method and its robustness. This second part investigates statistical methodologies that provide reliable comparison of cocaine seizures analysed on two different gas chromatographs interfaced with flame ionisation detectors (GC-FIDs) in two distinct laboratories. Sixty-six statistical combinations (ten data pre-treatments followed by six different distance measurements and correlation coefficients) were applied. One pre-treatment (N+S: the area of each peak is divided by its standard deviation calculated from the whole data set) followed by the Cosine or Pearson correlation coefficient was found to be the best statistical compromise for optimal discrimination of linked and non-linked samples. Centralisation of the analyses in a single laboratory is no longer a required condition for comparing samples seized in different countries. This allows collaboration while retaining jurisdictional control over the data.
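The winning combination above (the N+S pre-treatment followed by a cosine similarity) is simple to reproduce. The peak areas below are hypothetical chromatographic profiles, with the first two playing the role of linked seizures:

```python
import math
import statistics

def n_plus_s(profiles):
    """N+S pre-treatment: divide each peak area by that peak's standard
    deviation computed over the whole data set (column-wise scaling)."""
    sds = [statistics.stdev(col) for col in zip(*profiles)]
    return [[v / s for v, s in zip(row, sds)] for row in profiles]

def cosine(u, v):
    """Cosine similarity between two pre-treated profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# Hypothetical peak areas (3 peaks) for three seizures; 0 and 1 are 'linked'
profiles = [[10.0, 5.0, 2.0], [11.0, 5.2, 2.1], [2.0, 9.0, 7.0]]
scaled = n_plus_s(profiles)
```

A similarity threshold on such scores is then tuned on known linked and non-linked pairs to discriminate the two populations.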
Chapman, Benjamin P.; Weiss, Alexander; Duberstein, Paul
2016-01-01
Statistical learning theory (SLT) is the statistical formulation of machine learning theory, a body of analytic methods common in “big data” problems. Regression-based SLT algorithms seek to maximize predictive accuracy for some outcome, given a large pool of potential predictors, without overfitting the sample. Research goals in psychology may sometimes call for high dimensional regression. One example is criterion-keyed scale construction, where a scale with maximal predictive validity must be built from a large item pool. Using this as a working example, we first introduce a core principle of SLT methods: minimization of expected prediction error (EPE). Minimizing EPE is fundamentally different than maximizing the within-sample likelihood, and hinges on building a predictive model of sufficient complexity to predict the outcome well, without undue complexity leading to overfitting. We describe how such models are built and refined via cross-validation. We then illustrate how three common SLT algorithms (Supervised Principal Components, Regularization, and Boosting) can be used to construct a criterion-keyed scale predicting all-cause mortality, using a large personality item pool within a population cohort. Each algorithm illustrates a different approach to minimizing EPE. Finally, we consider broader applications of SLT predictive algorithms, both as supportive analytic tools for conventional methods, and as primary analytic tools in discovery phase research. We conclude that despite their differences from the classic null-hypothesis testing approach (or perhaps because of them), SLT methods may hold value as a statistically rigorous approach to exploratory regression. PMID:27454257
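Cross-validation as an estimator of expected prediction error, the EPE principle discussed above, can be sketched generically. This is just the CV loop around a toy least-squares line, not the authors' Supervised Principal Components, Regularization, or Boosting code:

```python
import statistics

def k_fold_cv_mse(xs, ys, fit, predict, k=5):
    """Estimate expected prediction error (mean squared error) by k-fold CV:
    each point is predicted by a model fitted without it in the training set."""
    folds = [list(range(i, len(xs), k)) for i in range(k)]
    errors = []
    for fold in folds:
        train = [i for i in range(len(xs)) if i not in fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((predict(model, xs[i]) - ys[i]) ** 2 for i in fold)
    return statistics.fmean(errors)

# A minimal model to plug in: simple least-squares line y = a + b*x
def fit_line(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def predict_line(model, x):
    a, b = model
    return a + b * x
```

The same loop works for any `fit`/`predict` pair, which is how CV is used above to choose model complexity without overfitting.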
Academic Performance and Perceived Stress among University Students
ERIC Educational Resources Information Center
Talib, Nadeem; Zia-ur-Rehman, Muhammad
2012-01-01
This study aims to investigate the effect of factors such as perceived stress on the academic performance of students. A sample of 199 university graduates and undergraduates in Rawalpindi and Islamabad was selected as a statistical frame. The instrument used for this study is a previously validated construct intended to evaluate the effect of…
Krypton and xenon in lunar fines
NASA Technical Reports Server (NTRS)
Basford, J. R.; Dragon, J. C.; Pepin, R. O.; Coscio, M. R., Jr.; Murthy, V. R.
1973-01-01
Data from grain-size separates, stepwise-heated fractions, and bulk analyses of 20 samples of fines and breccias from five lunar sites are used to define three-isotope and ordinate intercept correlations in an attempt to resolve the lunar heavy rare gas system in a statistically valid approach. Tables of concentrations and isotope compositions are given.
John F. Caratti
2006-01-01
The FIREMON Cover/Frequency (CF) method is used to assess changes in plant species cover and frequency for a macroplot. This method uses multiple quadrats to sample within-plot variation and quantify statistically valid changes in plant species cover, height, and frequency over time. Because it is difficult to estimate cover in quadrats for larger plants, this method...
Outliers: A Potential Data Problem.
ERIC Educational Resources Information Center
Douzenis, Cordelia; Rakow, Ernest A.
Outliers, extreme data values relative to others in a sample, may distort statistics that assume interval levels of measurement and normal distributions. An outlier may be a valid value or an error. Several procedures are available for identifying outliers, and each may be applied to errors of prediction from the regression lines for utility in a…
National visitor use monitoring implementation in Alaska.
Eric M. White; Joshua B. Wilson
2008-01-01
The USDA Forest Service implemented the National Visitor Use Monitoring (NVUM) program across the entire National Forest System (NFS) in calendar year 2000. The primary objective of the NVUM program is to develop reliable estimates of recreation use on NFS lands via a nationally consistent, statistically valid sampling approach. Secondary objectives of NVUM are to...
ERIC Educational Resources Information Center
Wang, Shudong; Wang, Ning; Hoadley, David
2007-01-01
This study used confirmatory factor analysis (CFA) to examine the comparability of the National Nurse Aide Assessment Program (NNAAP[TM]) test scores across language and administration condition groups for calibration and validation samples that were randomly drawn from the same population. Fit statistics supported both the calibration and…
Austin, P C; Shah, B R; Newman, A; Anderson, G M
2012-09-01
There are limited validated methods to ascertain comorbidities for risk adjustment in ambulatory populations of patients with diabetes using administrative health-care databases. The objective was to examine the ability of the Johns Hopkins' Aggregated Diagnosis Groups to predict mortality in population-based ambulatory samples of both incident and prevalent subjects with diabetes. Retrospective cohorts were constructed using population-based administrative data. The incident cohort consisted of all 346,297 subjects diagnosed with diabetes between 1 April 2004 and 31 March 2008. The prevalent cohort consisted of all 879,849 subjects with pre-existing diabetes on 1 January 2007. The outcome was death within 1 year of the subject's index date. A logistic regression model consisting of age, sex and indicator variables for 22 of the 32 Johns Hopkins' Aggregated Diagnosis Group categories had excellent discrimination for predicting mortality in incident diabetes patients: the c-statistic was 0.87 in an independent validation sample. A similar model had excellent discrimination for predicting mortality in prevalent diabetes patients: the c-statistic was 0.84 in an independent validation sample. Both models demonstrated very good calibration, denoting good agreement between observed and predicted mortality across the range of predicted mortality in which the large majority of subjects lay. For comparative purposes, regression models incorporating the Charlson comorbidity index with age and sex, age and sex alone, and age alone had poorer discrimination than the model that incorporated the Johns Hopkins' Aggregated Diagnosis Groups. Logistic regression models using age, sex and the Johns Hopkins' Aggregated Diagnosis Groups were able to accurately predict 1-year mortality in population-based samples of patients with diabetes. © 2011 The Authors. Diabetic Medicine © 2011 Diabetes UK.
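The c-statistic reported above is the probability that a randomly chosen subject who died received a higher predicted risk than a randomly chosen survivor. A direct pairwise sketch (O(n²), fine for illustration; the predicted risks below are made up):

```python
def c_statistic(probs, outcomes):
    """Concordance (c) statistic: fraction of event/non-event pairs in which
    the event case has the higher predicted risk (ties count as half)."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    pairs = concordant = ties = 0
    for e in events:
        for ne in nonevents:
            pairs += 1
            if e > ne:
                concordant += 1
            elif e == ne:
                ties += 1
    return (concordant + 0.5 * ties) / pairs

# Hypothetical predicted 1-year mortality risks and observed outcomes
c = c_statistic([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```

A value of 0.5 means no discrimination and 1.0 means perfect discrimination, so the 0.84 and 0.87 reported above are indeed "excellent".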
Chen, Po-Yi; Yang, Chien-Ming; Morin, Charles M
2015-05-01
The purpose of this study is to examine the factor structure of the Insomnia Severity Index (ISI) across samples recruited from different countries. We sought to identify the most appropriate factor model for the ISI and further examined the measurement invariance of the ISI across samples from different countries. Our analyses included one data set collected from a Taiwanese sample and two data sets obtained from samples in Hong Kong and Canada. The data set collected in Taiwan was analyzed with ordinal exploratory factor analysis (EFA) to obtain the appropriate factor model for the ISI. We then applied a series of confirmatory factor analyses (CFAs), a special case of the structural equation model (SEM) concerning the parameters of the measurement model, to the data collected in Canada and Hong Kong. The purposes of these CFAs were to cross-validate the result obtained from the EFA and to further examine the cross-cultural measurement invariance of the ISI. The three-factor model outperforms other models in terms of global fit indices in Taiwan's population. Its external validity is also supported by the confirmatory factor analyses. Furthermore, the measurement invariance analyses show that the strong invariance property between the samples from different cultures holds, providing evidence that ISI results obtained in different cultures are comparable. The factorial validity of the ISI is stable in different populations. More importantly, its invariance property across cultures suggests that the ISI is a valid measure of the insomnia severity construct across countries. Copyright © 2014 Elsevier B.V. All rights reserved.
Internet cognitive testing of large samples needed in genetic research.
Haworth, Claire M A; Harlaar, Nicole; Kovas, Yulia; Davis, Oliver S P; Oliver, Bonamy R; Hayiou-Thomas, Marianna E; Frances, Jane; Busfield, Patricia; McMillan, Andrew; Dale, Philip S; Plomin, Robert
2007-08-01
Quantitative and molecular genetic research requires large samples to provide adequate statistical power, but it is expensive to test large samples in person, especially when the participants are widely distributed geographically. Increasing access to inexpensive and fast Internet connections makes it possible to test large samples efficiently and economically online. Reliability and validity of Internet testing for cognitive ability have not been previously reported; these issues are especially pertinent for testing children. We developed Internet versions of reading, language, mathematics and general cognitive ability tests and investigated their reliability and validity for 10- and 12-year-old children. We tested online more than 2500 pairs of 10-year-old twins and compared their scores to similar internet-based measures administered online to a subsample of the children when they were 12 years old (> 759 pairs). Within 3 months of the online testing at 12 years, we administered standard paper and pencil versions of the reading and mathematics tests in person to 30 children (15 pairs of twins). Scores on Internet-based measures at 10 and 12 years correlated .63 on average across the two years, suggesting substantial stability and high reliability. Correlations of about .80 between Internet measures and in-person testing suggest excellent validity. In addition, the comparison of the internet-based measures to ratings from teachers based on criteria from the UK National Curriculum suggests good concurrent validity for these tests. We conclude that Internet testing can be reliable and valid for collecting cognitive test data on large samples even for children as young as 10 years.
NASA Astrophysics Data System (ADS)
Grulke, Eric A.; Wu, Xiaochun; Ji, Yinglu; Buhr, Egbert; Yamamoto, Kazuhiro; Song, Nam Woong; Stefaniak, Aleksandr B.; Schwegler-Berry, Diane; Burchett, Woodrow W.; Lambert, Joshua; Stromberg, Arnold J.
2018-04-01
Size and shape distributions of gold nanorod samples are critical to their physico-chemical properties, especially their longitudinal surface plasmon resonance. This interlaboratory comparison study developed methods for measuring and evaluating size and shape distributions for gold nanorod samples using transmission electron microscopy (TEM) images. The objective was to determine whether two different samples, which had different performance attributes in their application, were different with respect to their size and/or shape descriptor distributions. Touching particles in the captured images were identified using a ruggedness shape descriptor. Nanorods could be distinguished from nanocubes using an elongational shape descriptor. A non-parametric statistical test showed that cumulative distributions of an elongational shape descriptor, that is, the aspect ratio, were statistically different between the two samples for all laboratories. While the scale parameters of size and shape distributions were similar for both samples, the width parameters of size and shape distributions were statistically different. This protocol fulfills an important need for a standardized approach to measure gold nanorod size and shape distributions for applications in which quantitative measurements and comparisons are important. Furthermore, the validated protocol workflow can be automated, thus providing consistent and rapid measurements of nanorod size and shape distributions for researchers, regulatory agencies, and industry.
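The non-parametric comparison of cumulative shape-descriptor distributions described above is typically done with a two-sample Kolmogorov-Smirnov-type statistic, the largest vertical gap between the two empirical CDFs. A minimal sketch with invented aspect-ratio measurements (the study's own test and data are not reproduced here):

```python
def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the two empirical cumulative distribution functions."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)

# Hypothetical aspect-ratio measurements for two gold nanorod samples
sample_1 = [3.1, 3.3, 3.2, 3.4, 3.0, 3.3]
sample_2 = [3.6, 3.8, 3.7, 3.9, 3.5, 3.8]
d = ks_statistic(sample_1, sample_2)
```

A large statistic relative to its null distribution indicates the two cumulative aspect-ratio distributions differ, which is the kind of conclusion drawn above across laboratories.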
McAlinden, Colm; Khadka, Jyoti; Pesudovs, Konrad
2011-07-01
The ever-expanding choice of ocular metrology and imaging equipment has driven research into the validity of their measurements. Consequently, studies of the agreement between two instruments or clinical tests have proliferated in the ophthalmic literature. It is important that researchers apply the appropriate statistical tests in agreement studies. Correlation coefficients are hazardous and should be avoided. The 'limits of agreement' method originally proposed by Altman and Bland in 1983 is the statistical procedure of choice. Its step-by-step use and practical considerations in relation to optometry and ophthalmology are detailed in addition to sample size considerations and statistical approaches to precision (repeatability or reproducibility) estimates. Ophthalmic & Physiological Optics © 2011 The College of Optometrists.
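The Bland-Altman 'limits of agreement' procedure endorsed above reduces to the mean of the paired differences (the bias) plus or minus 1.96 standard deviations of those differences. A minimal sketch, with made-up paired measurements from two hypothetical instruments:

```python
import statistics

def limits_of_agreement(method_a, method_b):
    """Bland-Altman analysis: bias and 95% limits of agreement
    (mean difference +/- 1.96 SD of the paired differences)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired readings (e.g. axial length in mm) from two devices
bias, lo, hi = limits_of_agreement([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

The limits are then judged against a clinically acceptable difference, which is exactly why the paper argues they are more informative than a correlation coefficient.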
Anomaly detection for machine learning redshifts applied to SDSS galaxies
NASA Astrophysics Data System (ADS)
Hoyle, Ben; Rau, Markus Michael; Paech, Kerstin; Bonnett, Christopher; Seitz, Stella; Weller, Jochen
2015-10-01
We present an analysis of anomaly detection for machine learning redshift estimation. Anomaly detection allows the removal of poor training examples, which can adversely influence redshift estimates. Anomalous training examples may be photometric galaxies with incorrect spectroscopic redshifts, or galaxies with one or more poorly measured photometric quantity. We select 2.5 million `clean' SDSS DR12 galaxies with reliable spectroscopic redshifts, and 6730 `anomalous' galaxies with spectroscopic redshift measurements which are flagged as unreliable. We contaminate the clean base galaxy sample with galaxies with unreliable redshifts and attempt to recover the contaminating galaxies using the Elliptical Envelope technique. We then train four machine learning architectures for redshift analysis on both the contaminated sample and on the preprocessed `anomaly-removed' sample and measure redshift statistics on a clean validation sample generated without any preprocessing. We find an improvement on all measured statistics of up to 80 per cent when training on the anomaly removed sample as compared with training on the contaminated sample for each of the machine learning routines explored. We further describe a method to estimate the contamination fraction of a base data sample.
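The Elliptical Envelope preprocessing step described above can be sketched with scikit-learn (assumed available; the data below are synthetic, not SDSS photometry):

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

# Synthetic stand-in for the training catalogue: a clean 2-feature cloud
# plus one grossly anomalous entry (e.g. a badly measured photometric point)
rng = np.random.RandomState(0)
clean = rng.normal(0.0, 1.0, size=(50, 2))
X = np.vstack([clean, [[10.0, 10.0]]])

# Fit a robust elliptical envelope; predict() returns -1 for anomalies
detector = EllipticEnvelope(contamination=0.05, random_state=0).fit(X)
labels = detector.predict(X)
filtered = X[labels == 1]  # the 'anomaly-removed' training sample
```

The machine learning redshift architectures would then be trained on `filtered` rather than `X`, mirroring the comparison made above between contaminated and anomaly-removed training.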
Risk prediction models of breast cancer: a systematic review of model performances.
Anothaisintawee, Thunyarat; Teerawattananon, Yot; Wiratkapun, Chollathip; Kasamesup, Vijj; Thakkinstian, Ammarin
2012-05-01
An increasing number of risk prediction models have been developed to estimate the risk of breast cancer in individual women. However, the performance of these models is questionable. We therefore conducted a study with the aim of systematically reviewing previous risk prediction models. The results of this review help to identify the most reliable model and indicate the strengths and weaknesses of each model, guiding future model development. We searched MEDLINE (PubMed) from 1949 and EMBASE (Ovid) from 1974 until October 2010. Observational studies which constructed models using regression methods were selected. Information about model development and performance was extracted. Twenty-five out of 453 studies were eligible. Of these, 18 developed prediction models and 7 validated existing prediction models. Up to 13 variables were included in the models, and sample sizes for each study ranged from 550 to 2,404,636. Internal validation was performed in four models, while five models had external validation. The Gail model and the Rosner and Colditz model were the key models that were subsequently modified by other scholars. Calibration performance of most models was fair to good (expected/observed ratio: 0.87-1.12), but discriminatory accuracy was poor to fair both in internal validation (concordance statistics: 0.53-0.66) and in external validation (concordance statistics: 0.56-0.63). Most models yielded relatively poor discrimination in both internal and external validation. This poor discriminatory accuracy of existing models might be due to a lack of knowledge about risk factors, heterogeneous subtypes of breast cancer, and different distributions of risk factors across populations. In addition, the concordance statistic itself is insensitive to improvements in discrimination. Therefore, newer methods such as the net reclassification index should be considered when evaluating the performance of a newly developed model.
Olives, Casey; Valadez, Joseph J.; Brooker, Simon J.; Pagano, Marcello
2012-01-01
Background Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa. Methodology We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n = 15 and n = 25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa. Principal Findings Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n = 15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error. Conclusion/Significance This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalence of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools. PMID:22970333
Aggio, Raphael B. M.; de Lacy Costello, Ben; White, Paul; Khalid, Tanzeela; Ratcliffe, Norman M.; Persad, Raj; Probert, Chris S. J.
2016-01-01
Prostate cancer is one of the most common cancers. Serum prostate-specific antigen (PSA) is used to aid the selection of men undergoing biopsies. Its use remains controversial. We propose a GC-sensor algorithm system for classifying urine samples from patients with urological symptoms. This pilot study includes 155 men presenting to urology clinics, of whom 58 were diagnosed with prostate cancer, 24 with bladder cancer and 73 with haematuria and/or poor stream, without cancer. Principal component analysis (PCA) was applied to assess the discrimination achieved, while linear discriminant analysis (LDA) and support vector machine (SVM) were used as statistical models for sample classification. Leave-one-out cross-validation (LOOCV), repeated 10-fold cross-validation (10FoldCV), repeated double cross-validation (DoubleCV) and Monte Carlo permutations were applied to assess performance. Significant separation was found between prostate cancer and control samples, bladder cancer and controls, and between bladder and prostate cancer samples. For prostate cancer diagnosis, the GC/SVM system classified samples with 95% sensitivity and 96% specificity after LOOCV. For bladder cancer diagnosis, the SVM reported 96% sensitivity and 100% specificity after LOOCV, while the DoubleCV reported 87% sensitivity and 99% specificity, with SVM showing 78% sensitivity and 98% specificity in discriminating prostate from bladder cancer samples. Monte Carlo permutation of the class labels yielded chance-like accuracy values around 50%, suggesting that the observed results for bladder cancer and prostate cancer detection are not due to overfitting. The results of the pilot study presented here indicate that the GC system is able to successfully identify patterns that allow classification of urine samples from patients with urological cancers.
An accurate diagnosis based on urine samples would reduce the number of negative prostate biopsies performed, and the frequency of surveillance cystoscopy for bladder cancer patients. Larger cohort studies are planned to investigate the potential of this system. Future work may lead to non-invasive breath analyses for diagnosing urological conditions. PMID:26865331
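Leave-one-out cross-validation, as used above, fits the model n times, each time scoring the single held-out sample, so no sample is ever scored by a model that saw it during training. A stdlib sketch of the LOOCV loop with a stand-in nearest-centroid classifier on 1-D toy data (the study itself applied SVM and LDA to multivariate GC-sensor features; all names and values here are illustrative):

```python
from statistics import mean

def nearest_centroid_predict(train, labels, x):
    """Stand-in classifier: assign x to the class whose training mean is closest."""
    dists = {}
    for c in set(labels):
        mu = mean(v for v, y in zip(train, labels) if y == c)
        dists[c] = abs(x - mu)
    return min(dists, key=dists.get)

def loocv_accuracy(data, labels):
    """Leave-one-out cross-validation: hold out each sample in turn,
    fit on the rest, and score the held-out prediction."""
    correct = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        ytrain = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train, ytrain, data[i]) == labels[i]:
            correct += 1
    return correct / len(data)

# Well-separated 1-D toy data: LOOCV accuracy is perfect here
data = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
labels = ['control'] * 3 + ['cancer'] * 3
print(loocv_accuracy(data, labels))  # 1.0
```

The Monte Carlo permutation check mentioned in the abstract amounts to re-running this loop many times with `labels` randomly shuffled: if accuracy then collapses to chance, the original result is unlikely to be an overfitting artefact.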
SU-E-J-85: Leave-One-Out Perturbation (LOOP) Fitting Algorithm for Absolute Dose Film Calibration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chu, A; Ahmad, M; Chen, Z
2014-06-01
Purpose: To introduce an outlier-recognition fitting routine for film dosimetry. It is not only compatible with any linear or non-linear regression but can also provide information on the minimal number of sampling points, critical sampling distributions, and the evaluation of analytical functions for absolute film-dose calibration. Methods: The technique of leave-one-out (LOO) cross validation is often used for statistical analyses of model performance. We used LOO analyses with perturbed bootstrap fitting, called leave-one-out perturbation (LOOP), for film-dose calibration. Given a threshold, the LOO process detects unfit points ("outliers") compared to the other cohorts, and a bootstrap fitting process follows to seek any possibilities of using perturbations for further improvement. After the outliers were reconfirmed by traditional t-test statistics and eliminated, another LOOP iteration produced the final fit. An over-sampled film-dose-calibration dataset was collected as a reference (dose range: 0-800 cGy), and various simulated conditions for outliers and sampling distributions were derived from the reference. Comparisons over the various conditions were made, and the performance of fitting functions, polynomial and rational, was evaluated. Results: (1) LOOP demonstrates sensitive outlier recognition through the statistical correlation between leaving an outlier out and an exceptionally better goodness-of-fit. (2) With sufficient statistical information, LOOP can correct outliers under some low-sampling conditions that other "robust fits", e.g. Least Absolute Residuals, cannot. (3) Complete cross-validated analyses of LOOP indicate that the rational function demonstrates much superior performance compared to the polynomial. Even with 5 data points including one outlier, LOOP with a rational function can restore more than 95% of the reference values, while the polynomial fitting completely failed under the same conditions.
Conclusion: LOOP can cooperate with any fitting routine, functioning as a "robust fit". In addition, it can serve as a benchmark for film-dose calibration fitting performance.
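The core leave-one-out idea can be illustrated without the authors' full bootstrap-perturbation machinery: refit the calibration curve with each point omitted in turn and flag the point whose removal yields the exceptionally better goodness-of-fit. A minimal sketch with an ordinary least-squares line (the paper fits rational functions; the dose/response numbers below are toy values, not film data):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def sse(xs, ys, a, b):
    """Sum of squared residuals for the fitted line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def loo_outlier(xs, ys):
    """Flag the index whose removal gives the largest drop in SSE:
    leaving out a genuine outlier makes the remaining fit much better."""
    best_i, best_sse = None, float('inf')
    for i in range(len(xs)):
        xr, yr = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        a, b = fit_line(xr, yr)
        s = sse(xr, yr, a, b)
        if s < best_sse:
            best_i, best_sse = i, s
    return best_i

xs = [0, 100, 200, 300, 400]
ys = [0.0, 1.0, 2.0, 9.0, 4.0]  # the point at x=300 is far off the line
print(loo_outlier(xs, ys))  # 3
```

The LOOP routine described above adds a bootstrap perturbation step and a t-test reconfirmation on top of this basic recognition loop.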
Tian, Guo-Liang; Li, Hui-Qiong
2017-08-01
Some existing confidence interval methods and hypothesis testing methods in the analysis of a contingency table with incomplete observations in both margins entirely depend on an underlying assumption that the sampling distribution of the observed counts is a product of independent multinomial/binomial distributions for complete and incomplete counts. However, it can be shown that this independence assumption is incorrect and can result in unreliable conclusions because of the under-estimation of the uncertainty. Therefore, the first objective of this paper is to derive the valid joint sampling distribution of the observed counts in a contingency table with incomplete observations in both margins. The second objective is to provide a new framework for analyzing incomplete contingency tables based on the derived joint sampling distribution of the observed counts by developing a Fisher scoring algorithm to calculate maximum likelihood estimates of the parameters of interest, bootstrap confidence interval methods, and bootstrap hypothesis testing methods. We compare the differences between the valid sampling distribution and the sampling distribution under the independence assumption. Simulation studies showed that average/expected confidence-interval widths of parameters based on the sampling distribution under the independence assumption are shorter than those based on the new sampling distribution, yielding unrealistically optimistic results. A real data set is analyzed to illustrate the application of the new sampling distribution for incomplete contingency tables, and the analysis results again confirm the conclusions obtained from the simulation studies.
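The bootstrap confidence intervals referred to above build on the generic percentile bootstrap: resample the data with replacement, recompute the statistic each time, and read the interval off the empirical quantiles of the replicates. A stdlib sketch (the paper applies this to parameters of incomplete contingency tables under the derived joint sampling distribution; here the statistic is simply the mean of toy values):

```python
import random

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]  # sample with replacement
        reps.append(stat(resample))
    reps.sort()
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# 95% CI for the mean of a small toy sample
data = [2.1, 2.5, 1.9, 2.4, 2.2, 2.8, 2.0, 2.3]
lo, hi = bootstrap_ci(data, lambda d: sum(d) / len(d))
print(round(lo, 2), round(hi, 2))
```

The paper's point is that the *resampling model* matters: bootstrapping under the incorrect independence assumption produces intervals that are too narrow.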
Xu, Rengyi; Mesaros, Clementina; Weng, Liwei; Snyder, Nathaniel W; Vachani, Anil; Blair, Ian A; Hwang, Wei-Ting
2017-07-01
We compared three statistical methods for selecting a panel of serum lipid biomarkers for mesothelioma and asbestos exposure. Serum samples from mesothelioma patients, asbestos-exposed subjects and controls (40 per group) were analyzed. Three variable selection methods were considered: top-ranked predictors from univariate models, stepwise selection, and the least absolute shrinkage and selection operator (LASSO). Cross-validated area under the receiver operating characteristic curve was used to compare prediction performance. Lipids with high cross-validated area under the curve were identified. A lipid with mass-to-charge ratio of 372.31 was selected by all three methods when comparing mesothelioma versus control. Lipids with mass-to-charge ratios of 1464.80 and 329.21 were selected by two models for asbestos exposure versus control. The different methods selected a similar set of serum lipids. Combining candidate biomarkers can improve prediction.
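Of the three selection methods compared, the simplest is univariate ranking: score each candidate feature on its own against the outcome and keep the top k. A stdlib sketch using absolute Pearson correlation as the univariate score (the feature names and values below are hypothetical; the study ranked lipids by cross-validated AUC rather than correlation):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def top_k_univariate(features, outcome, k=2):
    """Rank candidate predictors by |correlation| with the outcome, keep top k."""
    ranked = sorted(features,
                    key=lambda name: -abs(pearson_r(features[name], outcome)))
    return ranked[:k]

# Hypothetical lipid intensities (the m/z labels are for illustration only)
features = {
    'mz372': [1.0, 2.0, 3.0, 4.0],
    'noise': [5.0, 1.0, 4.0, 2.0],
    'mz329': [2.0, 4.0, 6.0, 8.5],
}
outcome = [1.0, 2.0, 3.0, 4.0]
print(top_k_univariate(features, outcome))  # ['mz372', 'mz329']
```

Stepwise and LASSO differ in that they score features jointly, which is why the abstract's observation that all three methods converged on a similar panel is reassuring.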
Montei, Carolyn; McDougal, Susan; Mozola, Mark; Rice, Jennifer
2014-01-01
The Soleris Non-fermenting Total Viable Count method was previously validated for a wide variety of food products, including cocoa powder. A matrix extension study was conducted to validate the method for use with cocoa butter and cocoa liquor. Test samples included naturally contaminated cocoa liquor and cocoa butter inoculated with natural microbial flora derived from cocoa liquor. A probability of detection statistical model was used to compare Soleris results at multiple test thresholds (dilutions) with aerobic plate counts determined using the AOAC Official Method 966.23 dilution plating method. Results of the two methods were not statistically different at any dilution level in any of the three trials conducted. The Soleris method offers the advantage of results within 24 h, compared to the 48 h required by standard dilution plating methods.
Trends in study design and the statistical methods employed in a leading general medicine journal.
Gosho, M; Sato, Y; Nagashima, K; Takahashi, S
2018-02-01
Study design and statistical methods have become core components of medical research, and the methodology has become more multifaceted and complicated over time. A comprehensive study of the details and current trends in study design and statistical methods is required to support the future implementation of well-planned clinical studies providing information about evidence-based medicine. Our purpose was to illustrate the study designs and statistical methods employed in recent medical literature. This was an extension study of Sato et al. (N Engl J Med 2017; 376: 1086-1087), which reviewed 238 articles published in 2015 in the New England Journal of Medicine (NEJM) and briefly summarized the statistical methods employed in NEJM. Using the same database, we performed a new investigation of the detailed trends in study design and individual statistical methods that were not reported in the Sato study. In accordance with the CONSORT statement, prespecification and justification of sample size are obligatory in planning intervention studies. Although standard survival methods (e.g. the Kaplan-Meier estimator and the Cox regression model) were most frequently applied, the Gray test and the Fine-Gray proportional hazards model for handling competing risks were sometimes used for more valid statistical inference. With respect to handling missing data, model-based methods, which are valid for missing-at-random data, were used more frequently than single imputation methods. Single imputation methods are not recommended as a primary analysis, but they have been applied in many clinical trials. Group sequential design with interim analyses was one of the standard designs, and novel designs, such as adaptive dose selection and sample size re-estimation, were sometimes employed in NEJM. Model-based approaches for handling missing data should replace single imputation methods for primary analyses in light of the information found in some publications.
The use of adaptive designs with interim analyses has been increasing since the publication of the FDA guidance on adaptive design. © 2017 John Wiley & Sons Ltd.
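The Kaplan-Meier estimator cited above as the most frequently applied survival method is a product over event times: at each observed event time, the running survival estimate is multiplied by the conditional probability of surviving past that time, with censored subjects simply leaving the risk set. A minimal stdlib sketch on toy data:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function S(t).
    events[i] = 1 for an observed event, 0 for censoring.
    Returns (event_time, survival) pairs at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    s = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]
        d = sum(tied)                 # events at time t
        if d:
            s *= 1 - d / at_risk      # conditional survival past t
            curve.append((t, s))
        at_risk -= len(tied)          # events and censorings leave the risk set
        i += len(tied)
    return curve

times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0]  # censored at t=2 and t=5
print(kaplan_meier(times, events))
```

Note how the censored subject at t=2 shrinks the risk set (from 4 to 3) without producing a step in the curve; that asymmetry is the entire point of the estimator.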
CARVALHO, Suzana Papile Maciel; BRITO, Liz Magalhães; de PAIVA, Luiz Airton Saavedra; BICUDO, Lucilene Arilho Ribeiro; CROSATO, Edgard Michel; de OLIVEIRA, Rogério Nogueira
2013-01-01
Validation studies of physical anthropology methods in the different population groups are extremely important, especially in cases in which the population variations may cause problems in the identification of a native individual by the application of norms developed for different communities. Objective This study aimed to estimate the gender of skeletons by application of the method of Oliveira, et al. (1995), previously used in a population sample from Northeast Brazil. Material and Methods The accuracy of this method was assessed for a population from Southeast Brazil and validated by statistical tests. The method used two mandibular measurements, namely the bigonial distance and the mandibular ramus height. The sample was composed of 66 skulls and the method was applied by two examiners. The results were statistically analyzed by the paired t test, logistic discriminant analysis and logistic regression. Results The results demonstrated that the application of the method of Oliveira, et al. (1995) in this population achieved very different outcomes between genders, with 100% for females and only 11% for males, which may be explained by ethnic differences. However, statistical adjustment of measurement data for the population analyzed allowed accuracy of 76.47% for males and 78.13% for females, with the creation of a new discriminant formula. Conclusion It was concluded that methods involving physical anthropology present high rate of accuracy for human identification, easy application, low cost and simplicity; however, the methodologies must be validated for the different populations due to differences in ethnic patterns, which are directly related to the phenotypic aspects. In this specific case, the method of Oliveira, et al. 
(1995) presented good accuracy and may be used for gender estimation in Brazil in two geographic regions, namely Northeast and Southeast; however, for other regions of the country (North, Central West and South), previous methodological adjustment is recommended as demonstrated in this study. PMID:24037076
Wang, JianLi; Sareen, Jitender; Patten, Scott; Bolton, James; Schmitz, Norbert; Birney, Arden
2014-05-01
Prediction algorithms are useful for making clinical decisions and for population health planning. However, such prediction algorithms for first onset of major depression do not exist. The objective of this study was to develop and validate a prediction algorithm for first onset of major depression in the general population. Longitudinal study design with approximately 3-year follow-up. The study was based on data from a nationally representative sample of the US general population. A total of 28 059 individuals who participated in Waves 1 and 2 of the US National Epidemiologic Survey on Alcohol and Related Conditions and who had not had major depression at Wave 1 were included. The prediction algorithm was developed using logistic regression modelling in 21 813 participants from three census regions. The algorithm was validated in participants from the 4th census region (n=6246). Major depression occurring since Wave 1 of the National Epidemiologic Survey on Alcohol and Related Conditions was assessed by the Alcohol Use Disorder and Associated Disabilities Interview Schedule (DSM-IV version). A prediction algorithm containing 17 unique risk factors was developed. The algorithm had good discriminative power (C statistic=0.7538, 95% CI 0.7378 to 0.7699) and excellent calibration (F-adjusted test=1.00, p=0.448) with the weighted data. In the validation sample, the algorithm had a C statistic of 0.7259 and excellent calibration (Hosmer-Lemeshow χ(2)=3.41, p=0.906). The developed prediction algorithm has good discrimination and calibration capacity. It can be used by clinicians, mental health policy-makers, service planners and the general public to predict future risk of having major depression. The application of the algorithm may lead to increased personalisation of treatment, better clinical decisions and more optimal mental health service planning.
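The Hosmer-Lemeshow calibration check reported above compares observed and expected event counts across groups ordered by predicted risk: a small statistic (large p-value) means the predicted probabilities match observed frequencies. A stdlib sketch of the statistic itself (without the chi-square p-value, which needs a distribution table; the probabilities below are toy values, not model output):

```python
def hosmer_lemeshow(probs, outcomes, g=4):
    """Hosmer-Lemeshow statistic: a chi-square-type sum comparing observed
    and expected event counts across g groups ordered by predicted risk.
    Values near zero indicate good calibration."""
    pairs = sorted(zip(probs, outcomes))
    size = len(pairs) // g
    stat = 0.0
    for j in range(g):
        grp = pairs[j * size:] if j == g - 1 else pairs[j * size:(j + 1) * size]
        expected = sum(p for p, _ in grp)   # sum of predicted risks
        observed = sum(y for _, y in grp)   # count of actual events
        n = len(grp)
        pbar = expected / n
        stat += (observed - expected) ** 2 / (n * pbar * (1 - pbar))
    return stat

# A well-calibrated toy example: within each risk group the observed
# event count matches the expected count, so the statistic is ~0
probs = [0.2] * 5 + [0.8] * 5
outcomes = [0, 0, 0, 0, 1] + [1, 1, 1, 1, 0]
print(hosmer_lemeshow(probs, outcomes, g=2))
```

Discrimination (the C statistic) and calibration are complementary: a model can rank patients well yet systematically over- or under-state their absolute risk, which is why the abstract reports both.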
Chabirand, Aude; Loiseau, Marianne; Renaudin, Isabelle; Poliakoff, Françoise
2017-01-01
A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of “Flavescence dorée” (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study where each laboratory had to analyze the same panel of samples consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. Firstly, there was the standard statistical approach consisting in analyzing samples which are known to be positive and samples which are known to be negative and reporting the proportion of false-positive and false-negative results to respectively calculate diagnostic specificity and sensitivity. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods based on the notions of accordance and concordance. Other new approaches were also implemented, based, on the one hand, on the probability of detection model, and, on the other hand, on Bayes’ theorem. These various statistical approaches are complementary and give consistent results. Their combination, and in particular, the introduction of new statistical approaches give overall information on the performance and limitations of the different methods, and are particularly useful for selecting the most appropriate detection scheme with regards to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6 respectively developed by Hren (2007), Pelletier (2009) and under patent oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. 
The statistical tools presented in this paper and their combination can be applied to many other studies concerning plant pathogens and other disciplines that use qualitative detection methods. PMID:28384335
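The Bayes' theorem approach mentioned above connects a test's diagnostic sensitivity and specificity to what laboratories actually need: the probability that a positive (or negative) result is correct at a given pathogen prevalence. A minimal stdlib sketch (the sensitivity/specificity values are illustrative, not those of the FD phytoplasma protocols):

```python
def predictive_values(sens, spec, prev):
    """Bayes' theorem: positive and negative predictive values of a
    qualitative detection method, given its diagnostic sensitivity,
    specificity, and the prevalence of the pathogen."""
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

# The same test performs very differently at low vs moderate prevalence
for prev in (0.01, 0.30):
    ppv, npv = predictive_values(0.95, 0.98, prev)
    print(f"prevalence={prev}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

This is why the abstract stresses choosing a detection scheme "with regard to the prevalence of the pathogen": at 1% prevalence even a highly specific test yields mostly false positives.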
Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano
2011-01-01
The most common study design in population studies based on the micronucleus (MN) assay is the cross-sectional study, which is largely performed to evaluate the DNA damaging effects of exposure to genotoxic agents in the workplace and the environment, as well as from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies, since most recent studies considering gene-environment interaction often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data are addressed in this mini-review, starting from data description, which is a critical step of statistical analysis, since it allows the detection of possible errors in the dataset to be analysed and a check of the validity of the assumptions required for more complex analyses. Basic issues in the statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship between two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented before addressing the issue of the most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.
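The Poisson model recommended above treats MN frequencies as counts, and the natural effect measure is then a rate ratio between exposed and unexposed groups. A stdlib sketch of the Poisson probability mass function and a Wald confidence interval for the rate ratio on the log scale (all counts below are invented for illustration):

```python
from math import exp, factorial, log, sqrt

def poisson_pmf(k, lam):
    """Probability of observing k events when the mean count is lam."""
    return lam ** k * exp(-lam) / factorial(k)

def rate_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Wald 95% CI for the ratio of two Poisson rates, e.g. mean MN
    frequency in exposed vs unexposed subjects (log-scale standard error)."""
    rr = (events_a / n_a) / (events_b / n_b)
    se = sqrt(1 / events_a + 1 / events_b)
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)

# Hypothetical: 120 micronuclei among 50 exposed subjects vs 60 among 50 controls
print(rate_ratio_ci(120, 50, 60, 50))
```

Treating such counts as normally distributed, as many of the reviewed studies did, ignores both the discreteness and the mean-variance link that the Poisson model captures.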
[Factor Analysis: Principles to Evaluate Measurement Tools for Mental Health].
Campo-Arias, Adalberto; Herazo, Edwin; Oviedo, Heidi Celina
2012-09-01
The validation of a measurement tool in mental health is a complex process that usually starts by estimating reliability, to later approach its validity. Factor analysis is a way to know the number of dimensions, domains or factors of a measuring tool, generally related to the construct validity of the scale. The analysis could be exploratory or confirmatory, and helps in the selection of the items with better performance. For an acceptable factor analysis, it is necessary to follow some steps and recommendations, conduct some statistical tests, and rely on a proper sample of participants. Copyright © 2012 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.
Fault detection, isolation, and diagnosis of self-validating multifunctional sensors.
Yang, Jing-Li; Chen, Yin-Sheng; Zhang, Li-Li; Sun, Zhen
2016-06-01
A novel fault detection, isolation, and diagnosis (FDID) strategy for self-validating multifunctional sensors is presented in this paper. The sparse non-negative matrix factorization-based method can effectively detect faults by using the squared prediction error (SPE) statistic, and variable contribution plots based on the SPE statistic can help to locate and isolate the faulty sensitive units. Complete ensemble empirical mode decomposition is employed to decompose the fault signals into a series of intrinsic mode functions (IMFs) and a residual. The sample entropy (SampEn)-weighted energy values of each IMF and the residual are estimated to represent the characteristics of the fault signals. A multi-class support vector machine is introduced to identify the fault mode, with the purpose of diagnosing the status of the faulty sensitive units. The performance of the proposed strategy is compared with other fault detection strategies, such as principal component analysis and independent component analysis, and fault diagnosis strategies, such as empirical mode decomposition coupled with a support vector machine. The proposed strategy is fully evaluated in a real self-validating multifunctional sensor experimental system, and the experimental results demonstrate that the proposed strategy provides an excellent solution to the FDID research topic of self-validating multifunctional sensors.
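Sample entropy (SampEn), used above to weight the IMF energies, measures signal irregularity: it is the negative log of the conditional probability that two subsequences matching for m points also match for m+1 points. A compact, slightly simplified stdlib formulation (the canonical definition restricts the length-m template count to the same start points as the length-(m+1) count; the toy signal below is illustrative):

```python
from math import log

def sample_entropy(x, m=2, r=0.2):
    """Simplified SampEn(m, r) = -ln(A/B), where B counts pairs of
    length-m templates within tolerance r (Chebyshev distance) and A
    counts pairs of length-(m+1) templates. Lower = more regular signal."""
    def match_pairs(length):
        n = len(x) - length + 1
        templates = [x[i:i + length] for i in range(n)]
        pairs = 0
        for i in range(n):
            for j in range(i + 1, n):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    pairs += 1
        return pairs

    b = match_pairs(m)
    a = match_pairs(m + 1)
    return float('inf') if a == 0 or b == 0 else -log(a / b)

# A strictly alternating signal is highly regular, so SampEn is near zero
print(round(sample_entropy([0, 1] * 10), 4))  # 0.1178
```

Weighting each IMF's energy by its SampEn, as the paper does, emphasizes components whose irregularity changes under a fault.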
Environmental Validation of Legionella Control in a VHA Facility Water System.
Jinadatha, Chetan; Stock, Eileen M; Miller, Steve E; McCoy, William F
2018-03-01
OBJECTIVES We conducted this study to determine what sample volume, concentration, and limit of detection (LOD) are adequate for environmental validation of Legionella control. We also sought to determine whether time required to obtain culture results can be reduced compared to spread-plate culture method. We also assessed whether polymerase chain reaction (PCR) and in-field total heterotrophic aerobic bacteria (THAB) counts are reliable indicators of Legionella in water samples from buildings. DESIGN Comparative Legionella screening and diagnostics study for environmental validation of a healthcare building water system. SETTING Veterans Health Administration (VHA) facility water system in central Texas. METHODS We analyzed 50 water samples (26 hot, 24 cold) from 40 sinks and 10 showers using spread-plate cultures (International Standards Organization [ISO] 11731) on samples shipped overnight to the analytical lab. In-field, on-site cultures were obtained using the PVT (Phigenics Validation Test) culture dipslide-format sampler. A PCR assay for genus-level Legionella was performed on every sample. RESULTS No practical differences regardless of sample volume filtered were observed. Larger sample volumes yielded more detections of Legionella. No statistically significant differences at the 1 colony-forming unit (CFU)/mL or 10 CFU/mL LOD were observed. Approximately 75% less time was required when cultures were started in the field. The PCR results provided an early warning, which was confirmed by spread-plate cultures. The THAB results did not correlate with Legionella status. CONCLUSIONS For environmental validation at this facility, we confirmed that (1) 100 mL sample volumes were adequate, (2) 10× concentrations were adequate, (3) 10 CFU/mL LOD was adequate, (4) in-field cultures reliably reduced time to get results by 75%, (5) PCR provided a reliable early warning, and (6) THAB was not predictive of Legionella results. 
Infect Control Hosp Epidemiol 2018;39:259-266.
NASA Astrophysics Data System (ADS)
Amesbury, Matthew J.; Swindles, Graeme T.; Bobrov, Anatoly; Charman, Dan J.; Holden, Joseph; Lamentowicz, Mariusz; Mallon, Gunnar; Mazei, Yuri; Mitchell, Edward A. D.; Payne, Richard J.; Roland, Thomas P.; Turner, T. Edward; Warner, Barry G.
2016-11-01
In the decade since the first pan-European testate amoeba-based transfer function for peatland palaeohydrological reconstruction was published, a vast amount of additional data collection has been undertaken by the research community. Here, we expand the pan-European dataset from 128 to 1799 samples, spanning 35° of latitude and 55° of longitude. After the development of a new taxonomic scheme to permit compilation of data from a wide range of contributors and the removal of samples with high pH values, we developed ecological transfer functions using a range of model types and a dataset of ∼1300 samples. We rigorously tested the efficacy of these models using both statistical validation and independent test sets with associated instrumental data. Model performance measured by statistical indicators was comparable to other published models. Comparison to test sets showed that taxonomic resolution did not impair model performance and that the new pan-European model can therefore be used as an effective tool for palaeohydrological reconstruction. Our results question the efficacy of relying on statistical validation of transfer functions alone and support a multi-faceted approach to the assessment of new models. We substantiated recent advice that model outputs should be standardised and presented as residual values in order to focus interpretation on secure directional shifts, avoiding potentially inaccurate conclusions relating to specific water-table depths. The extent and diversity of the dataset highlighted that, at the taxonomic resolution applied, a majority of taxa had broad geographic distributions, though some morphotypes appeared to have restricted ranges.
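Weighted averaging (WA), the classic transfer-function model family among the range of model types compared above, is simple enough to sketch: each taxon's optimum is the abundance-weighted mean of the environmental variable over the training set, and a sample's reconstruction is the abundance-weighted mean of the optima of the taxa it contains. A minimal sketch with hypothetical taxa (published WA models typically add tolerance downweighting and deshrinking, omitted here):

```python
def wa_optima(training_abundances, env):
    """Taxon optimum = abundance-weighted mean of the environmental
    variable (e.g. water-table depth) across training samples."""
    optima = {}
    for taxon in training_abundances[0]:
        weights = [sample[taxon] for sample in training_abundances]
        optima[taxon] = sum(w * e for w, e in zip(weights, env)) / sum(weights)
    return optima

def wa_reconstruct(optima, sample):
    """Reconstruction = abundance-weighted mean of taxon optima."""
    total = sum(sample.values())
    return sum(a * optima[t] for t, a in sample.items()) / total

# Two hypothetical taxa tracking water-table depth (cm) in two training samples
training = [{'taxonA': 10, 'taxonB': 0}, {'taxonA': 0, 'taxonB': 10}]
wtd = [10.0, 30.0]
optima = wa_optima(training, wtd)
print(wa_reconstruct(optima, {'taxonA': 5, 'taxonB': 5}))  # 20.0
```

The article's recommendation to report reconstructions as standardised residual values, rather than absolute water-table depths, acknowledges that the absolute output of such models is less secure than the direction of change.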
Testing for qualitative heterogeneity: An application to composite endpoints in survival analysis.
Oulhaj, Abderrahim; El Ghouch, Anouar; Holman, Rury R
2017-01-01
Composite endpoints are frequently used in clinical outcome trials to capture more events, thereby increasing statistical power. A key requirement for a composite endpoint to be meaningful is the absence of so-called qualitative heterogeneity, to ensure a valid overall interpretation of any treatment effect identified. Qualitative heterogeneity occurs when individual components of a composite endpoint exhibit differences in the direction of a treatment effect. In this paper, we develop a general statistical method to test for qualitative heterogeneity, that is, to test whether a given set of parameters share the same sign. This method is based on the intersection-union principle and, provided that the sample size is large, is valid whatever the model used for parameter estimation. We propose two versions of our testing procedure, one based on random sampling from a Gaussian distribution and another based on bootstrapping. Our work covers both the case of completely observed data and the case where some observations are censored, which is an important issue in many clinical trials. We evaluated the size and power of the proposed tests in extensive Monte Carlo simulations with multivariate time-to-event data, designed under a variety of conditions on dimensionality, censoring rate, sample size and correlation structure. Our testing procedure showed very good performance in terms of statistical power and type I error. The proposed test was applied to a data set from a single-center, randomized, double-blind controlled trial in the area of Alzheimer's disease.
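The intersection-union principle used here can be sketched in a few lines: for Wald-type component tests, the p-value for "all parameters positive" is the maximum of the one-sided component p-values (reject only if every component rejects), and a same-sign test takes the smaller of the two directional results. A simplified stdlib Python sketch with hypothetical estimates and standard errors — it omits the paper's bootstrap and censoring machinery:

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def iut_same_sign_pvalue(estimates, std_errors):
    """Intersection-union test that all parameters share the same sign.

    The p-value for 'all beta_i > 0' is the maximum of the one-sided
    component p-values; the same-sign test takes the smaller of the
    two directional p-values.
    """
    z = [b / se for b, se in zip(estimates, std_errors)]
    p_all_pos = max(1.0 - norm_cdf(zi) for zi in z)  # H1: all beta_i > 0
    p_all_neg = max(norm_cdf(zi) for zi in z)        # H1: all beta_i < 0
    return min(p_all_pos, p_all_neg)

# Concordant component effects -> small p-value (consistent direction)
p_concordant = iut_same_sign_pvalue([0.8, 0.6, 0.9], [0.2, 0.2, 0.25])
# One component flips sign -> large p-value (qualitative heterogeneity)
p_discordant = iut_same_sign_pvalue([0.8, -0.6, 0.9], [0.2, 0.2, 0.25])
```

Because the overall p-value is a maximum over components, a single discordant component is enough to prevent rejection, which is exactly the behavior wanted for a composite endpoint.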
Lee, Jason; Morishima, Toshitaka; Kunisawa, Susumu; Sasaki, Noriko; Otsubo, Tetsuya; Ikai, Hiroshi; Imanaka, Yuichi
2013-01-01
Stroke and other cerebrovascular diseases are a major cause of death and disability. Predicting in-hospital mortality in ischaemic stroke patients can help to identify high-risk patients and guide treatment approaches. Chart reviews provide important clinical information for mortality prediction, but are laborious and limit sample sizes. Administrative data allow for large-scale multi-institutional analyses but lack the necessary clinical information for outcome research. However, administrative claims data in Japan have recently come to include patient consciousness and disability information, which may allow more accurate mortality prediction using administrative data alone. The aim of this study was to derive and validate models to predict in-hospital mortality in patients admitted for ischaemic stroke using administrative data. The sample consisted of 21,445 patients from 176 Japanese hospitals, who were randomly divided into derivation and validation subgroups. Multivariable logistic regression models were developed using 7- and 30-day and overall in-hospital mortality as dependent variables. Independent variables included patient age, sex, comorbidities upon admission, Japan Coma Scale (JCS) score, Barthel Index score, modified Rankin Scale (mRS) score, and admissions after hours and on weekends/public holidays. Models were developed in the derivation subgroup, and coefficients from these models were applied to the validation subgroup. Predictive ability was analysed using C-statistics; calibration was evaluated with Hosmer-Lemeshow χ² tests. All three models showed predictive abilities similar to or surpassing those of chart review-based models. The C-statistics were highest in the 7-day in-hospital mortality prediction model, at 0.906 and 0.901 in the derivation and validation subgroups, respectively.
For the 30-day in-hospital mortality prediction models, the C-statistics for the derivation and validation subgroups were 0.893 and 0.872, respectively; in overall in-hospital mortality prediction these values were 0.883 and 0.876. In this study, we have derived and validated in-hospital mortality prediction models for three different time spans using a large population of ischaemic stroke patients in a multi-institutional analysis. The recent inclusion of JCS, Barthel Index, and mRS scores in Japanese administrative data has allowed the prediction of in-hospital mortality with accuracy comparable to that of chart review analyses. The models developed using administrative data had consistently high predictive abilities in both the derivation and validation subgroups. These results have implications for the role of administrative data in future mortality prediction analyses. Copyright © 2013 S. Karger AG, Basel.
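The C-statistic reported throughout this study is simply the concordance probability: the chance that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. A small self-contained sketch (the risks and outcomes are toy numbers, not the study's data):

```python
def c_statistic(probs, outcomes):
    """Concordance (C) statistic, equivalent to the ROC AUC:
    fraction of (event, non-event) pairs in which the event case has
    the higher predicted risk; ties count one half."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    concordant = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
                     for pp in pos for pn in neg)
    return concordant / (len(pos) * len(neg))

risks    = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1]  # model-predicted risks
observed = [1,   1,   0,   1,   0,   0,   0]    # 1 = in-hospital death
auc = c_statistic(risks, observed)              # 11 of 12 pairs concordant
```

In a derivation/validation design like this one, the same function is applied twice: once to the subgroup the coefficients were fitted on, and once to the held-out subgroup.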
Trajectory Design for a Single-String Impactor Concept
NASA Technical Reports Server (NTRS)
Dono Perez, Andres; Burton, Roland; Stupl, Jan; Mauro, David
2017-01-01
This paper introduces a trajectory design for a secondary spacecraft concept to augment science return in interplanetary missions. The concept consists of a single-string probe with a kinetic impactor on board that generates an artificial plume to perform in-situ sampling. The trajectory design was applied to a particular case study that samples ejecta particles from the Jovian moon Europa. Results were validated using statistical analysis. Details regarding the navigation, targeting and disposal challenges related to this concept are presented herein.
Women's Liberation Scale (WLS): A Measure of Attitudes Toward Positions Advocated by Women's Groups.
ERIC Educational Resources Information Center
Goldberg, Carlos
The Women's Liberation Scale (WLS) is a 14-item, Likert-type scale designed to measure attitudes toward positions advocated by women's groups. The WLS and its four-alternative response schema are presented, along with descriptive statistics of scores based on male and female college samples. Reliability and validity measures are reported, and the…
INVESTIGATION OF THE USE OF STATISTICS IN COUNSELING STUDENTS.
ERIC Educational Resources Information Center
HEWES, ROBERT F.
The objective was to employ techniques of profile analysis to develop the joint probability of selecting a suitable subject major and of assuring, to a high degree, graduation from college with that major. The sample included 1,197 MIT freshman students in 1952-53, and the validation group included 699 entrants in 1954. Data included secondary…
El Khattabi, Laïla Allach; Rouillac-Le Sciellour, Christelle; Le Tessier, Dominique; Luscan, Armelle; Coustier, Audrey; Porcher, Raphael; Bhouri, Rakia; Nectoux, Juliette; Sérazin, Valérie; Quibel, Thibaut; Mandelbrot, Laurent; Tsatsaris, Vassilis
2016-01-01
Objective NIPT for fetal aneuploidy by digital PCR has been hampered by the large number of PCR reactions needed to meet statistical requirements, preventing clinical application. Here, we designed an octoplex droplet digital PCR (ddPCR) assay which increases the number of available targets and thus overcomes these statistical obstacles. Method After technical optimization of the multiplex PCR on mixtures of trisomic and euploid DNA, we performed a validation study on samples of plasma DNA from 213 pregnant women. Molecular counting of circulating cell-free DNA was performed using a mix of hydrolysis probes targeting chromosome 21 and a reference chromosome. Results The results of our validation experiments showed that ddPCR detected trisomy 21 even when the sample’s trisomic DNA content was as low as 5%. In a validation study of plasma samples from 213 pregnant women, ddPCR discriminated clearly between the trisomy 21 and the euploidy groups. Conclusion Our results demonstrate that digital PCR can meet the requirements for non-invasive prenatal testing of trisomy 21. This approach is technically simple, relatively cheap, easy to implement in a diagnostic setting and compatible with ethical concerns regarding access to nucleotide sequence information. These advantages make it a potential technique of choice for population-wide screening for trisomy 21 in pregnant women. PMID:27167625
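The "statistical requirements" the abstract alludes to come from molecular counting: with fetal fraction f, a trisomic fetus shifts the chromosome-21 share of counted molecules from 1/2 to (2 + f)/(4 + f), an excess of only f/(2(4 + f)) — about 0.6 percentage points at f = 5% — so tens of thousands of molecules must be counted. A simplified binomial sketch (not the authors' analysis pipeline; the counts below are hypothetical):

```python
import math

def trisomy_z_score(n_chr21, n_ref):
    """One-sided z-score for an excess of chromosome-21 molecules.
    Under euploidy the chr21 and reference counts are each expected
    to be half the total (binomial with p = 0.5)."""
    n = n_chr21 + n_ref
    return (n_chr21 - n / 2.0) / math.sqrt(n * 0.25)

def molecules_needed(fetal_fraction, z_target=3.0):
    """Rough total molecule count needed so the expected z-score under
    trisomy reaches z_target -- this is why multiplexing more targets
    per droplet matters."""
    delta = fetal_fraction / (2.0 * (4.0 + fetal_fraction))
    return math.ceil((z_target * 0.5 / delta) ** 2)

# At a 5% fetal fraction, roughly 59,000 counted molecules are needed
# for a 3-sigma separation from euploidy under this simple model.
n_needed = molecules_needed(0.05)
z = trisomy_z_score(5100, 4900)   # hypothetical droplet counts
```

The quadratic growth of `molecules_needed` as the fetal fraction shrinks explains why single-target digital PCR struggled and why an octoplex assay, which multiplies the number of countable targets per reaction, makes the test practical.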
Ganasegeran, Kurubaran; Selvaraj, Kamaraj; Rashid, Abdul
2017-08-01
The six item Confusion, Hubbub and Order Scale (CHAOS-6) has been validated as a reliable tool to measure levels of household disorder. We aimed to investigate the goodness of fit and reliability of a new Malay version of the CHAOS-6. The original English version of the CHAOS-6 underwent forward-backward translation into the Malay language. The finalised Malay version was administered to 105 myocardial infarction survivors in a Malaysian cardiac health facility. We performed confirmatory factor analyses (CFAs) using structural equation modelling. A path diagram and fit statistics were yielded to determine the Malay version's validity. Composite reliability was tested to determine the scale's reliability. All 105 myocardial infarction survivors participated in the study. The CFA yielded a six-item, one-factor model with excellent fit statistics. Composite reliability for the single factor CHAOS-6 was 0.65, confirming that the scale is reliable for Malay speakers. The Malay version of the CHAOS-6 was reliable and showed the best fit statistics for our study sample. We thus offer a simple, brief, validated, reliable and novel instrument to measure chaos, the Skala Kecelaruan, Keriuhan & Tertib Terubahsuai (CHAOS-6), for the Malaysian population.
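Composite reliability, the statistic reported as 0.65 here, has a simple closed form given standardized factor loadings from the CFA. A hedged sketch (the loadings below are illustrative, not the study's estimates):

```python
def composite_reliability(loadings):
    """Composite (construct) reliability from standardized factor
    loadings, assuming uncorrelated measurement errors:
    CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error
    variances), with error variance 1 - loading^2 per item."""
    s = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return s * s / (s * s + error)

# Six modest loadings of 0.5 give CR = 9 / 13.5, close to the 0.65
# reported for the single-factor Malay CHAOS-6.
cr = composite_reliability([0.5] * 6)
```

Unlike Cronbach's alpha, composite reliability does not assume equal loadings across items, which is why it is the natural reliability index to report alongside a fitted one-factor CFA.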
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pombet, Denis; Desnoyers, Yvon; Charters, Grant
2013-07-01
The TruPro® process enables collection of a significant number of samples to characterize radiological materials. This innovative alternative technique is being trialled for the ANDRA quality-control inspection of cemented packages, and it proves to be quicker and more prolific than the current methodology. Using classical statistical and geostatistical approaches, this paper assesses the physical and radiological characteristics of two hulls containing wastes (sludges or concentrates) immobilized in a hydraulic binder. The homogeneity of the waste is also evaluated against the ANDRA criterion. Sensitivity to sample size (support effect), the presence of extreme values, the acceptable deviation rate and the minimum number of data points are discussed. The final objectives are to check the homogeneity of the two characterized radwaste packages and to validate and reinforce this alternative characterization methodology. (authors)
Validation of Splicing Events in Transcriptome Sequencing Data
Kaisers, Wolfgang; Ptok, Johannes; Schwender, Holger; Schaal, Heiner
2017-01-01
Genomic alignments of sequenced cellular messenger RNA contain gapped alignments which are interpreted as a consequence of intron removal. The resulting gap-sites, genomic locations of alignment gaps, are landmarks representing potential splice-sites. As alignment algorithms report gap-sites with a considerable false discovery rate, validations are required. We describe two quality scores, gap quality score (gqs) and weighted gap information score (wgis), developed for validation of putative splicing events: while gqs relies solely on alignment data, wgis additionally considers information from the genomic sequence. FASTQ files obtained from 54 human dermal fibroblast samples were aligned against the human genome (GRCh38) using TopHat and STAR aligner. Statistical properties of gap-sites validated by gqs and wgis were evaluated by their sequence similarity to known exon-intron borders. Within the 54 samples, TopHat identifies 1,000,380 and STAR reports 6,487,577 gap-sites. Due to the lack of strand information, however, the percentage of identified GT-AG gap-sites is rather low. While gap-sites from TopHat contain ≈89% GT-AG, gap-sites from STAR only contain ≈42% GT-AG dinucleotide pairs in merged data from 54 fibroblast samples. Validation with gqs yields 156,251 gap-sites from TopHat alignments and 166,294 from STAR alignments. Validation with wgis yields 770,327 gap-sites from TopHat alignments and 1,065,596 from STAR alignments. Both alignment algorithms, TopHat and STAR, report gap-sites with a considerable false discovery rate, which can be drastically reduced by validation with gqs and wgis. PMID:28545234
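The GT-AG percentages quoted above come from a direct lookup of the intron-border dinucleotides at each gap-site. A minimal sketch of that check (the coordinate convention — 0-based, half-open gap — and the toy sequence are assumptions for illustration):

```python
def gap_site_dinucleotides(genome, gap_start, gap_end):
    """Return the first and last two intron bases flanking an
    alignment gap (0-based, half-open gap coordinates).
    Canonical introns read GT...AG on the sense strand."""
    return genome[gap_start:gap_start + 2], genome[gap_end - 2:gap_end]

def is_canonical(genome, gap_start, gap_end):
    donor, acceptor = gap_site_dinucleotides(genome, gap_start, gap_end)
    # Without strand information (as with the aligners here), also
    # accept the reverse-strand signature CT...AC.
    return (donor, acceptor) in {("GT", "AG"), ("CT", "AC")}

#       exon  [--- intron ----]  exon
seq = "CCCAAAGTTTTTTTTAGGGGCCC"
canonical = is_canonical(seq, 6, 17)   # gap spans positions 6..16
```

Filtering gap-sites through a check like this (plus alignment-support scores such as gqs) is what drives the false discovery rate down from the raw aligner output.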
275 Candidates and 149 Validated Planets Orbiting Bright Stars in K2 Campaigns 0–10
NASA Astrophysics Data System (ADS)
Mayo, Andrew W.; Vanderburg, Andrew; Latham, David W.; Bieryla, Allyson; Morton, Timothy D.; Buchhave, Lars A.; Dressing, Courtney D.; Beichman, Charles; Berlind, Perry; Calkins, Michael L.; Ciardi, David R.; Crossfield, Ian J. M.; Esquerdo, Gilbert A.; Everett, Mark E.; Gonzales, Erica J.; Hirsch, Lea A.; Horch, Elliott P.; Howard, Andrew W.; Howell, Steve B.; Livingston, John; Patel, Rahul; Petigura, Erik A.; Schlieder, Joshua E.; Scott, Nicholas J.; Schumer, Clea F.; Sinukoff, Evan; Teske, Johanna; Winters, Jennifer G.
2018-03-01
Since 2014, NASA’s K2 mission has observed large portions of the ecliptic plane in search of transiting planets and has detected hundreds of planet candidates. With observations planned until at least early 2018, K2 will continue to identify more planet candidates. We present here 275 planet candidates observed during Campaigns 0–10 of the K2 mission that are orbiting stars brighter than 13 mag (in Kepler band) and for which we have obtained high-resolution spectra (R = 44,000). These candidates are analyzed using the vespa package in order to calculate their false-positive probabilities (FPP). We find that 149 candidates are validated with an FPP lower than 0.1%, 39 of which were previously only candidates and 56 of which were previously undetected. The processes of data reduction, candidate identification, and statistical validation are described, and the demographics of the candidates and newly validated planets are explored. We show tentative evidence of a gap in the planet radius distribution of our candidate sample. Comparing our sample to the Kepler candidate sample investigated by Fulton et al., we conclude that more planets are required to quantitatively confirm the gap with K2 candidates or validated planets. This work, in addition to increasing the population of validated K2 planets by nearly 50% and providing new targets for follow-up observations, will also serve as a framework for validating candidates from upcoming K2 campaigns and the Transiting Exoplanet Survey Satellite, expected to launch in 2018.
Peters, L L; Boter, H; Burgerhof, J G M; Slaets, J P J; Buskens, E
2015-09-01
The primary objective of the present study was to evaluate the validity of the Groningen Frailty Indicator (GFI) in a sample of Dutch elderly persons participating in LifeLines, a large population-based cohort study. Additional aims were to assess differences between frail and non-frail elderly and examine which individual characteristics were associated with frailty. By December 2012, 5712 elderly persons were enrolled in LifeLines and complied with the inclusion criteria of the present study. Mann-Whitney U or Kruskal-Wallis tests were used to assess the variability of GFI-scores among elderly subgroups that differed in demographic characteristics, morbidity, obesity, and healthcare utilization. Within subgroups Kruskal-Wallis tests were also used to examine differences in GFI-scores across age groups. Multivariate logistic regression analyses were performed to assess associations between individual characteristics and frailty. The GFI discriminated between subgroups: statistically significantly higher GFI-median scores (interquartile range) were found in e.g. males (1 [0-2]), the oldest old (2 [1-3]), in elderly who were single (1 [0-2]), with lower socioeconomic status (1 [0-3]), with increasing co-morbidity (2 [1-3]), who were obese (2 [1-3]), and used more healthcare (2 [1-4]). Overall, age had an independent and statistically significant association with GFI scores. Compared with the non-frail, frail elderly persons experienced statistically significantly more chronic stress and more social/psychological related problems. In the multivariate logistic regression model, psychological morbidity had the strongest association with frailty. The present study supports the construct validity of the GFI and provides insight into the characteristics of (non)frail community-dwelling elderly persons participating in LifeLines. Copyright © 2015 Elsevier Inc. All rights reserved.
auf dem Keller, Ulrich; Prudova, Anna; Gioia, Magda; Butler, Georgina S.; Overall, Christopher M.
2010-01-01
Terminal amine isotopic labeling of substrates (TAILS), our recently introduced platform for quantitative N-terminome analysis, enables wide dynamic range identification of original mature protein N-termini and protease cleavage products. Modifying TAILS by use of isobaric tag for relative and absolute quantification (iTRAQ)-like labels for quantification together with a robust statistical classifier derived from experimental protease cleavage data, we report reliable and statistically valid identification of proteolytic events in complex biological systems in MS2 mode. The statistical classifier is supported by a novel parameter evaluating ion intensity-dependent quantification confidences of single peptide quantifications, the quantification confidence factor (QCF). Furthermore, the isoform assignment score (IAS) is introduced, a new scoring system for the evaluation of single peptide-to-protein assignments based on high confidence protein identifications in the same sample prior to negative selection enrichment of N-terminal peptides. By these approaches, we identified and validated, in addition to known substrates, low abundance novel bioactive MMP-2 targets including the plasminogen receptor S100A10 (p11) and the proinflammatory cytokine proEMAP/p43 that were previously undescribed. PMID:20305283
Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun
2017-08-01
Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM2.5 is a promising way to fill the areas that are not covered by ground PM2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function are specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM2.5 concentrations together with other factors, such as AOD and spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross-validation, CV). The results show that our model achieves a CV result (R² = 0.81) reflecting higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM2.5 estimates.
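The computational core of Gaussian-process regression is the posterior mean k(x*, X)(K + σ²I)⁻¹y, where K is the kernel matrix over training locations. A small pure-Python sketch of that computation on toy 1-D data — the study's spatial Bayesian hierarchy (AOD covariates, random effects) is far richer than this:

```python
import math

def rbf(x1, x2, length=1.0):
    """Squared-exponential (RBF) covariance between two 1-D inputs."""
    return math.exp(-((x1 - x2) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c]
                              for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior_mean(xs, ys, x_star, noise=1e-6):
    """Posterior mean of a zero-mean GP with an RBF kernel:
    m(x*) = k(x*, X) @ (K + noise * I)^-1 @ y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, ys)
    return sum(rbf(x_star, a) * w for a, w in zip(xs, alpha))

xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 4.0]
# With near-zero noise the posterior mean interpolates the training data
m1 = gp_posterior_mean(xs, ys, 1.0)
```

In the spatial setting, `rbf` over 1-D inputs would be replaced by a covariance over (latitude, longitude), and the noise term carries the non-spatial random effects.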
NASA Astrophysics Data System (ADS)
Martin, C.; Nicolaysen, K. P.; McConville, K.; Hatfield, V.; West, D.
2013-12-01
By examining the existing geological and archeological record of radiocarbon dated Aleutian tephras of the last 12,000 years, this study sought to determine whether there were spatial or temporal patterns of explosive eruptive activity. The Holocene tephra record has important implications because two episodes of migration and colonization by humans of distinct cultures established the Unangan/Aleut peoples of the Aleutian Islands concurrently with the volcanic activity. From Aniakchak Volcano on the Alaska Peninsula to the Andreanof Islands (158 to 178° W longitude), 55 distinct tephras represent significant explosive eruptions of the last 12,000 years. Initial results suggest that the Andreanof and Fox Island regions of the archipelago have had frequent explosive eruptions whereas the Islands of Four Mountains, Rat, and Near Island regions have apparently had little or no eruptive activity. However, one clear result of the investigation is that sampling bias strongly influences the apparent spatial patterns. For example, field reconnaissance in the Islands of Four Mountains documents two Holocene calderas and a minimum of 20 undated tephras in addition to the large ignimbrites. Only the lack of significant explosive activity in the Near Islands seems a valid spatial result, as archeological excavations and geologic reports failed to document Holocene tephras there. An intriguing preliminary temporal pattern is the apparent absence of large explosive eruptions across the archipelago from ca. 4,800 to 6,000 yBP.
To test the validity of apparent patterns, a statistical treatment of the compiled data grappled with the sampling bias by considering three confounding variables: larger island size allows more opportunity for geologic preservation of tephras; larger magnitude eruption promotes tephra preservation by creating thicker and more widespread deposits; the comprehensiveness of the tephra sampling of each volcano and island varies widely because of logistical and financial limitations. This initial statistical investigation proposes variables to mitigate the effects of sampling bias and makes recommendations for sampling strategies to enable statistically valid examination of research questions. Further, though caldera-forming eruptions occurred throughout the Holocene - and several remain undated - four of six dated eruptions occurred throughout the archipelago between 8,000-9,100 yBP, a period coinciding with some of the earliest human occupation (Early Anangula Phase) of the eastern Aleutians.
Statistical Considerations of Food Allergy Prevention Studies.
Bahnson, Henry T; du Toit, George; Lack, Gideon
Clinical studies to prevent the development of food allergy have recently helped reshape public policy recommendations on the early introduction of allergenic foods. These trials are also prompting new research, and it is therefore important to address the unique design and analysis challenges of prevention trials. We highlight statistical concepts and give recommendations that clinical researchers may wish to adopt when designing future study protocols and analysis plans for prevention studies. Topics include selecting a study sample, addressing internal and external validity, improving statistical power, choosing alpha and beta, analysis innovations to address dilution effects, and analysis methods to deal with poor compliance, dropout, and missing data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Moghadam, Manije; Salavati, Mahyar; Sahaf, Robab; Rassouli, Maryam; Moghadam, Mojgan; Kamrani, Ahmad Ali Akbari
2018-03-01
After forward-backward translation, the LSS was administered to 334 Persian-speaking, cognitively healthy elderly aged 60 years and over, recruited through convenience sampling. To analyze the validity of the model's constructs and the relationships between them, a confirmatory factor analysis followed by PLS analysis was performed. Construct validity was further investigated by calculating the correlations between the LSS and the "Short Form Health Survey" (SF-36) subscales measuring similar and dissimilar constructs. The LSS was re-administered to 50 participants a month later to assess reliability. For the eight-factor model of the life satisfaction construct, adequate goodness of fit between the hypothesized model and the model derived from the sample data was attained (positive and statistically significant beta coefficients, good R-squares and an acceptable GoF). Construct validity was supported by convergent and discriminant validity and by correlations between the LSS and SF-36 subscales. All subscales exceeded the minimum intraclass correlation coefficient level of 0.60 and the minimum levels of the reliability indices (Cronbach's α, composite reliability and indicator reliability). The Persian version of the Life Satisfaction Scale is a reliable and valid instrument, with psychometric properties consistent with the original version.
Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods.
Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J Sunil
2014-08-01
We introduce a survival/risk bump hunting framework to build a bump hunting model with a possibly censored time-to-event type of response and to validate model estimates. First, we describe the use of adequate survival peeling criteria to build a survival/risk bump hunting model based on recursive peeling methods. Our method called "Patient Recursive Survival Peeling" is a rule-induction method that makes use of specific peeling criteria such as hazard ratio or log-rank statistics. Second, to validate our model estimates and improve survival prediction accuracy, we describe a resampling-based validation technique specifically designed for the joint task of decision rule making by recursive peeling (i.e. decision-box) and survival estimation. This alternative technique, called "combined" cross-validation is done by combining test samples over the cross-validation loops, a design allowing for bump hunting by recursive peeling in a survival setting. We provide empirical results showing the importance of cross-validation and replication.
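The "combined" cross-validation idea — pool the held-out predictions from all folds and compute one statistic on the pooled set, rather than averaging k per-fold statistics — can be sketched generically. A stdlib Python toy (the `fit`/`score_one` callables below are placeholders, not the paper's peeling and survival estimators):

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def combined_cv_statistic(data, fit, score_one, k=5):
    """'Combined' cross-validation: collect per-observation scores from
    the held-out fold of every split, then compute ONE statistic on
    the pooled scores (instead of averaging k per-fold statistics,
    which can be unstable when folds contain few events)."""
    pooled = []
    for test_idx in k_fold_indices(len(data), k):
        test_set = set(test_idx)
        train = [d for i, d in enumerate(data) if i not in test_set]
        model = fit(train)
        pooled.extend(score_one(model, data[i]) for i in test_idx)
    return sum(pooled) / len(pooled)

# Toy use: the 'model' is the training mean; the per-point score is the
# squared prediction error, so the pooled statistic is a CV estimate of MSE.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
cv_mse = combined_cv_statistic(data,
                               fit=lambda tr: sum(tr) / len(tr),
                               score_one=lambda m, x: (x - m) ** 2)
```

Pooling matters most for statistics, such as the log-rank test inside a peeled box, that are not well defined on a fold with only a handful of events.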
Development of a problematic mobile phone use scale for Turkish adolescents.
Güzeller, Cem Oktay; Coşguner, Tolga
2012-04-01
The aim of this study was to evaluate the psychometric properties of the Problematic Mobile Phone Use Scale (PMPUS) for Turkish adolescents. The psychometric properties of the PMPUS were tested in two separate sample groups that consisted of 950 Turkish high school students. The first sample group (n=309) was used to determine the factor structure of the scale. The second sample group (n=461) was used to test data conformity with the identified structure, discriminant validity and concurrent scale validity, internal consistency reliability calculations, and item statistics calculations. The results of exploratory factor analyses indicated that the scale had three factors: interference with negative effect, compulsion/persistence, and withdrawal/tolerance. The results showed that item and construct reliability values were satisfactory in general for the three-factor construct. On the other hand, the average variance extracted value remained below the scale value for three subscales. Scores on the scale correlated significantly with depression and loneliness. In addition, the discriminant validity value was above the scale value in all sub-dimensions except one. Based on these data, the reliability of the PMPUS appears to be satisfactory, with good internal consistency. Therefore, with limited exceptions, the PMPUS was found to be reliable and valid in the context of Turkish adolescents.
Geographic Information Systems to Assess External Validity in Randomized Trials.
Savoca, Margaret R; Ludwig, David A; Jones, Stedman T; Jason Clodfelter, K; Sloop, Joseph B; Bollhalter, Linda Y; Bertoni, Alain G
2017-08-01
To support claims that RCTs can reduce health disparities (i.e., are translational), it is imperative that methodologies exist to evaluate the tenability of external validity in RCTs when probabilistic sampling of participants is not employed. Typically, attempts at establishing post hoc external validity are limited to a few comparisons across convenience variables, which must be available in both sample and population. A Type 2 diabetes RCT was used as an example of a method that uses a geographic information system to assess external validity in the absence of an a priori probabilistic community-wide diabetes risk sampling strategy. A geographic information system, 2009-2013 county death certificate records, and 2013-2014 electronic medical records were used to identify community-wide diabetes prevalence. Color-coded diabetes density maps provided a visual representation of these densities. A chi-square goodness-of-fit analysis tested the degree to which the distribution of RCT participants across density classes differed from what would be expected under simple random sampling of the county population. Analyses were conducted in 2016. Diabetes prevalence areas as represented by death certificate and electronic medical records were distributed similarly. The simple random sample model was not a good fit for death certificate record (chi-square, 17.63; p=0.0001) and electronic medical record data (chi-square, 28.92; p<0.0001). Generally, RCT participants were oversampled in high-diabetes density areas. Location is a highly reliable "principal variable" associated with health disparities. It serves as a directly measurable proxy for high-risk underserved communities, thus offering an effective and practical approach for examining external validity of RCTs. Copyright © 2017 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
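The chi-square goodness-of-fit comparison at the heart of this method is short to state: compare the observed participant counts in each density class with the counts expected under simple random sampling of the county. A stdlib sketch with hypothetical counts (not the study's data, which reported chi-square values of 17.63 and 28.92):

```python
def chisq_goodness_of_fit(observed, expected_props):
    """Chi-square statistic comparing enrollment counts across classes
    with the counts expected under simple random sampling:
    sum over classes of (observed - expected)^2 / expected."""
    n = sum(observed)
    return sum((o - p * n) ** 2 / (p * n)
               for o, p in zip(observed, expected_props))

# Hypothetical counts: participants oversampled from high-density areas
observed = [20, 40, 90]            # low / medium / high diabetes density
expected_props = [0.3, 0.4, 0.3]   # county-wide share of each class
stat = chisq_goodness_of_fit(observed, expected_props)
# df = 2 here; the statistic far exceeds the alpha = 0.0001
# critical value of about 18.4, so random sampling is a poor fit.
```

A large statistic, as in the study, indicates that enrollment departed from the county-wide density distribution — here, toward the high-density class.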
Reliability and validity of an Internet traumatic stress survey with a college student sample.
Fortson, Beverly L; Scotti, Joseph R; Del Ben, Kevin S; Chen, Yi-Chuen
2006-10-01
The reliability and validity of Internet-based questionnaires were assessed in a sample of undergraduates (N = 411) by comparing data collected via the Internet with data collected in a more traditional format. A 2 x 2 x 2 repeated measures factorial design was used, forming four groups: Paper-Paper, Paper-Internet, Internet-Paper, and Internet-Internet. Scores on measures of trauma exposure, depression, and posttraumatic stress symptoms formed the dependent variables. Statistical analyses demonstrated that the psychometric properties of Internet-based questionnaires are similar to those established via more traditional formats. Questionnaire format and presentation order did not affect rates of psychological symptoms endorsed by participants. Researchers can feel comfortable that Internet data collection is a viable--and reliable--means for conducting trauma research.
Chapman, Benjamin P; Weiss, Alexander; Duberstein, Paul R
2016-12-01
Statistical learning theory (SLT) is the statistical formulation of machine learning theory, a body of analytic methods common in "big data" problems. Regression-based SLT algorithms seek to maximize predictive accuracy for some outcome, given a large pool of potential predictors, without overfitting the sample. Research goals in psychology may sometimes call for high dimensional regression. One example is criterion-keyed scale construction, where a scale with maximal predictive validity must be built from a large item pool. Using this as a working example, we first introduce a core principle of SLT methods: minimization of expected prediction error (EPE). Minimizing EPE is fundamentally different from maximizing the within-sample likelihood, and hinges on building a predictive model of sufficient complexity to predict the outcome well, without undue complexity leading to overfitting. We describe how such models are built and refined via cross-validation. We then illustrate how 3 common SLT algorithms (supervised principal components, regularization, and boosting) can be used to construct a criterion-keyed scale predicting all-cause mortality, using a large personality item pool within a population cohort. Each algorithm illustrates a different approach to minimizing EPE. Finally, we consider broader applications of SLT predictive algorithms, both as supportive analytic tools for conventional methods, and as primary analytic tools in discovery phase research. We conclude that despite, or perhaps because of, their differences from the classic null-hypothesis testing approach, SLT methods may hold value as a statistically rigorous approach to exploratory regression. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
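As a minimal illustration of the core idea, choosing model complexity by cross-validated prediction error rather than within-sample fit, the sketch below selects a ridge penalty for a one-predictor toy problem. The data, the 1-D ridge formula, and the candidate penalties are all illustrative stand-ins for the SLT algorithms discussed:

```python
import random

random.seed(0)

# Toy data: y depends linearly on x plus noise (purely illustrative)
n = 60
xs = [random.uniform(-1, 1) for _ in range(n)]
ys = [2.0 * x + random.gauss(0, 0.5) for x in xs]

def ridge_slope(x, y, lam):
    # 1-D ridge estimate: beta = sum(x*y) / (sum(x^2) + lambda)
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

def cv_error(lam, k=5):
    # k-fold cross-validation estimate of expected prediction error (EPE)
    folds = [list(range(i, n, k)) for i in range(k)]
    total = 0.0
    for fold in folds:
        train = [i for i in range(n) if i not in fold]
        beta = ridge_slope([xs[i] for i in train], [ys[i] for i in train], lam)
        total += sum((ys[i] - beta * xs[i]) ** 2 for i in fold)
    return total / n

# Pick the penalty that minimizes cross-validated error, not in-sample error
lams = [0.0, 0.1, 1.0, 10.0]
best = min(lams, key=cv_error)
print("selected lambda:", best)
```

The point of the sketch is the selection criterion: the penalty is tuned on held-out folds, which is how SLT methods guard against undue complexity.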
Experimental control in software reliability certification
NASA Technical Reports Server (NTRS)
Trammell, Carmen J.; Poore, Jesse H.
1994-01-01
There is growing interest in software 'certification', i.e., confirmation that software has performed satisfactorily under a defined certification protocol. Regulatory agencies, customers, and prospective reusers all want assurance that a defined product standard has been met. In other industries, products are typically certified under protocols in which random samples of the product are drawn, tests characteristic of operational use are applied, analytical or statistical inferences are made, and products meeting a standard are 'certified' as fit for use. A warranty statement is often issued upon satisfactory completion of a certification protocol. This paper outlines specific engineering practices that must be used to preserve the validity of the statistical certification testing protocol. The assumptions associated with a statistical experiment are given, and their implications for statistical testing of software are described.
Validation of the Hospital Ethical Climate Survey for older people care.
Suhonen, Riitta; Stolt, Minna; Katajisto, Jouko; Charalambous, Andreas; Olson, Linda L
2015-08-01
The exploration of the ethical climate in the care settings for older people is highlighted in the literature, and it has been associated with various aspects of clinical practice and nurses' jobs. However, ethical climate is seldom studied in the older people care context. Valid, reliable, feasible measures are needed for the measurement of ethical climate. This study aimed to test the reliability, validity, and sensitivity of the Hospital Ethical Climate Survey in healthcare settings for older people. A non-experimental cross-sectional study design was employed, and a survey using questionnaires, including the Hospital Ethical Climate Survey, was used for data collection. Data were analyzed using descriptive statistics, inferential statistics, and multivariable methods. Survey data were collected from a sample of nurses working in the care settings for older people in Finland (N = 1513, n = 874, response rate = 58%) in 2011. This study was conducted according to good scientific inquiry guidelines, and ethical approval was obtained from the university ethics committee. The mean score for the Hospital Ethical Climate Survey total was 3.85 (standard deviation = 0.56). Cronbach's alpha was 0.92. Principal component analysis provided evidence for factorial validity. LISREL provided evidence for construct validity based on goodness-of-fit statistics. Pearson's correlations of 0.68-0.90 were found between the sub-scales and the Hospital Ethical Climate Survey. The Hospital Ethical Climate Survey proved to be a valid and reliable tool for measuring ethical climate in care settings for older people, and was sensitive enough to reveal variations across clinical settings. The Finnish version, previously used mainly in hospital settings, thus proved valid for use in the care settings for older people.
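Cronbach's alpha, the internal-consistency statistic reported above, can be computed directly from item scores. The small data set here is hypothetical, not the survey's data:

```python
def cronbach_alpha(items):
    # items: one score list per item, all of equal length (one entry per respondent)
    k = len(items)
    n = len(items[0])
    def var(xs):
        # Sample variance with n-1 denominator
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    totals = [sum(item[i] for item in items) for i in range(n)]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical 3-item, 5-respondent example (not the HECS data)
scores = [
    [4, 3, 5, 4, 2],
    [4, 4, 5, 3, 2],
    [5, 3, 4, 4, 1],
]
print(round(cronbach_alpha(scores), 3))  # prints 0.902
```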
Further studies are due to analyze the factor structure and some items of the Hospital Ethical Climate Survey. © The Author(s) 2014.
[Psychometric properties and diagnostic value of 'lexical screening for aphasias'].
Pena-Chavez, R; Martinez-Jimenez, L; Lopez-Espinoza, M
2014-09-16
INTRODUCTION. Language assessment in persons with brain injury makes it possible to know whether they require language rehabilitation or not. Given the importance of a precise evaluation, assessment instruments must be valid and reliable, so as to avoid mistaken and subjective diagnoses. AIM. To validate 'lexical screening for aphasias' in a sample of 58 Chilean individuals. SUBJECTS AND METHODS. A screening-type language test, lasting 20 minutes and based on the lexical processing model devised by Patterson and Shewell (1987), was constructed. The sample was made up of two groups containing 29 aphasic subjects and 29 control subjects from different health centres in the regions of Biobio and Maule, Chile. Participants' ages ranged from 24 to 79 years, and they had between 0 and 17 years of schooling. Tests were carried out to determine discriminating validity, concurrent validity with the aphasia disorder assessment battery, reliability, sensitivity and specificity. RESULTS. The statistical analysis showed a high discriminating validity (p < 0.001), an acceptable mean concurrent validity with aphasia disorder assessment battery (rs = 0.65), high mean reliability (alpha = 0.87), moderate mean sensitivity (69%) and high mean specificity (86%). CONCLUSION. 'Lexical screening for aphasias' is valid and reliable for assessing language in persons with aphasias; it is sensitive for detecting aphasic subjects and is specific for precluding language disorders in persons with normal language abilities.
Hentschel, Annett G; Livesley, W John
2013-01-01
Recent developments in the classification of personality disorder, especially moves toward more dimensional systems, create the need to assess general personality disorder apart from individual differences in personality pathology. The General Assessment of Personality Disorder (GAPD) is a self-report questionnaire designed to evaluate general personality disorder. The measure evaluates 2 major components of disordered personality: self or identity problems and interpersonal dysfunction. This study explores whether there is a single factor reflecting general personality pathology as proposed by the Diagnostic and Statistical Manual of Mental Disorders (5th ed.), whether self-pathology has incremental validity over interpersonal pathology as measured by GAPD, and whether GAPD scales relate significantly to Diagnostic and Statistical Manual of Mental Disorders (4th ed. [DSM-IV]) personality disorders. Based on responses from a German psychiatric sample of 149 participants, parallel analysis yielded a 1-factor model. Self Pathology scales of the GAPD increased the predictive validity of the Interpersonal Pathology scales of the GAPD. The GAPD scales showed a moderate to high correlation for 9 of 12 DSM-IV personality disorders.
Longobardi, Francesco; Innamorato, Valentina; Di Gioia, Annalisa; Ventrella, Andrea; Lippolis, Vincenzo; Logrieco, Antonio F; Catucci, Lucia; Agostiano, Angela
2017-12-15
Lentil samples coming from two different countries, i.e. Italy and Canada, were analysed using untargeted ¹H NMR fingerprinting in combination with chemometrics in order to build models able to classify them according to their geographical origin. To this end, Soft Independent Modelling of Class Analogy (SIMCA), k-Nearest Neighbor (k-NN), Principal Component Analysis followed by Linear Discriminant Analysis (PCA-LDA) and Partial Least Squares-Discriminant Analysis (PLS-DA) were applied to the NMR data and the results were compared. The best combination of average recognition (100%) and cross-validation prediction abilities (96.7%) was obtained for the PCA-LDA. All the statistical models were validated both by using a test set and by carrying out a Monte Carlo Cross Validation: the obtained performances were found to be satisfying for all the models, with prediction abilities higher than 95% demonstrating the suitability of the developed methods. Finally, the metabolites that mostly contributed to the lentil discrimination were indicated. Copyright © 2017 Elsevier Ltd. All rights reserved.
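Monte Carlo cross-validation, as used to validate the models above, is just repeated random train/test splitting with the prediction ability averaged over splits. The toy two-feature data and the nearest-centroid classifier below are illustrative stand-ins for the NMR fingerprints and the SIMCA/PCA-LDA models:

```python
import random

random.seed(1)

# Toy two-class "fingerprint" data standing in for NMR features (hypothetical)
data = (
    [([random.gauss(0, 1), random.gauss(0, 1)], "Italy") for _ in range(30)]
    + [([random.gauss(3, 1), random.gauss(3, 1)], "Canada") for _ in range(30)]
)

def nearest_centroid_predict(train, x):
    # Minimal stand-in classifier; PCA-LDA etc. would replace this step
    dists = {}
    for label in {lab for _, lab in train}:
        pts = [f for f, lab in train if lab == label]
        centroid = [sum(p[i] for p in pts) / len(pts) for i in range(len(x))]
        dists[label] = sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(dists, key=dists.get)

def monte_carlo_cv(data, n_splits=50, test_frac=0.3):
    # Repeated random train/test splits; report mean prediction ability
    correct = total = 0
    for _ in range(n_splits):
        shuffled = data[:]
        random.shuffle(shuffled)
        cut = int(len(shuffled) * test_frac)
        test, train = shuffled[:cut], shuffled[cut:]
        for x, label in test:
            correct += nearest_centroid_predict(train, x) == label
            total += 1
    return correct / total

print(f"mean prediction ability: {monte_carlo_cv(data):.1%}")
```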
Evolution of high-mass star-forming regions .
NASA Astrophysics Data System (ADS)
Giannetti, A.; Leurini, S.; Wyrowski, F.; Urquhart, J.; König, C.; Csengeri, T.; Güsten, R.; Menten, K. M.
Observational identification of a coherent evolutionary sequence for high-mass star-forming regions is still missing. We use the progressive heating of the gas caused by the feedback of high-mass young stellar objects to prove the statistical validity of the most common schemes used to observationally define an evolutionary sequence for high-mass clumps, and identify which physical process dominates in the different phases. From the spectroscopic follow-ups carried out towards the TOP100 sample between 84 and 365 GHz, we selected several multiplets of CH3CN, CH3CCH, and CH3OH lines to derive the physical properties of the gas in the clumps along the evolutionary sequence. We demonstrate that the evolutionary sequence is statistically valid, and we define intervals in L/M separating the compression, collapse and accretion, and disruption phases. The first hot cores and ZAMS stars appear at L/M ≈ 10 L_⊙ M_⊙^-1.
Analyzing thematic maps and mapping for accuracy
Rosenfield, G.H.
1982-01-01
Two problems which exist while attempting to test the accuracy of thematic maps and mapping are: (1) evaluating the accuracy of thematic content, and (2) evaluating the effects of the variables on thematic mapping. Statistical analysis techniques are applicable to both these problems and include techniques for sampling the data and determining their accuracy. In addition, techniques for hypothesis testing, or inferential statistics, are used when comparing the effects of variables. A comprehensive and valid accuracy test of a classification project, such as thematic mapping from remotely sensed data, includes the following components of statistical analysis: (1) sample design, including the sample distribution, sample size, size of the sample unit, and sampling procedure; and (2) accuracy estimation, including estimation of the variance and confidence limits. Careful consideration must be given to the minimum sample size necessary to validate the accuracy of a given classification category. The results of an accuracy test are presented in a contingency table sometimes called a classification error matrix. Usually the rows represent the interpretation, and the columns represent the verification. The diagonal elements represent the correct classifications. The remaining elements of the rows represent errors by commission, and the remaining elements of the columns represent the errors of omission. For tests of hypothesis that compare variables, the general practice has been to use only the diagonal elements from several related classification error matrices. These data are arranged in the form of another contingency table. The columns of the table represent the different variables being compared, such as different scales of mapping. The rows represent the blocking characteristics, such as the various categories of classification.
The values in the cells of the tables might be the counts of correct classification or the binomial proportions of these counts divided by either the row totals or the column totals from the original classification error matrices. In hypothesis testing, when the results of tests of multiple sample cases prove to be significant, some form of statistical test must be used to separate any results that differ significantly from the others. In the past, many analyses of the data in this error matrix were made by comparing the relative magnitudes of the percentage of correct classifications, for either individual categories, the entire map or both. More rigorous analyses have used data transformations and (or) two-way classification analysis of variance. A more sophisticated data analysis approach would be to use the entire classification error matrices with the methods of discrete multivariate analysis or of multivariate analysis of variance.
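The accuracy measures derived from a classification error matrix (overall accuracy from the diagonal, errors of commission from the rows, errors of omission from the columns) can be sketched with a hypothetical three-category matrix:

```python
# Hypothetical 3-category classification error matrix:
# rows = interpretation (map), columns = verification (reference)
categories = ["forest", "water", "urban"]
matrix = [
    [50,  3,  2],   # mapped as forest
    [ 4, 40,  1],   # mapped as water
    [ 6,  2, 42],   # mapped as urban
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(matrix)))
overall_accuracy = correct / total

for i, cat in enumerate(categories):
    row_total = sum(matrix[i])
    col_total = sum(matrix[j][i] for j in range(len(matrix)))
    commission = (row_total - matrix[i][i]) / row_total  # off-diagonal row share
    omission = (col_total - matrix[i][i]) / col_total    # off-diagonal column share
    print(f"{cat}: commission {commission:.2f}, omission {omission:.2f}")

print(f"overall accuracy: {overall_accuracy:.2f}")
```

For these illustrative counts the overall accuracy is 132/150 = 0.88; the per-category lines show how commission and omission errors can differ for the same category.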
Loring, David W; Larrabee, Glenn J
2006-06-01
The Halstead-Reitan Battery has been instrumental in the development of neuropsychological practice in the United States. Although Reitan administered both the Wechsler-Bellevue Intelligence Scale and Halstead's test battery when evaluating Halstead's theory of biologic intelligence, the relative sensitivity of each test battery to brain damage continues to be an area of controversy. Because Reitan did not perform direct parametric analysis to contrast group performances, we reanalyze Reitan's original validation data from both Halstead (Reitan, 1955) and Wechsler batteries (Reitan, 1959a) and calculate effect sizes and probability levels using traditional parametric approaches. Eight of the 10 tests comprising Halstead's original Impairment Index, as well as the Impairment Index itself, statistically differentiated patients with unequivocal brain damage from controls. In addition, 13 of 14 Wechsler measures including Full-Scale IQ also differed statistically between groups (Brain Damage Full-Scale IQ = 96.2; Control Group Full Scale IQ = 112.6). We suggest that differences in the statistical properties of each battery (e.g., raw scores vs. standardized scores) likely contribute to classification characteristics including test sensitivity and specificity.
Comparison of the predictive validity of diagnosis-based risk adjusters for clinical outcomes.
Petersen, Laura A; Pietz, Kenneth; Woodard, LeChauncy D; Byrne, Margaret
2005-01-01
Many possible methods of risk adjustment exist, but there is a dearth of comparative data on their performance. We compared the predictive validity of 2 widely used methods (Diagnostic Cost Groups [DCGs] and Adjusted Clinical Groups [ACGs]) for 2 clinical outcomes using a large national sample of patients. We studied all patients who used Veterans Health Administration (VA) medical services in fiscal year (FY) 2001 (n = 3,069,168) and assigned both a DCG and an ACG to each. We used logistic regression analyses to compare predictive ability for death or long-term care (LTC) hospitalization for age/gender models, DCG models, and ACG models. We also assessed the effect of adding age to the DCG and ACG models. Patients in the highest DCG categories, indicating higher severity of illness, were more likely to die or to require LTC hospitalization. Surprisingly, the age/gender model predicted death slightly more accurately than the ACG model (c-statistic of 0.710 versus 0.700, respectively). The addition of age to the ACG model improved the c-statistic to 0.768. The highest c-statistic for prediction of death was obtained with a DCG/age model (0.830). The lowest c-statistics were obtained for age/gender models for LTC hospitalization (c-statistic 0.593). The c-statistic for use of ACGs to predict LTC hospitalization was 0.783, and improved to 0.792 with the addition of age. The c-statistics for use of DCGs and DCG/age to predict LTC hospitalization were 0.885 and 0.890, respectively, indicating the best prediction. We found that risk adjusters based upon diagnoses predicted an increased likelihood of death or LTC hospitalization, exhibiting good predictive validity. In this comparative analysis using VA data, DCG models were generally superior to ACG models in predicting clinical outcomes, although ACG model performance was enhanced by the addition of age.
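The c-statistic reported throughout the comparison above is the concordance probability: the chance that a randomly chosen patient who experienced the outcome was assigned a higher risk score than one who did not. A minimal computation, on hypothetical risk scores rather than the VA data, is:

```python
def c_statistic(scores, outcomes):
    # Probability that a random case with the outcome gets a higher
    # risk score than a random case without it (ties count 1/2)
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks and death outcomes (1 = died)
scores   = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
outcomes = [1,   1,   0,   1,   0,   0,   1,   0]
print(c_statistic(scores, outcomes))  # prints 0.75
```

A value of 0.5 means the model discriminates no better than chance; values near the paper's 0.89 indicate strong discrimination.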
2015-09-01
accuracy and validity of selected basic pay and entitlement transactions the IPA tested. The IPA tested a statistical sample of 405 leave and earnings...
Michael L. Hoppus; Andrew J. Lister
2002-01-01
A Landsat TM classification method (iterative guided spectral class rejection) produced a forest cover map of southern West Virginia that provided the stratification layer for producing estimates of timberland area from Forest Service FIA ground plots using a stratified sampling technique. These same high quality and expensive FIA ground plots provided ground reference...
Systematic review of prediction models for delirium in the older adult inpatient.
Lindroth, Heidi; Bratzke, Lisa; Purvis, Suzanne; Brown, Roger; Coburn, Mark; Mrkobrada, Marko; Chan, Matthew T V; Davis, Daniel H J; Pandharipande, Pratik; Carlsson, Cynthia M; Sanders, Robert D
2018-04-28
To identify existing prognostic delirium prediction models and evaluate their validity and statistical methodology in the older adult (≥60 years) acute hospital population. Systematic review. PubMed, CINAHL, PsychINFO, SocINFO, Cochrane, Web of Science and Embase were searched from 1 January 1990 to 31 December 2016. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses and CHARMS Statement guided protocol development. Inclusion criteria: age >60 years, inpatient, developed/validated a prognostic delirium prediction model. Exclusion criteria: alcohol-related delirium, sample size ≤50. The primary performance measures were calibration and discrimination statistics. Two authors independently conducted the search and extracted data. The synthesis of data was done by the first author. Disagreement was resolved by the mentoring author. The initial search resulted in 7,502 studies. Following full-text review of 192 studies, 33 were excluded based on age criteria (<60 years) and 27 met the defined criteria. Twenty-three delirium prediction models were identified, 14 were externally validated and 3 were internally validated. The following populations were represented: 11 medical, 3 medical/surgical and 13 surgical. The assessment of delirium was often non-systematic, resulting in varied incidence. Fourteen models were externally validated with an area under the receiver operating characteristic curve ranging from 0.52 to 0.94. Limitations in design, data collection methods and model metric reporting statistics were identified. Delirium prediction models for older adults show variable and typically inadequate predictive capabilities. Our review highlights the need for development of robust models to predict delirium in older inpatients. We provide recommendations for the development of such models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Valero-Aguayo, Luis; Ferro-García, Rafael; López-Bermúdez, Miguel Ángel; Selva-López de Huralde, María de los Ángeles
2014-01-01
The Experiencing of Self Scale (EOSS) was created to evaluate the experience of the personal self, within the field of Functional Analytic Psychotherapy. This paper presents a study of the reliability and validity of the EOSS in a Spanish sample. The study sample, chosen from 24 different centres, comprised 1,040 participants aged between 18 and 75, of whom 32% were men and 68% women. Clinical cases made up 32.7% of the sample, whereas 67.3% of participants had no known problem. To obtain evidence of convergent validity, other questionnaires related to the self (EPQ-R, DES, RSES) were used for comparison. The EOSS showed high internal consistency (Cronbach's α = .941) and significantly high correlations with the EPQ-R Neuroticism scale and the DES Dissociation scale, while showing negative correlations with the Rosenberg Self-Esteem Scale (RSES). The EOSS revealed 4 principal factors: a self in close relationships, a self with casual social relationships, a self in general and a positive self-concept. Significant statistical differences were found between the clinical and standard sample, the former showing a higher average. The EOSS had high internal consistency, showing evidence of convergent validity with similar scales and proving useful for the assessment of people with psychological problems related to the self.
Feu, Sebastián; Ibáñez, Sergio José; Graça, Amândio; Sampaio, Jaime
2007-11-01
The purpose of this study was to develop a questionnaire to investigate volleyball coaches' orientations toward the coaching process. The study was preceded by four developmental stages in order to improve user understanding, validate the content, and refine the psychometric properties of the instrument. Participants for the reliability and validity study were 334 Spanish volleyball team coaches, 86.5% men and 13.2% women. The following 6 factors emerged from the exploratory factor analysis: team-work orientation, technological orientation, innovative orientation, dialogue orientation, directive orientation, and social climate orientation. Statistical results indicated that the instrument produced reliable and valid scores in all the obtained factors (α > .70), showing that this questionnaire is a useful tool to examine coaches' orientations towards coaching.
Rank score and permutation testing alternatives for regression quantile estimates
Cade, B.S.; Richards, J.D.; Mielke, P.W.
2006-01-01
Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20–300), and central to upper quantiles (0.50–0.99). Test statistics evaluated were the conventional quantile rank score T statistic distributed as a χ² random variable with q degrees of freedom (where q parameters are constrained by H0) and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvements in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles.
Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application relating trout densities to stream channel width:depth.
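The permutation logic behind such tests can be sketched generically: permute the response to simulate the null hypothesis of no association, recompute the test statistic, and take the proportion of permuted statistics at least as extreme as the observed one. The sketch below uses an ordinary least-squares slope as the statistic in place of the quantile rank score F statistic, and toy data in place of the trout-density application:

```python
import random

random.seed(2)

# Toy data with a positive trend (illustrative only)
xs = list(range(30))
ys = [0.5 * x + random.gauss(0, 3) for x in xs]

def slope(x, y):
    # OLS slope as a simple stand-in test statistic
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

observed = slope(xs, ys)
perms = 999
extreme = 0
for _ in range(perms):
    shuffled = ys[:]
    random.shuffle(shuffled)  # breaks any x-y association under H0
    extreme += abs(slope(xs, shuffled)) >= abs(observed)
p_value = (extreme + 1) / (perms + 1)
print(f"slope = {observed:.3f}, permutation p = {p_value:.3f}")
```

The `+ 1` in numerator and denominator counts the observed arrangement itself, which keeps the p-value valid (never exactly zero) for a finite number of permutations.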
2014-01-01
Background Thresholds for statistical significance are insufficiently demonstrated by 95% confidence intervals or P-values when assessing results from randomised clinical trials. First, a P-value only shows the probability of getting a result assuming that the null hypothesis is true and does not reflect the probability of getting a result assuming an alternative hypothesis to the null hypothesis is true. Second, a confidence interval or a P-value showing significance may be caused by multiplicity. Third, statistical significance does not necessarily result in clinical significance. Therefore, assessment of intervention effects in randomised clinical trials deserves more rigour in order to become more valid. Methods Several methodologies for assessing the statistical and clinical significance of intervention effects in randomised clinical trials were considered. Balancing simplicity and comprehensiveness, a simple five-step procedure was developed. Results For a more valid assessment of results from a randomised clinical trial we propose the following five steps: (1) report the confidence intervals and the exact P-values; (2) report Bayes factor for the primary outcome, being the ratio of the probability that a given trial result is compatible with a ‘null’ effect (corresponding to the P-value) divided by the probability that the trial result is compatible with the intervention effect hypothesised in the sample size calculation; (3) adjust the confidence intervals and the statistical significance threshold if the trial is stopped early or if interim analyses have been conducted; (4) adjust the confidence intervals and the P-values for multiplicity due to number of outcome comparisons; and (5) assess clinical significance of the trial results. Conclusions If the proposed five-step procedure is followed, this may increase the validity of assessments of intervention effects in randomised clinical trials. PMID:24588900
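Step (2)'s Bayes factor, the probability of the observed trial result under the null divided by its probability under the hypothesised intervention effect, can be sketched with normal likelihoods on the z-scale. The effect sizes and standard error below are hypothetical:

```python
import math

def normal_pdf(x, mu=0.0, sd=1.0):
    # Density of a normal distribution at x
    return math.exp(-((x - mu) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

# Hypothetical trial: observed effect, its standard error, and the
# effect hypothesised in the sample size calculation
observed_effect = 0.30
hypothesised_effect = 0.40
se = 0.12

z_obs = observed_effect / se          # observed result on the z-scale
z_alt = hypothesised_effect / se      # hypothesised effect on the z-scale

# Bayes factor = P(result | null) / P(result | hypothesised effect)
bayes_factor = normal_pdf(z_obs, mu=0.0) / normal_pdf(z_obs, mu=z_alt)
print(f"Bayes factor = {bayes_factor:.3f}")
```

For these hypothetical numbers the Bayes factor is about 0.06: a value well below 1 means the trial result is far more compatible with the hypothesised intervention effect than with the null.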
Data-adaptive test statistics for microarray data.
Mukherjee, Sach; Roberts, Stephen J; van der Laan, Mark J
2005-09-01
An important task in microarray data analysis is the selection of genes that are differentially expressed between different tissue samples, such as healthy and diseased. However, microarray data contain an enormous number of dimensions (genes) and very few samples (arrays), a mismatch which poses fundamental statistical problems for the selection process that have defied easy resolution. In this paper, we present a novel approach to the selection of differentially expressed genes in which test statistics are learned from data using a simple notion of reproducibility in selection results as the learning criterion. Reproducibility, as we define it, can be computed without any knowledge of the 'ground-truth', but takes advantage of certain properties of microarray data to provide an asymptotically valid guide to expected loss under the true data-generating distribution. We are therefore able to indirectly minimize expected loss, and obtain results substantially more robust than conventional methods. We apply our method to simulated and oligonucleotide array data. By request to the corresponding author.
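The reproducibility idea above, scoring a selection procedure by how stably it picks the same genes across random half-splits of the arrays, needs no knowledge of the ground truth. The sketch below applies it to a fixed mean-difference statistic on simulated data; the paper's method goes further and learns the statistic itself, so this is only a simplified stand-in:

```python
import random

random.seed(3)

# Simulated two-condition expression data: 200 genes, 6 arrays per condition;
# the first 20 genes carry a true mean shift (hypothetical, for illustration).
genes, reps = 200, 6
a = [[random.gauss(0, 1) for _ in range(reps)] for _ in range(genes)]
b = [[random.gauss(0.8 if g < 20 else 0, 1) for _ in range(reps)] for g in range(genes)]

def top_k(cols, k=20):
    # Rank genes by absolute mean difference computed on a subset of arrays
    scores = []
    for g in range(genes):
        ma = sum(a[g][c] for c in cols) / len(cols)
        mb = sum(b[g][c] for c in cols) / len(cols)
        scores.append((abs(mb - ma), g))
    return {g for _, g in sorted(scores, reverse=True)[:k]}

def reproducibility(splits=20, k=20):
    # Mean overlap of top-k selections across random half-splits of the arrays
    total = 0.0
    for _ in range(splits):
        cols = list(range(reps))
        random.shuffle(cols)
        total += len(top_k(cols[:3]) & top_k(cols[3:])) / k
    return total / splits

print(f"mean top-20 overlap across half-splits: {reproducibility():.2f}")
```

Comparing this overlap score across candidate statistics, rather than inspecting it for one, is what lets reproducibility act as a learning criterion.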
[Health-related behavior in a sample of Brazilian college students: gender differences].
Colares, Viviane; Franca, Carolina da; Gonzalez, Emília
2009-03-01
This study investigated whether undergraduate students' health-risk behaviors differed according to gender. The sample consisted of 382 subjects, aged 20-29 years, from public universities in Pernambuco State, Brazil. Data were collected using the National College Health Risk Behavior Survey, previously validated in Portuguese. Descriptive and inferential statistical techniques were used. Associations were analyzed with the chi-square test or Fisher's exact test. Statistical significance was set at p ≤ 0.05. In general, females engaged in the following risk behaviors less frequently than males: alcohol consumption (p = 0.005), smoking (p = 0.002), experimenting with marijuana (p = 0.002), consumption of inhalants (p ≤ 0.001), steroid use (p = 0.003), carrying weapons (p = 0.001), and involvement in physical fights (p = 0.014). Meanwhile, female students displayed more concern about losing or maintaining weight, although they exercised less frequently than males. The findings thus showed statistically different health behaviors between genders. In conclusion, different approaches need to be used for the two genders.
Resilience Scale-25 Spanish version: validation and assessment in eating disorders.
Las Hayas, Carlota; Calvete, Esther; Gómez del Barrio, Andrés; Beato, Luís; Muñoz, Pedro; Padierna, Jesús Ángel
2014-08-01
To validate into Spanish the Wagnild and Young Resilience Scale - 25 (RS-25), assess and compare the scores on the scale among women from the general population, eating disorder (ED) patients and recovered ED patients. This is a cross-sectional study. ED participants were invited to participate by their respective therapists. The sample from the general population was gathered via an open online survey. Participants (N general population=279; N ED patients=124; and N recovered ED patients=45) completed the RS-25, the World Health Organization Quality of Life Scale-BREF and the Hospital Anxiety and Depression Scale. Mean age of participants ranged from 28.87 to 30.42 years old. Statistical analysis included a multi-group confirmatory factor analysis and ANOVA. The two-factor model of the RS-25 produced excellent fit indexes. Measurement invariance across samples was generally supported. The ANOVA found statistically significant differences in the RS-25 mean scores between the ED patients (Mean=103.13, SD=31.32) and the recovered ED participants (Mean=138.42, SD=22.26) and between the ED patients and the general population participants (Mean=136.63, SD=19.56). The Spanish version of the RS-25 is a psychometrically sound measurement tool in samples of ED patients. Resilience is lower in people diagnosed with ED than in recovered individuals and the general population. Copyright © 2014 Elsevier Ltd. All rights reserved.
Jilcott Pitts, Stephanie Bell; Jahns, Lisa; Wu, Qiang; Moran, Nancy E; Bell, Ronny A; Truesdale, Kimberly P; Laska, Melissa N
2018-06-01
To assess the feasibility, reliability and validity of reflection spectroscopy (RS) to assess skin carotenoids in a racially diverse sample. Study 1 was a cross-sectional study of corner store customers (n 479) who completed the National Cancer Institute Fruit and Vegetable Screener as well as RS measures. Feasibility was assessed by examining the time it took to complete three RS measures, reliability was assessed by examining the variation between three RS measures, and validity was examined by correlation with self-reported fruit and vegetable consumption. In Study 2, validity was assessed in a smaller sample (n 30) by examining associations between RS measures and dietary carotenoids, fruits and vegetables as calculated from a validated FFQ and plasma carotenoids. Eastern North Carolina, USA. It took on average 94·0 s to complete three RS readings per person. The average variation between three readings for each participant was 6·8 %. In Study 2, in models adjusted for age, race and sex, there were statistically significant associations between RS measures and (i) FFQ-estimated carotenoid intake (P<0·0001); (ii) FFQ-estimated fruit and vegetable consumption (P<0·010); and (iii) plasma carotenoids (P<0·0001). RS is a potentially improved method to approximate fruit and vegetable consumption among diverse participants. RS is portable and easy to use in field-based public health nutrition settings. More research is needed to investigate validity and sensitivity in diverse populations.
Wang, Chang-Hwai; Lee, Jin-Chuan; Yuan, Yu-Hsi
2014-01-01
The purpose of this research is to establish and verify the psychometric and structural properties of the self-report Chinese Sexual Assault Symptom Scale (C-SASS) to assess the trauma experienced by Chinese victims of sexual assault. An earlier version of the C-SASS was constructed using a modified list of the same trauma symptoms administered to an American sample and used to develop and validate the Sexual Assault Symptom Scale II (SASS II). The rationale of this study is to revise the earlier version of the C-SASS, using a larger and more representative sample and more robust statistical analysis than in earlier research, to permit a more thorough examination of the instrument and further confirm the dimensions of sexual assault trauma in Chinese victims of rape. In this study, a sample of 418 victims from northern Taiwan was collected to confirm the reliability and validity of the C-SASS. Exploratory factor analysis yielded five common factors: Safety Fears, Self-Blame, Health Fears, Anger and Emotional Lability, and Fears About the Criminal Justice System. Further tests of the validity and composite reliability of the C-SASS were provided by the structural equation modeling (SEM). The results indicated that the C-SASS was a brief, valid, and reliable instrument for assessing sexual assault trauma among Chinese victims in Taiwan. The scale can be used to evaluate victims in sexual assault treatment centers around Taiwan, as well as to capture the characteristics of sexual assault trauma among Chinese victims.
Pontes, Halley M.; Macur, Mirna; Griffiths, Mark D.
2016-01-01
Background and aims Since the inclusion of Internet Gaming Disorder (IGD) in the latest (fifth) edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) as a tentative disorder, a few psychometric screening instruments have been developed to assess IGD, including the 9-item Internet Gaming Disorder Scale – Short-Form (IGDS9-SF) – a short, valid, and reliable instrument. Methods Due to the lack of research on IGD in Slovenia, this study aimed to examine the psychometric properties of the IGDS9-SF in addition to investigating the prevalence rates of IGD in a nationally representative sample of eighth graders from Slovenia (N = 1,071). Results The IGDS9-SF underwent rigorous psychometric scrutiny in terms of validity and reliability. Construct validation was investigated with confirmatory factor analysis to examine the factorial structure of the IGDS9-SF, and a unidimensional structure appeared to fit the data well. Concurrent and criterion validation were also investigated by examining the association between IGD and relevant psychosocial and game-related measures, which warranted these forms of validity. In terms of reliability, the Slovenian version of the IGDS9-SF obtained excellent results regarding its internal consistency at different levels, and the test appears to be a valid and reliable instrument to assess IGD among Slovenian youth. Finally, the prevalence rates of IGD were found to be around 2.5% in the whole sample and 3.1% among gamers. Discussion and conclusion Taken together, these results illustrate the suitability of the IGDS9-SF and warrant further research on IGD in Slovenia. PMID:27363464
A new class of enhanced kinetic sampling methods for building Markov state models
NASA Astrophysics Data System (ADS)
Bhoutekar, Arti; Ghosh, Susmita; Bhattacharya, Swati; Chatterjee, Abhijit
2017-10-01
Markov state models (MSMs) and other related kinetic network models are frequently used to study the long-timescale dynamical behavior of biomolecular and materials systems. MSMs are often constructed bottom-up using brute-force molecular dynamics (MD) simulations when the model contains a large number of states and kinetic pathways that are not known a priori. However, the resulting network generally encompasses only parts of the configurational space, and regardless of any additional MD performed, several states and pathways will still remain missing. This implies that the duration for which the MSM can faithfully capture the true dynamics, which we term as the validity time for the MSM, is always finite and unfortunately much shorter than the MD time invested to construct the model. A general framework that relates the kinetic uncertainty in the model to the validity time, missing states and pathways, network topology, and statistical sampling is presented. Performing additional calculations for frequently-sampled states/pathways may not alter the MSM validity time. A new class of enhanced kinetic sampling techniques is introduced that aims at targeting rare states/pathways that contribute most to the uncertainty so that the validity time is boosted in an effective manner. Examples including straightforward 1D energy landscapes, lattice models, and biomolecular systems are provided to illustrate the application of the method. Developments presented here will be of interest to the kinetic Monte Carlo community as well.
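The bottom-up construction step described above — counting transitions observed in MD and row-normalizing them into a transition matrix — can be sketched in a few lines. This is a minimal illustration of standard MSM estimation, not the paper's enhanced kinetic sampling method, and the toy trajectory is invented:

```python
import numpy as np

def estimate_msm(traj, n_states, lag=1):
    """Estimate an MSM transition matrix from a discrete state trajectory
    by counting transitions at the given lag time and row-normalizing."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(traj[:-lag], traj[lag:]):
        counts[i, j] += 1.0
    # States never sampled get an absorbing self-loop so rows still sum to 1;
    # in practice such "missing states" are exactly what limits validity time.
    empty = counts.sum(axis=1) == 0
    counts[empty, empty] = 1.0
    return counts / counts.sum(axis=1, keepdims=True)

traj = [0, 0, 1, 2, 1, 0, 1, 1, 2, 2, 0, 1]
T = estimate_msm(traj, 3)
```

States and pathways absent from `traj` simply never appear in `counts`, which is why adding MD to well-sampled states cannot extend the model's validity time.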
Psychometrics of chronic liver disease questionnaire in Chinese chronic hepatitis B patients.
Zhou, Kai-Na; Zhang, Min; Wu, Qian; Ji, Zhen-Hao; Zhang, Xiao-Mei; Zhuang, Gui-Hua
2013-06-14
To evaluate psychometrics of the Chinese (mainland) chronic liver disease questionnaire (CLDQ) in patients with chronic hepatitis B (CHB). A cross-sectional sample of 460 Chinese patients with CHB was selected from the Outpatient Department of the Eighth Hospital of Xi'an, including CHB (CHB without cirrhosis) (n = 323) and CHB-related cirrhosis (n = 137). The psychometrics evaluated include reliability, validity and sensitivity. Internal consistency reliability was measured using Cronbach's α. Convergent and discriminant validity was evaluated by item-scale correlation. Factorial validity was explored by principal component analysis with varimax rotation. Sensitivity was assessed using Cohen's effect size (ES), and independent sample t test between CHB and CHB-related cirrhosis groups and between alanine aminotransferase (ALT) normal and abnormal groups after stratifying the disease (CHB and CHB-related cirrhosis). Internal consistency reliability of the CLDQ was 0.83 (range: 0.65-0.90). Most of the hypothesized item-scale correlations were 0.40 or over, and all of such hypothesized correlations were higher than the alternative ones, indicating satisfactory convergent and discriminant validity. Six factors were extracted after varimax rotation from the 29 items of CLDQ. The eligible Cohen's ES with statistically significant independent sample t test was found in the overall CLDQ and abdominal, systemic, activity scales (CHB vs CHB-related cirrhosis), and in the overall CLDQ and abdominal scale in the stratification of patients with CHB (ALT normal vs abnormal). The CLDQ has acceptable reliability, validity and sensitivity in Chinese (mainland) patients with CHB.
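The Cohen's effect size used in the sensitivity analysis above can be computed directly from group summary statistics. A minimal sketch with a pooled standard deviation; the example numbers are invented, not the study's data:

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d between two independent groups, using the pooled SD."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    return (mean1 - mean2) / pooled_sd

# Hypothetical scale scores for two patient groups:
d = cohens_d(mean1=5.2, sd1=1.1, n1=323, mean2=4.6, sd2=1.3, n2=137)
```

By the usual convention, |d| around 0.2 is a small effect, 0.5 medium, and 0.8 large.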
Montgomery, Eric; Gao, Chen; de Luca, Julie; Bower, Jessie; Attwood, Kristropher; Ylagan, Lourdes
2014-12-01
The Cellient® cell block system has become available as an alternative, partially automated method to create cell blocks in cytology. We sought to show a validation method for immunohistochemical (IHC) staining on the Cellient cell block system (CCB) in comparison with the formalin-fixed, paraffin-embedded traditional cell block (TCB). Immunohistochemical staining was performed using 31 antibodies on 38 patient samples for a total of 326 slides. Split samples were processed using both methods by following the Cellient® manufacturer's recommendations for the Cellient cell block (CCB) and the Histogel method for preparing the traditional cell block (TCB). Interpretation was performed by three pathologists and two cytotechnologists. Immunohistochemical stains were scored as: 0/1+ (negative) and 2/3+ (positive). Inter-rater agreement for each antibody was evaluated for CCB and TCB, as well as the intra-rater agreement between TCB and CCB between observers. Interobserver staining concordance for the TCB was obtained with statistical significance (P < 0.05) in 24 of 31 antibodies. Interobserver staining concordance for the CCB was obtained with statistical significance in 27 of 31 antibodies. Intra-observer staining concordance between TCB and CCB was obtained with statistical significance in 24 of 31 antibodies tested. In conclusion, immunohistochemical stains on cytologic specimens processed by the Cellient system are reliable and concordant with stains performed on the same split samples processed via a formalin-fixed, paraffin-embedded (FFPE) block. The Cellient system is a welcome adjunct to cytology work-flow by producing cell block material of sufficient quality to allow the use of routine IHC. © 2014 Wiley Periodicals, Inc.
Errors in radial velocity variance from Doppler wind lidar
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, H.; Barthelmie, R. J.; Doubrawa, P.
2016-08-29
A high-fidelity lidar turbulence measurement technique relies on accurate estimates of radial velocity variance, which are subject to both systematic and random errors determined by the autocorrelation function of radial velocity, the sampling rate, and the sampling duration. Our paper quantifies the effect of volumetric averaging in lidar radial velocity measurements on the autocorrelation function, and the dependence of the systematic and random errors on the sampling duration, using both statistically simulated and observed data. For current-generation scanning lidars and sampling durations of about 30 min and longer, during which the stationarity assumption is valid for atmospheric flows, the systematic error is negligible but the random error exceeds about 10%.
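The scaling of the random error with sampling duration can be illustrated with the standard stationary-turbulence approximation relating the relative error of a measured variance to the integral timescale of the signal. This is a Lenschow-style estimate, not the paper's full derivation, and the 10 s integral timescale is an invented example value:

```python
import math

def variance_random_error(integral_timescale_s, duration_s):
    """Approximate relative random error of a measured velocity variance
    for a stationary record: sqrt(2 * tau / T), where tau is the integral
    timescale and T the sampling duration (Lenschow-style estimate)."""
    return math.sqrt(2.0 * integral_timescale_s / duration_s)

# e.g. a 10 s integral timescale over a 30 min (1800 s) record:
err = variance_random_error(10.0, 30 * 60)
```

For these example values the estimate comes out near 10%, consistent with the order of magnitude quoted in the abstract; the error shrinks only as the square root of the sampling duration.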
Hybrid Gibbs Sampling and MCMC for CMB Analysis at Small Angular Scales
NASA Technical Reports Server (NTRS)
Jewell, Jeffrey B.; Eriksen, H. K.; Wandelt, B. D.; Gorski, K. M.; Huey, G.; O'Dwyer, I. J.; Dickinson, C.; Banday, A. J.; Lawrence, C. R.
2008-01-01
A) Gibbs Sampling has now been validated as an efficient, statistically exact, and practically useful method for "low-L" (as demonstrated on WMAP temperature polarization data). B) We are extending Gibbs sampling to directly propagate uncertainties in both foreground and instrument models to total uncertainty in cosmological parameters for the entire range of angular scales relevant for Planck. C) Made possible by inclusion of foreground model parameters in Gibbs sampling and hybrid MCMC and Gibbs sampling for the low signal to noise (high-L) regime. D) Future items to be included in the Bayesian framework include: 1) Integration with hybrid likelihood (or posterior) code for cosmological parameters; 2) Inclusion of other uncertainties in instrumental systematics (i.e., beam uncertainties, noise estimation, calibration errors, and others).
Yildirim, Aysegul; Akinci, Fevzi; Gozu, Hulya; Sargin, Haluk; Orbay, Ekrem; Sargin, Mehmet
2007-06-01
The aim of this study was to test the validity and reliability of the Turkish version of the diabetes quality of life (DQOL) questionnaire for use with patients with diabetes. The Turkish version of the generic quality of life (QoL) scale 15D and the DQOL, socio-demographics and clinical parameter characteristics were administered to 150 patients with type 2 diabetes. Study participants were randomly sampled from the Endocrinology and Diabetes Outpatient Department of Dr. Lutfi Kirdar Kartal Education and Research Hospital in Istanbul, Turkey. The Cronbach alpha coefficient of the overall DQOL scale was 0.89; the Cronbach alpha coefficient ranged from 0.80 to 0.94 for subscales. Distress, discomfort and symptoms, depression, mobility, usual activities, and vitality on the 15D scale had statistically significant correlations with social/vocational worry and diabetes-related worry on the DQOL scale, indicating good convergent validity. Factor analysis identified four subscales: "satisfaction", "impact", "diabetes-related worry", and "social/vocational worry". Statistical analyses showed that the Turkish version of the DQOL is a valid and reliable instrument to measure disease-related QoL in patients with diabetes. It is a simple and quick screening tool with about 15 +/- 5.8 min administration time for measuring QoL in this population.
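Internal-consistency coefficients like the Cronbach's alpha values reported above can be computed from any respondents-by-items score matrix. A minimal sketch; the toy data are invented, not the study's responses:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1.0)) * (1.0 - item_variances.sum() / total_variance)

# Toy data: four items that all track a common trait plus noise,
# so internal consistency should be high.
rng = np.random.default_rng(0)
trait = rng.normal(size=100)
items = np.column_stack(
    [trait + 0.5 * rng.normal(size=100) for _ in range(4)]
)
alpha = cronbach_alpha(items)
```

Values of 0.80-0.94, as in the subscales above, indicate good to excellent internal consistency by the usual rules of thumb.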
Ortuño-Sierra, Javier; Aritio-Solana, Rebeca; Inchausti, Félix; Chocarro de Luis, Edurne; Lucas Molina, Beatriz; Pérez de Albéniz, Alicia; Fonseca-Pedrero, Eduardo
2017-01-01
The main purpose of the present study was to assess the depressive symptomatology and to gather new validity evidences of the Reynolds Depression Scale-Short form (RADS-SF) in a representative sample of youths. The sample consisted of 2914 adolescents with a mean age of 15.85 years (SD = 1.68). We calculated the descriptive statistics and internal consistency of the RADS-SF scores. Also, confirmatory factor analyses (CFAs) at the item level and successive multigroup CFAs to test measurement invariance were conducted. Latent mean differences across gender and educational level groups were estimated, and finally, we studied the sources of validity evidences with other external variables. The level of internal consistency of the RADS-SF Total score by means of ordinal alpha was .89. Results from CFAs showed that the one-dimensional model displayed appropriate goodness-of-fit indices, with a CFI value over .95 and an RMSEA value under .08. In addition, the results support the strong measurement invariance of the RADS-SF scores across gender and age. When latent means were compared, statistically significant differences were found by gender and age. Females scored 0.347 higher than males on the Depression latent variable, whereas older adolescents scored 0.111 higher than the younger group. In addition, the RADS-SF score was associated with the RADS scores. The results suggest that the RADS-SF could be used as an efficient screening test to assess self-reported depressive symptoms in adolescents from the general population.
Harris, Alex Hs; Kuo, Alfred C; Bowe, Thomas; Gupta, Shalini; Nordin, David; Giori, Nicholas J
2018-05-01
Statistical models to preoperatively predict patients' risk of death and major complications after total joint arthroplasty (TJA) could improve the quality of preoperative management and informed consent. Although risk models for TJA exist, they have limitations including poor transparency and/or unknown or poor performance. Thus, it is currently impossible to know how well currently available models predict short-term complications after TJA, or if newly developed models are more accurate. We sought to develop and conduct cross-validation of predictive risk models, and report details and performance metrics as benchmarks. Over 90 preoperative variables were used as candidate predictors of death and major complications within 30 days for Veterans Health Administration patients with osteoarthritis who underwent TJA. Data were split into 3 samples: for selection of model tuning parameters, model development, and cross-validation. C-indexes (discrimination) and calibration plots were produced. A total of 70,569 patients diagnosed with osteoarthritis who received primary TJA were included. C-statistics and bootstrapped confidence intervals for the cross-validation of the boosted regression models were highest for cardiac complications (0.75; 0.71-0.79) and 30-day mortality (0.73; 0.66-0.79) and lowest for deep vein thrombosis (0.59; 0.55-0.64) and return to the operating room (0.60; 0.57-0.63). Moderately accurate predictive models of 30-day mortality and cardiac complications after TJA in Veterans Health Administration patients were developed and internally cross-validated. By reporting model coefficients and performance metrics, other model developers can test these models on new samples and have a procedure- and indication-specific benchmark to surpass. Published by Elsevier Inc.
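For a binary outcome, the C-statistic reported above reduces to the probability that a randomly chosen patient with the event is scored higher by the model than a randomly chosen patient without it. A minimal sketch with invented toy scores, not the study's boosted regression models:

```python
def c_index(y_true, y_score):
    """Concordance (c) statistic for binary outcomes: the fraction of
    event/non-event pairs in which the event case has the higher score.
    Tied scores count as half-concordant. Requires both classes present."""
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    nonevents = [s for y, s in zip(y_true, y_score) if y == 0]
    pairs = concordant = 0.0
    for e in events:
        for n in nonevents:
            pairs += 1.0
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / pairs

perfect = c_index([1, 1, 0, 0], [0.9, 0.8, 0.7, 0.1])
```

A value of 0.5 corresponds to chance discrimination, so the 0.59-0.60 figures above for deep vein thrombosis and return to the operating room indicate only slightly better-than-chance ranking.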
NASA Astrophysics Data System (ADS)
Uhlemann, C.; Feix, M.; Codis, S.; Pichon, C.; Bernardeau, F.; L'Huillier, B.; Kim, J.; Hong, S. E.; Laigle, C.; Park, C.; Shin, J.; Pogosyan, D.
2018-02-01
Starting from a very accurate model for density-in-cells statistics of dark matter based on large deviation theory, a bias model for the tracer density in spheres is formulated. It adopts a mean bias relation based on a quadratic bias model to relate the log-densities of dark matter to those of mass-weighted dark haloes in real and redshift space. The validity of the parametrized bias model is established using a parametrization-independent extraction of the bias function. This average bias model is then combined with the dark matter PDF, neglecting any scatter around it: it nevertheless yields an excellent model for densities-in-cells statistics of mass tracers that is parametrized in terms of the underlying dark matter variance and three bias parameters. The procedure is validated on measurements of both the one- and two-point statistics of subhalo densities in the state-of-the-art Horizon Run 4 simulation showing excellent agreement for measured dark matter variance and bias parameters. Finally, it is demonstrated that this formalism allows for a joint estimation of the non-linear dark matter variance and the bias parameters using solely the statistics of subhaloes. Having verified that galaxy counts in hydrodynamical simulations sampled on a scale of 10 Mpc h-1 closely resemble those of subhaloes, this work provides important steps towards making theoretical predictions for density-in-cells statistics applicable to upcoming galaxy surveys like Euclid or WFIRST.
NASA Astrophysics Data System (ADS)
Dennison, Andrew G.
Classification of the seafloor substrate can be done with a variety of methods. These methods include visual (dives, drop cameras); mechanical (cores, grab samples); and acoustic (statistical analysis of echosounder returns). Acoustic methods offer a more powerful and efficient means of collecting useful information about the bottom type. Due to the nature of an acoustic survey, larger areas can be sampled, and combining the collected data with visual and mechanical survey methods provides greater confidence in the classification of a mapped region. During a multibeam sonar survey, both bathymetric and backscatter data are collected. It is well documented that the statistical character of a sonar backscatter mosaic is dependent on bottom type. While classifying the bottom type on the basis of backscatter alone can accurately predict and map bottom type, e.g., distinguishing a muddy area from a rocky area, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing of high-resolution multibeam data can capture the pertinent details about the bottom type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features and provide the basis for an accurate classification scheme. The development of a new classification method is described here. It is based upon the analysis of textural features in conjunction with ground truth sampling. The processing and classification results for two geologically distinct areas in nearshore regions of Lake Superior, off the Lester River, MN and the Amnicon River, WI, are presented here, using the Minnesota Supercomputer Institute's Mesabi computing cluster for initial processing. Processed data are then calibrated using ground truth samples to conduct an accuracy assessment of the surveyed areas.
From analysis of high-resolution bathymetry data collected at both survey sites, it was possible to successfully calculate a series of measures that describe textural information about the lake floor. Further processing suggests that the calculated features capture a significant amount of statistical information about the lake floor terrain as well. Two sources of error, an anomalous heave and a refraction error, significantly deteriorated the quality of the processed data and the resulting validation results. Validation of the classification methods against ground truth samples at both survey sites, however, yielded accuracy values ranging from 5-30 percent at the Amnicon River and 60-70 percent at the Lester River. The final results suggest that this new processing methodology adequately captures textural information about the lake floor and provides an acceptable classification in the absence of significant data quality issues.
Formula for the Number of Replicates Required for a Specified Margin of Relative Error in the Estimate of the Repeatability Standard Deviation
McClure, Foster D; Lee, Jung K
2005-01-01
Sample size formulas are developed to estimate the repeatability and reproducibility standard deviations (sr and sR) such that the actual errors in sr and sR relative to their respective true values, σr and σR, are at predefined levels. The statistical consequences associated with the AOAC INTERNATIONAL required sample size to validate an analytical method are discussed. In addition, formulas to estimate the uncertainties of sr and sR were derived and are provided as supporting documentation.
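A standard normal-theory approximation — not necessarily the authors' exact formula — relates the relative error of a sample standard deviation to the number of replicates via SE(s)/σ ≈ 1/√(2(n−1)), which can be inverted to find the replicates needed for a target margin:

```python
import math

def replicates_for_sd_margin(rel_margin, confidence_z=1.96):
    """Approximate number of replicates n so that the sample SD falls
    within +/- rel_margin of the true SD at the stated confidence level,
    using the normal-theory approximation SE(s)/sigma ~ 1/sqrt(2(n-1))."""
    return math.ceil(1.0 + (confidence_z / rel_margin) ** 2 / 2.0)

# e.g. to pin down the repeatability SD to within +/-20% at ~95% confidence:
n_needed = replicates_for_sd_margin(0.2)
```

The quadratic dependence on the margin is the practical point: halving the target margin roughly quadruples the required replicates, which is why collaborative-study sample sizes matter.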
Physical Validation of TRMM TMI and PR Monthly Rain Products Over Oklahoma
NASA Technical Reports Server (NTRS)
Fisher, Brad L.
2004-01-01
The Tropical Rainfall Measuring Mission (TRMM) provides monthly rainfall estimates using data collected by the TRMM satellite. These estimates cover a substantial fraction of the earth's surface. The physical validation of TRMM estimates involves corroborating the accuracy of spaceborne estimates of areal rainfall by inferring errors and biases from ground-based rain estimates. The TRMM error budget consists of two major sources of error: retrieval and sampling. Sampling errors are intrinsic to the process of estimating monthly rainfall and occur because the satellite extrapolates monthly rainfall from a small subset of measurements collected only during satellite overpasses. Retrieval errors, on the other hand, are related to the process of collecting measurements while the satellite is overhead. One of the big challenges confronting the TRMM validation effort is how to best estimate these two main components of the TRMM error budget, which are not easily decoupled. This four-year study computed bulk sampling and retrieval errors for the TRMM microwave imager (TMI) and the precipitation radar (PR) by applying a technique that sub-samples gauge data at TRMM overpass times. Gridded monthly rain estimates are then computed from the monthly bulk statistics of the collected samples, providing a sensor-dependent gauge rain estimate that is assumed to include a TRMM-equivalent sampling error. The sub-sampled gauge rain estimates are then used in conjunction with the monthly satellite and gauge (without sub-sampling) estimates to decouple retrieval and sampling errors. The computed mean sampling errors for the TMI and PR were 5.9% and 7.7%, respectively, in good agreement with theoretical predictions. The PR year-to-year retrieval biases exceeded corresponding TMI biases, but it was found that these differences were partially due to negative TMI biases during cold months and positive TMI biases during warm months.
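The gauge sub-sampling idea — evaluating the gauge record only at satellite overpass times and comparing the resulting monthly mean against the full-record mean — can be sketched as follows. All numbers here are invented stand-ins (a synthetic rain series and an idealized overpass schedule), not TRMM or gauge data:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical gauge record: hourly rain rates for one 30-day month (mm/h).
# A gamma draw with small shape mimics rain's intermittent, skewed character.
hours = 720
gauge = rng.gamma(shape=0.1, scale=2.0, size=hours)

# Hypothetical overpass schedule: one brief snapshot roughly every 16 h.
overpass_idx = np.arange(0, hours, 16)

full_mean = gauge.mean()                      # "true" gauge monthly mean
subsampled_mean = gauge[overpass_idx].mean()  # TRMM-equivalent estimate

# Relative sampling error implied by the overpass schedule alone:
rel_sampling_error = (subsampled_mean - full_mean) / full_mean
```

Because the sub-sampled series uses the same ground truth as the full series, any difference between the two means is attributable to sampling alone, which is what lets the retrieval error be isolated afterwards.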
Kim, Hark Kyun; Reyzer, Michelle L.; Choi, Il Ju; Kim, Chan Gyoo; Kim, Hee Sung; Oshima, Akira; Chertov, Oleg; Colantonio, Simona; Fisher, Robert J.; Allen, Jamie L.; Caprioli, Richard M.; Green, Jeffrey E.
2012-01-01
To date, proteomic analyses on gastrointestinal cancer tissue samples have been performed using surgical specimens only, which are obtained after a diagnosis is made. To determine if a proteomic signature obtained from endoscopic biopsy samples could be found to assist with diagnosis, frozen endoscopic biopsy samples collected from 63 gastric cancer patients and 43 healthy volunteers were analyzed using matrix-assisted laser desorption/ionization (MALDI) mass spectrometry. A statistical classification model was developed to distinguish tumor from normal tissues using half the samples and validated with the other half. A protein profile was discovered consisting of 73 signals that could classify 32 cancer and 22 normal samples in the validation set with high predictive values (positive and negative predictive values for cancer, 96.8% and 91.3%; sensitivity, 93.8%; specificity, 95.5%). Signals overexpressed in tumors were identified as α-defensin-1, α-defensin-2, calgranulin A, and calgranulin B. A protein profile was also found to distinguish pathologic stage Ia (pT1N0M0) samples (n = 10) from more advanced stage (Ib or higher) tumors (n = 48). Thus, protein profiles obtained from endoscopic biopsy samples may be useful in assisting with the diagnosis of gastric cancer and, possibly, in identifying early stage disease. PMID:20557134
Using CRANID to test the population affinity of known crania.
Kallenberger, Lauren; Pilbrow, Varsha
2012-11-01
CRANID is a statistical program used to infer the source population of a cranium of unknown origin by comparing its cranial dimensions with a worldwide craniometric database. It has great potential for estimating ancestry in archaeological, forensic and repatriation cases. In this paper we test the validity of CRANID in classifying crania of known geographic origin. Twenty-three crania of known geographic origin but unknown sex were selected from the osteological collections of the University of Melbourne. Only 18 crania showed good statistical match with the CRANID database. Without considering accuracy of sex allocation, 11 crania were accurately classified into major geographic regions and nine were correctly classified to geographically closest available reference populations. Four of the five crania with poor statistical match were nonetheless correctly allocated to major geographical regions, although none was accurately assigned to geographically closest reference samples. We conclude that if sex allocations are overlooked, CRANID can accurately assign 39% of specimens to geographically closest matching reference samples and 48% to major geographic regions. Better source population representation may improve goodness of fit, but known sex-differentiated samples are needed to further test the utility of CRANID. © 2012 The Authors Journal of Anatomy © 2012 Anatomical Society.
Kulesz, Paulina A.; Tian, Siva; Juranek, Jenifer; Fletcher, Jack M.; Francis, David J.
2015-01-01
Objective Weak structure-function relations for brain and behavior may stem from problems in estimating these relations in small clinical samples with frequently occurring outliers. In the current project, we focused on the utility of using alternative statistics to estimate these relations. Method Fifty-four children with spina bifida meningomyelocele performed attention tasks and received MRI of the brain. Using a bootstrap sampling process, the Pearson product moment correlation was compared with four robust correlations: the percentage bend correlation, the Winsorized correlation, the skipped correlation using the Donoho-Gasko median, and the skipped correlation using the minimum volume ellipsoid estimator. Results All methods yielded similar estimates of the relations between measures of brain volume and attention performance. The similarity of estimates across correlation methods suggested that the weak structure-function relations previously found in many studies are not readily attributable to the presence of outlying observations and other factors that violate the assumptions behind the Pearson correlation. Conclusions Given the difficulty of assembling large samples for brain-behavior studies, estimating correlations using multiple, robust methods may enhance the statistical conclusion validity of studies yielding small, but often clinically significant, correlations. PMID:25495830
Kulesz, Paulina A; Tian, Siva; Juranek, Jenifer; Fletcher, Jack M; Francis, David J
2015-03-01
Weak structure-function relations for brain and behavior may stem from problems in estimating these relations in small clinical samples with frequently occurring outliers. In the current project, we focused on the utility of using alternative statistics to estimate these relations. Fifty-four children with spina bifida meningomyelocele performed attention tasks and received MRI of the brain. Using a bootstrap sampling process, the Pearson product-moment correlation was compared with 4 robust correlations: the percentage bend correlation, the Winsorized correlation, the skipped correlation using the Donoho-Gasko median, and the skipped correlation using the minimum volume ellipsoid estimator. All methods yielded similar estimates of the relations between measures of brain volume and attention performance. The similarity of estimates across correlation methods suggested that the weak structure-function relations previously found in many studies are not readily attributable to the presence of outlying observations and other factors that violate the assumptions behind the Pearson correlation. Given the difficulty of assembling large samples for brain-behavior studies, estimating correlations using multiple, robust methods may enhance the statistical conclusion validity of studies yielding small, but often clinically significant, correlations. PsycINFO Database Record (c) 2015 APA, all rights reserved.
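Of the robust alternatives compared above, the Winsorized correlation is the simplest to sketch: clamp each variable's tails at its quantiles, then take the ordinary Pearson correlation of the clamped values. The toy data below, with one planted outlier pair, are invented, not the study's MRI measures:

```python
import numpy as np

def winsorized_corr(x, y, prop=0.2):
    """Winsorized correlation: clamp each variable at its prop and
    (1 - prop) quantiles, then compute the Pearson correlation."""
    def winsorize(v):
        v = np.asarray(v, dtype=float)
        lo, hi = np.quantile(v, [prop, 1.0 - prop])
        return np.clip(v, lo, hi)
    return np.corrcoef(winsorize(x), winsorize(y))[0, 1]

# One extreme outlier pair flips the ordinary Pearson correlation,
# but barely moves the Winsorized estimate.
x = list(range(20)) + [1000]
y = list(range(20)) + [-1000]
robust = winsorized_corr(x, y)
plain = np.corrcoef(x, y)[0, 1]
```

Note the contrast with trimming: Winsorizing replaces tail values with the quantile bounds rather than discarding them, so the sample size is preserved.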
Validation of a proposal for evaluating hospital infection control programs.
Silva, Cristiane Pavanello Rodrigues; Lacerda, Rúbia Aparecida
2011-02-01
To validate the construct and discriminant properties of a hospital infection prevention and control program. The program consisted of four indicators: technical-operational structure; operational prevention and control guidelines; epidemiological surveillance system; and prevention and control activities. These indicators, with previously validated content, were applied to 50 healthcare institutions in the city of São Paulo, Southeastern Brazil, in 2009. Descriptive statistics were used to characterize the hospitals and indicator scores, and Cronbach's α coefficient was used to evaluate the internal consistency. The discriminant validity was analyzed by comparing indicator scores between groups of hospitals: with versus without quality certification. The construct validity analysis was based on exploratory factor analysis with a tetrachoric correlation matrix. The indicators for the technical-operational structure and epidemiological surveillance presented almost 100% conformity in the whole sample. The indicators for the operational prevention and control guidelines and the prevention and control activities presented internal consistency ranging from 0.67 to 0.80. The discriminant validity of these indicators indicated higher and statistically significant mean conformity scores among the group of institutions with healthcare certification or accreditation processes. In the construct validation, two dimensions were identified for the operational prevention and control guidelines: recommendations for preventing hospital infection and recommendations for standardizing prophylaxis procedures, with good correlation between the analysis units that formed the guidelines. The same was found for the prevention and control activities: interfaces with treatment units and support units were identified. 
Validation of the measurement properties of the hospital infection prevention and control program indicators made it possible to develop a tool for evaluating these programs in an ethical and scientific manner in order to obtain a quality diagnosis in this field.
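Cronbach's α, used above to evaluate internal consistency, has a compact definition; a minimal sketch with invented item data:

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of columns, one list of scores per item."""
    k = len(items)
    item_vars = sum(statistics.pvariance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - item_vars / statistics.pvariance(totals))

# Two perfectly consistent items give alpha = 1.0;
# unrelated items drive alpha toward 0 or below.
```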
Development of an Independent Global Land Cover Validation Dataset
NASA Astrophysics Data System (ADS)
Sulla-Menashe, D. J.; Olofsson, P.; Woodcock, C. E.; Holden, C.; Metcalfe, M.; Friedl, M. A.; Stehman, S. V.; Herold, M.; Giri, C.
2012-12-01
Accurate information on the global distribution and dynamics of land cover is critical for a large number of global change science questions. A growing number of land cover products have been produced at regional to global scales, but the uncertainty in these products and the relative strengths and weaknesses among available products are poorly characterized. To address this limitation we are compiling a database of high spatial resolution imagery to support international land cover validation studies. Validation sites were selected based on a probability sample, and may therefore be used to estimate statistically defensible accuracy statistics and associated standard errors. Validation site locations were identified using a stratified random design based on 21 strata derived from an intersection of Köppen climate classes and a population density layer. In this way, the two major sources of global variation in land cover (climate and human activity) are explicitly included in the stratification scheme. At each site we are acquiring high spatial resolution (< 1 m) satellite imagery for 5-km x 5-km blocks. The response design uses an object-oriented hierarchical legend that is compatible with the UN FAO Land Cover Classification System. Using this response design, we are classifying each site using a semi-automated algorithm that blends image segmentation with a supervised RandomForest classification algorithm. In the long run, the validation site database is designed to support international efforts to validate land cover products. To illustrate, we use the site database to validate the MODIS Collection 4 Land Cover product, providing a prototype for validating the VIIRS Surface Type Intermediate Product scheduled to start operational production early in 2013.
As part of our analysis we evaluate sources of error in coarse resolution products including semantic issues related to the class definitions, mixed pixels, and poor spectral separation between classes.
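The stratified random selection of validation sites can be sketched generically. The stratum function and per-stratum sample size below are placeholders for illustration, not the 21-strata Köppen-by-population design itself:

```python
import random

def stratified_sample(units, strata_of, n_per_stratum, seed=0):
    """Draw a stratified random sample: n units from each stratum."""
    rng = random.Random(seed)
    by_stratum = {}
    for u in units:
        by_stratum.setdefault(strata_of(u), []).append(u)
    sample = []
    for stratum in sorted(by_stratum):
        sample.extend(rng.sample(by_stratum[stratum], n_per_stratum))
    return sample

# Example: 30 candidate sites tagged with one of 3 strata; draw 2 per stratum.
sites = [(site_id, site_id % 3) for site_id in range(30)]
chosen = stratified_sample(sites, strata_of=lambda u: u[1], n_per_stratum=2)
```

Because every stratum is sampled with a known probability, design-based accuracy estimates and standard errors remain statistically defensible.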
Cummins, Steven; Macintyre, Sally
2009-12-01
To assess the validity of a publicly available list of food stores through field observations of their existence, in order to contribute to research on neighbourhood food environments and health. All multiple-owned supermarkets, and a 1 in 8 sample of other food outlets, listed in 1997 and 2007 in the public register of food premises held by Glasgow City Council, Scotland, were visited to establish whether they were trading as foodstores. Postcode sectors in which foodstores were located were classified into least, middling and most deprived neighbourhoods. In total, 325 listed foodstores were visited in 1997 and 508 in 2007. Of these 87% and 88%, respectively, were trading as foodstores. There was a very slight gradient in validity by deprivation, with validity higher in least deprived neighbourhoods, though this was not statistically significant. There was reasonable, but not perfect, agreement between the list of food premises and field observations, with nearly 1 in 9 of sampled foodstores not present on the ground. Since the use of inaccurate secondary data sources may affect estimates of relationships between the neighbourhood food environment and health, further work is required to establish the validity of such data in different contexts.
Validity of the Brunel Mood Scale for use With Malaysian Athletes.
Lan, Mohamad Faizal; Lane, Andrew M; Roy, Jolly; Hanin, Nik Azma
2012-01-01
The aim of the present study was to investigate the factorial validity of the Brunel Mood Scale for use with Malaysian athletes. Athletes (N = 1485) competing at the Malaysian Games completed the Brunel Mood Scale (BRUMS). Confirmatory factor analysis (CFA) results indicated a comparative fit index (CFI) of .90 and a root mean square error of approximation (RMSEA) of .05. The CFI was below the .95 criterion for acceptability, whereas the RMSEA value was within the limits for acceptability suggested by Hu and Bentler (1999). We suggest that the results provide some support for the validity of the BRUMS for use with Malaysian athletes. Given the large sample size used in the present study, the descriptive statistics could be used as normative data for Malaysian athletes. Key points: Findings from the present study lend support to the validity of the BRUMS for use with Malaysian athletes. Given the size of the sample used in the present study, we suggest the descriptive data be used as normative data for researchers using the scale with Malaysian athletes. It is suggested that future research investigate the effects of cultural differences on emotional states experienced by athletes before, during, and after competition.
A methodological analysis of chaplaincy research: 2000-2009.
Galek, Kathleen; Flannelly, Kevin J; Jankowski, Katherine R B; Handzo, George F
2011-01-01
The present article presents a comprehensive review and analysis of quantitative research conducted in the United States on chaplaincy and closely related topics published between 2000 and 2009. A combined search strategy identified 49 quantitative studies in 13 journals. The analysis focuses on the methodological sophistication of the studies, compared to earlier research on chaplaincy and pastoral care. Cross-sectional surveys of convenience samples still dominate the field, but sample sizes have increased somewhat over the past three decades. Reporting of the validity and reliability of measures continues to be low, although reporting of response rates has improved. Improvements in the use of inferential statistics and statistical controls were also observed, compared to previous research. The authors conclude that more experimental research is needed on chaplaincy, along with an increased use of hypothesis testing, regardless of the research designs that are used.
Hillen, Marij A; Postma, Rosa-May; Verdam, Mathilde G E; Smets, Ellen M A
2017-03-01
The original 18-item, four-dimensional Trust in Oncologist Scale assesses cancer patients' trust in their oncologist. The current aim was to develop and validate a short form version of the scale to enable more efficient assessment of cancer patients' trust. Existing validation data of the full-length Trust in Oncologist Scale were used to create a short form of the scale. The resulting short form was validated in a new sample of cancer patients (n = 92). Socio-demographics, medical characteristics, trust in the oncologist, satisfaction with communication, trust in healthcare, and willingness to recommend the oncologist to others and to contact the oncologist in case of questions were assessed. Internal consistency, reliability, convergent validity and structural validity were tested. The five-item Trust in Oncologist Scale Short Form was created by selecting the statistically best performing item from each dimension of the original scale, to ensure content validity. Mean trust in the oncologist was high in the validation sample (response rate 86%, M = 4.30, SD = 0.98). Exploratory factor analyses supported the one-dimensionality of the short form. Internal consistency was high, and temporal stability was moderate. Initial convergent validity was suggested by moderate correlations between trust scores and associated constructs. The Trust in Oncologist Scale Short Form appears to measure cancer patients' trust in their oncologist efficiently, reliably and validly. It may be used in research and as a quality indicator in clinical practice. More thorough validation of the scale is recommended to confirm this initial evidence of its validity.
Leontjevas, Ruslan; Gerritsen, Debby L; Koopmans, Raymond T C M; Smalbrugge, Martin; Vernooij-Dassen, Myrra J F J
2012-06-01
A multidisciplinary, evidence-based care program to improve the management of depression in nursing home residents, "Act in case of Depression" (AiD), was implemented and tested using a stepped-wedge design in 23 nursing homes (NHs). The aim, before the effect analyses, was to evaluate AiD process data on sampling quality (recruitment and randomization, reach) and intervention quality (relevance and feasibility, and the extent to which AiD was performed), which can be used for understanding internal and external validity. In this article, a model is presented that divides process evaluation data into first- and second-order process data. Qualitative and quantitative data based on personal files of residents, interviews with nursing home professionals, and a research database were analyzed according to the following process evaluation components: sampling quality and intervention quality. The setting was the nursing home. The pattern of residents' informed consent rates differed between dementia special care units and somatic units during the study. The nursing home staff was satisfied with the AiD program and reported that the program was feasible and relevant. With the exception of the first screening step (nursing staff members using a short observer-based depression scale), AiD components were not performed fully by NH staff as prescribed in the AiD protocol. Although NH staff found the program relevant and feasible and was satisfied with the program content, individual AiD components may differ in feasibility. The results on sampling quality implied that statistical analyses of AiD effectiveness should account for the type of unit, whereas the findings on intervention quality implied that, next to the type of unit, analyses should account for the extent to which individual AiD program components were performed. In general, our first-order process data evaluation confirmed the internal and external validity of the AiD trial, and this evaluation enabled further statistical fine-tuning.
The importance of evaluating the first-order process data before executing statistical effect analyses is thus underlined.
Cortés-Castell, Ernesto; Juste, Mercedes; Palazón-Bru, Antonio; Monge, Laura; Sánchez-Ferrer, Francisco; Rizo-Baeza, María Mercedes
2017-01-01
Dual-energy X-ray absorptiometry (DXA) provides separate measurements of fat mass, fat-free mass and bone mass, and is a quick, accurate, and safe technique, yet one that is not readily available in routine clinical practice. Consequently, we aimed to develop statistical formulas to predict fat mass (%) and fat mass index (FMI) with simple parameters (age, sex, weight and height). We conducted a retrospective observational cross-sectional study in 416 overweight or obese patients aged 4-18 years that involved assessing adiposity by DXA (fat mass percentage and FMI), body mass index (BMI), sex and age. We randomly divided the sample into two parts (construction and validation). In the construction sample, we developed formulas to predict fat mass and FMI using linear multiple regression models. The formulas were validated in the other sample, calculating the intraclass correlation coefficient via bootstrapping. The fat mass percentage formula had a coefficient of determination of 0.65. This value was 0.86 for FMI. In the validation, the constructed formulas had an intraclass correlation coefficient of 0.77 for fat mass percentage and 0.92 for FMI. Our predictive formulas accurately predicted fat mass and FMI with simple parameters (BMI, sex and age) in children with overweight and obesity. The proposed methodology could be applied in other fields. Further studies are needed to externally validate these formulas.
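The construct-then-validate workflow above can be sketched with ordinary least squares on a random split. This simplified illustration uses a single predictor, whereas the study fit multiple regression models (BMI, sex and age); the data and split fraction are invented:

```python
import random
import statistics

def fit_simple_regression(x, y):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def split_construct_validate(pairs, frac=0.5, seed=0):
    """Randomly split (x, y) pairs into construction and validation subsamples."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    cut = int(frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

The formula is fitted on the construction half only; agreement between predictions and observed values in the held-out half (the study used the intraclass correlation coefficient via bootstrapping) is what supports the formula's validity.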
Niemeijer, Anuschka S; van Waelvelde, Hilde; Smits-Engelsman, Bouwien C M
2015-02-01
The Movement Assessment Battery for Children has been revised as the Movement ABC-2 (Henderson, Sugden, & Barnett, 2007). In Europe, the 15th percentile score on this test is recommended for one of the DSM-IV diagnostic criteria for Developmental Coordination Disorder (DCD). A representative sample of Dutch and Flemish children was tested to cross-validate the UK standard scores, including the 15th percentile score. First, the mean, SD and percentile scores of Dutch children were compared to those of the UK normative samples. Item standard scores of Dutch-speaking children deviated from the UK reference values, suggesting that adjustments were necessary. Except for very young children, the Dutch-speaking samples performed better. Second, based on the mean, SD and clinically relevant cut-off scores (5th and 15th percentile), norms were adjusted for the Dutch population. For diagnostic use, researchers and clinicians should use the reference norms that are valid for the group of children they are testing. The results indicate that there is possibly an effect of the testing procedure in other countries that validated the UK norms and/or a cultural influence on the age norms of the Movement ABC-2. It is suggested to formulate criterion-based norms for age groups in addition to statistical norms.
Ducat, Giseli; Felsner, Maria L; da Costa Neto, Pedro R; Quináia, Sueli P
2015-06-15
Recently the use of brown sugar has increased due to its nutritional characteristics, thus requiring more rigorous quality control. The development of a method for water content analysis in soft brown sugar was carried out for the first time by TG/DTA with the application of different statistical tests. The results of the optimization study suggest that heating rates of 5°C min(-1) and an alumina sample holder improve the efficiency of the drying process. The validation study showed that thermogravimetry presents good accuracy and precision for water content analysis in soft brown sugar samples. This technique offers advantages over other analytical methods as it does not use toxic and costly reagents or solvents, it does not need any sample preparation, and it allows the identification of the temperature at which water is completely eliminated relative to other volatile degradation products. This is an important advantage over the official method (loss on drying).
Grubbs, Joshua B; Volk, Fred; Exline, Julie J; Pargament, Kenneth I
2015-01-01
The authors aimed to validate a brief measure of perceived addiction to Internet pornography refined from the 32-item Cyber Pornography Use Inventory, report its psychometric properties, and examine how the notion of perceived addiction to Internet pornography might be related to other domains of psychological functioning. To accomplish this, 3 studies were conducted using a sample of undergraduate psychology students, a web-based adult sample, and a sample of college students seeking counseling at a university's counseling center. The authors developed and refined a short 9-item measure of perceived addiction to Internet pornography, confirmed its structure in multiple samples, examined its relatedness to hypersexuality more broadly, and demonstrated that the notion of perceived addiction to Internet pornography is very robustly related to various measures of psychological distress. Furthermore, the relation between psychological distress and the new measure persisted, even when other potential contributors (e.g., neuroticism, self-control, amount of time spent viewing pornography) were controlled for statistically, indicating the clinical relevance of assessing perceived addiction to Internet pornography.
von Zerssen, D; Barthelmes, H; Pössl, J; Black, C; Garzynski, E; Wessel, E; Hecht, H
1998-01-01
The Biographical Personality Interview (BPI) was applied to 179 subjects (158 psychiatric patients and 21 probands from the general population); 100 patients and 20 healthy controls served as a validation sample; the others had been interviewed during the training period or did not meet the inclusion criteria for the validation of the BPI. The acceptance of the interview was high, the inter-rater reliability of the ratings of premorbid personality structures ("types") varied between 0.81 and 0.88 per type. Concurrent validity of the typological constructs as assessed by means of the BPI was inferred from the intercorrelations of type scores and correlations of these scores with questionnaire data and proved to be adequate. Clinical validity of the assessment was indicated by statistically significant differences between diagnostic groups. Problems and further developments of the instrument and its application are discussed.
Pearson's chi-square test and rank correlation inferences for clustered data.
Shih, Joanna H; Fay, Michael P
2017-09-01
Pearson's chi-square test has been widely used in testing for association between two categorical responses. Spearman rank correlation and Kendall's tau are often used for measuring and testing association between two continuous or ordered categorical responses. However, the established statistical properties of these tests are only valid when each pair of responses is independent, that is, when each sampling unit has only one pair of responses. When each sampling unit consists of a cluster of paired responses, the assumption of independent pairs is violated. In this article, we apply the within-cluster resampling technique to U-statistics to form new tests and rank-based correlation estimators for possibly tied clustered data. We develop large-sample properties of the new proposed tests and estimators and evaluate their performance by simulations. The proposed methods are applied to a data set collected from a PET/CT imaging study for illustration.
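The within-cluster resampling idea can be sketched as repeated draws of one pair per cluster, computing an association statistic on each draw and averaging. Here the statistic is Kendall's tau-a; the article's actual method applies the technique to U-statistics with proper variance estimation, which this sketch omits:

```python
import random

def kendall_tau(x, y):
    """Kendall's tau-a for paired observations (ties contribute zero)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def within_cluster_resampling(clusters, n_resamples=500, seed=0):
    """Average tau over resamples that draw one (x, y) pair per cluster,
    restoring the independent-pairs assumption within each resample."""
    rng = random.Random(seed)
    taus = []
    for _ in range(n_resamples):
        picked = [rng.choice(cluster) for cluster in clusters]
        taus.append(kendall_tau([p[0] for p in picked], [p[1] for p in picked]))
    return sum(taus) / len(taus)
```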
Dantas, Raquel Batista; Oliveira, Graziella Lage; Silveira, Andréa Maria
2017-01-01
OBJECTIVE To adapt and evaluate the psychometric properties of the Vulnerability to Abuse Screening Scale to identify risk of domestic violence against older adults in Brazil. METHODS The instrument was adapted and validated in a sample of 151 older adults from a geriatric reference center in the municipality of Belo Horizonte, State of Minas Gerais, in 2014. We collected sociodemographic, clinical, and abuse-related information, and verified reliability by reproducibility in a sample of 55 older people, who underwent re-testing of the instrument seven days after the first application. Descriptive and comparative analyses were performed for all variables, with a significance level of 5%. The construct validity was analyzed by the principal components method with a tetrachoric correlation matrix, the reliability of the scale by the weighted kappa (Kp) statistic, and the internal consistency by the Kuder-Richardson formula 20 (KR-20). RESULTS The average age of the participants was 72.1 years (SD = 6.96; 95%CI 70.94–73.17), with a maximum of 92 years, and they were predominantly female (76.2%; 95%CI 69.82–83.03). When analyzing the relationship between the scores of the Vulnerability to Abuse Screening Scale, categorized by presence (score > 3) or absence (score < 3) of vulnerability to abuse, and clinical and health conditions, we found statistically significant differences for self-perception of health (p = 0.002), depressive symptoms (p < 0.001), and presence of rheumatism (p = 0.003). There were no statistically significant differences between sexes. The Vulnerability to Abuse Screening Scale showed acceptable validity in the transcultural adaptation process, demonstrating dimensionality coherent with the original proposal (four factors). In the internal consistency analysis, the instrument presented good results (KR-20 = 0.69), and the reliability via reproducibility was considered excellent for the global scale (Kp = 0.92).
CONCLUSIONS The Vulnerability to Abuse Screening Scale proved to be a valid instrument with good psychometric capacity for screening domestic abuse against older adults in Brazil. PMID:28423137
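KR-20, used above for the internal consistency of dichotomous items, is straightforward to compute; a minimal sketch on invented 0/1 responses:

```python
import statistics

def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items.
    `responses` is a list of respondents, each a list of 0/1 item scores."""
    n_items = len(responses[0])
    # Sum of p*q over items, where p is the proportion scoring 1 on the item.
    pq = 0.0
    for i in range(n_items):
        p = sum(r[i] for r in responses) / len(responses)
        pq += p * (1 - p)
    totals = [sum(r) for r in responses]
    var_total = statistics.pvariance(totals)
    return n_items / (n_items - 1) * (1 - pq / var_total)
```

KR-20 is the special case of Cronbach's α for binary items; values around 0.7, as reported above, are conventionally read as acceptable internal consistency.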
Biomarker development in the precision medicine era: lung cancer as a case study.
Vargas, Ashley J; Harris, Curtis C
2016-08-01
Precision medicine relies on validated biomarkers with which to better classify patients by their probable disease risk, prognosis and/or response to treatment. Although affordable 'omics'-based technology has enabled faster identification of putative biomarkers, the validation of biomarkers is still stymied by low statistical power and poor reproducibility of results. This Review summarizes the successes and challenges of using different types of molecule as biomarkers, using lung cancer as a key illustrative example. Efforts at the national level of several countries to tie molecular measurement of samples to patient data via electronic medical records are the future of precision medicine research.
Vespa, Anna; Giulietti, Maria Velia; Spatuzzi, Roberta; Fabbietti, Paolo; Meloni, Cristina; Gattafoni, Pisana; Ottaviani, Marica
2017-06-01
This study aimed at assessing the reliability and construct validity of the Brief Multidimensional Measure of Religiousness/Spirituality (BMMRS) in an Italian sample. The sample comprised 353 Italian participants: 58.9% affected by different diseases and 41.1% healthy subjects. The descriptive statistics of internal consistency reliabilities (Cronbach's coefficient) of the BMMRS revealed a remarkable consistency and reliability of the different scales (DSE, SpC, SC, CSC, VB, SPY-WELL) and good inter-class correlations (≥ .70), with good stability of the measures over time. The BMMRS is a useful inventory for the evaluation of the principal spiritual dimensions.
Zhao, Qi; Liu, Yuanning; Zhang, Ning; Hu, Menghan; Zhang, Hao; Joshi, Trupti; Xu, Dong
2018-01-01
In recent years, an increasing number of studies have reported the presence of plant miRNAs in human samples, which resulted in a hypothesis asserting the existence of plant-derived exogenous microRNA (xenomiR). However, this hypothesis is not widely accepted in the scientific community due to possible sample contamination and the small sample sizes with lack of rigorous statistical analysis. This study provides a systematic statistical test that can validate (or invalidate) the plant-derived xenomiR hypothesis by analyzing 388 small RNA sequencing data sets from human samples in 11 types of body fluids/tissues. A total of 166 types of plant miRNAs were found in at least one human sample, of which 14 plant miRNAs represented more than 80% of the total plant miRNA abundance in human samples. Plant miRNA profiles were characterized to be tissue-specific in different human samples. Meanwhile, the plant miRNAs identified from the microbiome have an insignificant abundance compared to those from humans, while plant miRNA profiles in human samples were significantly different from those in plants, suggesting that sample contamination is an unlikely explanation for all the plant miRNAs detected in human samples. This study also provides a set of testable synthetic miRNAs with isotopes that can be detected in situ after being fed to animals.
Flow Chamber System for the Statistical Evaluation of Bacterial Colonization on Materials
Menzel, Friederike; Conradi, Bianca; Rodenacker, Karsten; Gorbushina, Anna A.; Schwibbert, Karin
2016-01-01
Biofilm formation on materials leads to high costs in industrial processes, as well as in medical applications. This fact has stimulated interest in the development of new materials with improved surfaces to reduce bacterial colonization. Standardized tests relying on statistical evidence are indispensable to evaluate the quality and safety of these new materials. We describe here a flow chamber system for biofilm cultivation under controlled conditions with a total capacity for testing up to 32 samples in parallel. In order to quantify the surface colonization, bacterial cells were DAPI (4′,6-diamidino-2-phenylindole)-stained and examined with epifluorescence microscopy. More than 100 images of each sample were automatically taken and the surface coverage was estimated using the free open source software G'MIC, followed by a precise statistical evaluation. Overview images of all gathered pictures were generated to dissect the colonization characteristics of the selected model organism Escherichia coli W3110 on different materials (glass and implant steel). With our approach, differences in bacterial colonization on different materials can be quantified in a statistically validated manner. This reliable test procedure will support the design of improved materials for medical, industrial, and environmental (subaquatic or subaerial) applications. PMID:28773891
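The per-image coverage estimate underlying the statistical evaluation can be sketched as a simple pixel fraction. The intensity threshold and the nested-list image format below are assumptions for illustration; the study itself processed real micrographs with G'MIC:

```python
def surface_coverage(image, threshold):
    """Fraction of pixels whose intensity exceeds `threshold`
    (a proxy for stained-cell coverage in one microscopy field)."""
    pixels = [v for row in image for v in row]
    covered = sum(1 for v in pixels if v > threshold)
    return covered / len(pixels)

def mean_coverage(images, threshold):
    """Average coverage across many fields of the same sample,
    as when >100 images per sample are pooled before comparison."""
    vals = [surface_coverage(img, threshold) for img in images]
    return sum(vals) / len(vals)
```

With per-field coverage values in hand for each material, standard two-sample statistics can then test whether colonization differs between materials.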
Classen, Sherrilene; Wang, Yanning; Winter, Sandra M; Velozo, Craig A; Lanford, Desiree N; Bédard, Michel
2013-01-01
We determined the concurrent criterion validity of the Safe Driving Behavior Measure (SDBM) for on-road outcomes (passing or failing the on-road test as determined by a certified driving rehabilitation specialist) among older drivers and their family members-caregivers. On the basis of ratings from 168 older drivers and 168 family members-caregivers, we calculated receiver operating characteristic curves. The drivers' area under the curve (AUC) was .620 (95% confidence interval [CI] = .514-.725, p = .043). The family members-caregivers' AUC was .726 (95% CI = .622-.829, p ≤ .01). Older drivers' ratings showed statistically significant yet poor concurrent criterion validity, but family members-caregivers' ratings showed good concurrent criterion validity for the criterion on-road driving test. Continuing research with a more representative sample is being pursued to confirm the SDBM's concurrent criterion validity. This screening tool may be useful for generalist practitioners to use in making decisions regarding driving.
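The AUC values reported above have a direct probabilistic reading: the chance that a randomly chosen passing case scores above a randomly chosen failing case. A minimal sketch of this Mann-Whitney formulation (variable names and data are illustrative):

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one,
    counting ties as one half (the Mann-Whitney U formulation)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

On this scale, .5 is chance discrimination, which is why a drivers' AUC of .620 reads as poor and the caregivers' .726 as good.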
Haslam, Divna; Filus, Ania; Morawska, Alina; Sanders, Matthew R; Fletcher, Renee
2015-06-01
This paper outlines the development and validation of the Work-Family Conflict Scale (WAFCS) designed to measure work-to-family conflict (WFC) and family-to-work conflict (FWC) for use with parents of young children. An expert informant and consumer feedback approach was utilised to develop and refine 20 items, which were subjected to a rigorous validation process using two separate samples of parents of 2-12 year old children (n = 305 and n = 264). As a result of statistical analyses several items were dropped resulting in a brief 10-item scale comprising two subscales assessing theoretically distinct but related constructs: FWC (five items) and WFC (five items). Analyses revealed both subscales have good internal consistency, construct validity as well as concurrent and predictive validity. The results indicate the WAFCS is a promising brief measure for the assessment of work-family conflict in parents. Benefits of the measure as well as potential uses are discussed.
Reliability and Validity of the Turkish Version of the Job Performance Scale Instrument.
Harmanci Seren, Arzu Kader; Tuna, Rujnan; Eskin Bacaksiz, Feride
2018-02-01
Objective measurement of the job performance of nursing staff using valid and reliable instruments is important in the evaluation of healthcare quality. A current, valid, and reliable instrument that specifically measures the performance of nurses is required for this purpose. The aim of this study was to determine the validity and reliability of the Turkish version of the Job Performance Instrument. This study used a methodological design and a sample of 240 nurses working at different units in four hospitals in Istanbul, Turkey. A descriptive data form, the Job Performance Scale, and the Employee Performance Scale were used to collect data. Data were analyzed using IBM SPSS Statistics Version 21.0 and LISREL Version 8.51. On the basis of the data analysis, the instrument was revised. Some items were deleted, and subscales were combined. The Turkish version of the Job Performance Instrument was determined to be valid and reliable to measure the performance of nurses. The instrument is suitable for evaluating current nursing roles.
Progressive statistics for studies in sports medicine and exercise science.
Hopkins, William G; Marshall, Stephen W; Batterham, Alan M; Hanin, Juri
2009-01-01
Statistical guidelines and expert statements are now available to assist in the analysis and reporting of studies in some biomedical disciplines. We present here a more progressive resource for sample-based studies, meta-analyses, and case studies in sports medicine and exercise science. We offer forthright advice on the following controversial or novel issues: using precision of estimation for inferences about population effects in preference to null-hypothesis testing, which is inadequate for assessing clinical or practical importance; justifying sample size via acceptable precision or confidence for clinical decisions rather than via adequate power for statistical significance; showing SD rather than SEM, to better communicate the magnitude of differences in means and nonuniformity of error; avoiding purely nonparametric analyses, which cannot provide inferences about magnitude and are unnecessary; using regression statistics in validity studies, in preference to the impractical and biased limits of agreement; making greater use of qualitative methods to enrich sample-based quantitative projects; and seeking ethics approval for public access to the depersonalized raw data of a study, to address the need for more scrutiny of research and better meta-analyses. Advice on less contentious issues includes the following: using covariates in linear models to adjust for confounders, to account for individual differences, and to identify potential mechanisms of an effect; using log transformation to deal with nonuniformity of effects and error; identifying and deleting outliers; presenting descriptive, effect, and inferential statistics in appropriate formats; and contending with bias arising from problems with sampling, assignment, blinding, measurement error, and researchers' prejudices. This article should advance the field by stimulating debate, promoting innovative approaches, and serving as a useful checklist for authors, reviewers, and editors.
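One recommendation above, log transformation for effects with multiplicative (nonuniform) error, can be illustrated with a small sketch; the data are invented:

```python
import math
import statistics

def log_effect_as_percent(group_a, group_b):
    """Effect of B vs A expressed as a percent difference, computed on the
    log scale and back-transformed (appropriate for multiplicative error)."""
    la = [math.log(v) for v in group_a]
    lb = [math.log(v) for v in group_b]
    diff = statistics.fmean(lb) - statistics.fmean(la)
    return (math.exp(diff) - 1) * 100
```

Analyzing on the log scale keeps the error roughly uniform across the range of the measure, and back-transforming yields an effect in the percent units practitioners actually use.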
The Validity and Reliability of the Turkish Version of the Neonatal Skin Risk Assessment Scale.
Sari, Çiğdem; Altay, Naime
2017-03-01
The study created a Turkish translation of the Neonatal Skin Risk Assessment Scale (NSRAS) that was developed by Huffines and Longsdon in 1997. Study authors used a cross-sectional survey design to determine the validity and reliability of the Turkish translation. The study was conducted at the neonatal intensive care unit of a university hospital in Ankara between March 15 and June 30, 2014. The research sample included 130 neonatal assessments from 17 patients. Data were collected by questionnaire regarding the characteristics of the participating neonates, 7 nurse observers, and the NSRAS and its subarticles. After translation and back-translation were performed to assess language validity of the scale, necessary corrections were made in line with expert suggestions, and content validity was ensured. Internal consistency of the scale was assessed by its homogeneity, Cronbach's α, and subarticle-general scale grade correlation. Cronbach's α for the scale overall was .88, and Cronbach's α values for the subarticles were between .83 and .90. Results showed a positive relationship among all the subarticles and the overall NSRAS scale grade (P < .01) with correlation values between 0.333 and 0.721. Exploratory and predictive factor analyses were applied to assess structural validity. Kaiser-Meyer-Olkin analysis was applied to assess sample sufficiency, and the Bartlett test was applied to assess the suitability of the sample for factor analysis. The Kaiser-Meyer-Olkin coefficient was 0.73, and the χ² value from the Bartlett test was statistically significant (P < .05). For the 6 subarticles of the scale and the overall scale grade, a high, positive, and significant relationship was found between the grades given by the researcher and those given by the nurse observers (P < .05). The Turkish NSRAS is reliable and valid.
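Cronbach's α, the internal-consistency statistic reported throughout these validation studies, can be computed directly from a respondents-by-items score matrix. A minimal sketch (the score matrix is illustrative, not the NSRAS data):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative 5-respondent, 4-item example (not study data)
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 1],
    [3, 3, 4, 3],
])
alpha = cronbach_alpha(scores)
```

Values of at least .70 are the usual benchmark against which the α values quoted in these abstracts are judged.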
Gao, Wenjun; Yuan, Changrong; Wang, Jichuan; Du, Jiarui; Wu, Huiqiao; Qian, Xiaojie; Hinds, Pamela S
2013-01-01
The City of Hope Quality of Life-Ostomy Questionnaire is a widely accepted scale to assess quality of life in ostomy patients. However, the validity and reliability of the Chinese version (C-COH) have not been studied. The objective of the study was to assess the validity and reliability of the C-COH among ostomy patients sampled from Shanghai from August 2010 to June 2011. Content validity was examined based on the reviews of a panel of 10 experts; test-retest was conducted to assess the item reliabilities of the scale; a pilot sample (n = 274) was selected to explore the factorial structure of the C-COH using exploratory factor analysis; a validation sample (n = 370) was selected to confirm the findings from the exploratory study using confirmatory factor analysis (CFA). Statistical package SPSS version 16.0 was used for the exploratory factor analysis, and Amos 17.0 was used for the CFA. The C-COH was developed by modifying 1 item and excluding 11 items from the original scale. Four factors/subscales (physical well-being, psychological well-being, social well-being, and spiritual well-being) were identified and confirmed in the C-COH. The scale reliabilities estimated from the CFA results for the 4 subscales were 0.860, 0.885, 0.864, and 0.686, respectively. Findings support the reliability and validity of the C-COH. The C-COH could be a useful measure of the level of quality of life among Chinese patients with a stoma and may provide important intervention implications for healthcare providers to help improve the life quality of patients with a stoma.
Tavoli, Azadeh; Melyani, Mahdiyeh; Bakhtiari, Maryam; Ghaedi, Gholam Hossein; Montazeri, Ali
2009-07-09
The Brief Fear of Negative Evaluation Scale (BFNE) is a commonly used instrument to measure social anxiety. This study aimed to translate and to test the reliability and validity of the BFNE in Iran. The English language version of the BFNE was translated into Persian (the Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 235 students with (n = 33, clinical group) and without social phobia (n = 202, non-clinical group). In addition to the BFNE, two standard instruments were used to measure social phobia severity: the Social Phobia Inventory (SPIN) and the Social Interaction Anxiety Scale (SIAS). All participants completed a brief background information questionnaire, the SPIN, the SIAS and the BFNE scales. Statistical analysis was performed to test the reliability and validity of the BFNE. In all, 235 students were studied (111 male and 124 female). The mean age for the non-clinical group was 22.2 (SD = 2.1) years and for the clinical sample it was 22.4 (SD = 1.8) years. Cronbach's alpha coefficient (to test reliability) was acceptable for both non-clinical and clinical samples (alpha = 0.90 and 0.82, respectively). In addition, 3-week test-retest reliability was assessed in the non-clinical sample, and the intraclass correlation coefficient (ICC) was quite high (ICC = 0.71). Convergent and discriminant validity testing showed satisfactory results. The questionnaire correlated well with established measures of social phobia such as the SPIN (r = 0.43, p < 0.001) and the SIAS (r = 0.54, p < 0.001). The BFNE also discriminated well between men and women with and without social phobia in the expected direction. Factor analysis supported a two-factor solution corresponding to positive and reverse-worded items. This validation study of the Iranian version of the BFNE showed that it is an acceptable, reliable and valid measure of social phobia.
However, since the scale showed a two-factor structure that does not conform to the theoretical basis of the BFNE, we suggest using the BFNE-II when it becomes available in Iran. A validation study of the BFNE-II is in progress.
Sampling designs for HIV molecular epidemiology with application to Honduras.
Shepherd, Bryan E; Rossini, Anthony J; Soto, Ramon Jeremias; De Rivera, Ivette Lorenzana; Mullins, James I
2005-11-01
Proper sampling is essential to characterize the molecular epidemiology of human immunodeficiency virus (HIV). HIV sampling frames are difficult to identify, so most studies use convenience samples. We discuss statistically valid and feasible sampling techniques that overcome some of the potential for bias due to convenience sampling and ensure better representation of the study population. We employ a sampling design called stratified cluster sampling. This first divides the population into geographical and/or social strata. Within each stratum, a population of clusters is chosen from groups, locations, or facilities where HIV-positive individuals might be found. Some clusters are randomly selected within strata and individuals are randomly selected within clusters. Variation and cost help determine the number of clusters and the number of individuals within clusters that are to be sampled. We illustrate the approach through a study designed to survey the heterogeneity of subtype B strains in Honduras.
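The stratified cluster design described above can be sketched as a two-stage draw: select clusters at random within each stratum, then individuals at random within each selected cluster. The sampling frame below is hypothetical, not the Honduran study data:

```python
import random

def stratified_cluster_sample(strata, n_clusters, n_per_cluster, seed=0):
    """Two-stage sample: random clusters within each stratum,
    then random individuals within each selected cluster.

    strata: dict mapping stratum name -> dict of cluster name -> list of ids
    """
    rng = random.Random(seed)
    selected = {}
    for stratum, clusters in strata.items():
        chosen = rng.sample(sorted(clusters), min(n_clusters, len(clusters)))
        selected[stratum] = {
            c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
            for c in chosen
        }
    return selected

# Hypothetical frame: two geographic strata, clinics as clusters
frame = {
    "north": {"clinic_a": list(range(30)), "clinic_b": list(range(25))},
    "south": {"clinic_c": list(range(40)), "clinic_d": list(range(20))},
}
sample = stratified_cluster_sample(frame, n_clusters=1, n_per_cluster=5)
```

In practice, as the abstract notes, the number of clusters and individuals per cluster would be tuned to variation and cost rather than fixed constants.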
Van den Broeck, Joke; Rossi, Gina; De Clercq, Barbara; Dierckx, Eva; Bastiaansen, Leen
2013-01-01
Research on the applicability of the five factor model (FFM) to capture personality pathology coincided with the development of an FFM personality disorder (PD) count technique, which has been validated in adolescent, young, and middle-aged samples. This study extends the literature by validating this technique in an older sample. Five alternative FFM PD counts based upon the Revised NEO Personality Inventory (NEO PI-R) are computed and evaluated in terms of both convergent and divergent validity with the Assessment of DSM-IV Personality Disorders Questionnaire (ADP-IV; DSM-IV: Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition). For the best working count for each PD, normative data are presented, from which cut-off scores are derived. The validity of these cut-offs and their usefulness as a screening tool is tested against both a categorical (the DSM-IV Text Revision) and a dimensional (the Dimensional Assessment of Personality Pathology; DAPP) measure of personality pathology. All but the Antisocial and Obsessive-Compulsive counts exhibited adequate convergent and divergent validity, supporting the use of this method in older adults. Using the ADP-IV and the DAPP Short Form as validation criteria, the results corroborate the use of the FFM PD count technique to screen for PDs in older adults, in particular for the Paranoid, Borderline, Histrionic, Avoidant, and Dependent PDs. Given the age-neutrality of the NEO PI-R and the considerable lack of valid personality assessment tools, the current findings appear promising for the assessment of pathology in older adults.
Reliability and validity of the Salford-Scott Nursing Values Questionnaire in Turkish.
Ulusoy, Hatice; Güler, Güngör; Yıldırım, Gülay; Demir, Ecem
2018-02-01
Developing professional values among nursing students is important because values are a significant predictor of the quality of care that will be provided, the clients' recognition, and consequently the nurses' job satisfaction. The literature analysis showed that there is only one validated tool available in Turkish that examines both the personal and the professional values of nursing students. The aim of this study was to assess the reliability and validity of the Salford-Scott Nursing Values Questionnaire in Turkish. This study was a Turkish linguistic and cultural adaptation of a research tool. Participants and research context: The sample of this study consisted of 627 undergraduate nursing students from different geographical areas of Turkey. Two questionnaires were used for data collection: a socio-demographic form and the Salford-Scott Nursing Values Questionnaire. For the Salford-Scott Nursing Values Questionnaire, construct validity was examined using factor analyses. Ethical considerations: The study was approved by the Cumhuriyet University Faculty of Medicine Research Ethics Board. Students were informed that participation in the study was entirely voluntary and anonymous. The item content validity index ranged from 0.66 to 1.0, and the total content validity index was 0.94. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.870, and Bartlett's test of sphericity was statistically significant (χ² = 3108.714, p < 0.001). Construct validity was examined using factor analyses, and six factors were identified. Cronbach's alpha was used to assess the internal consistency reliability, and a value of 0.834 was obtained. Our analyses showed that the Turkish version of the Salford-Scott Nursing Values Questionnaire has high validity and reliability.
Ruiz, Begoña; Urzúa, Iván; Cabello, Rodrigo; Rodríguez, Gonzalo; Espelid, Ivar
2013-01-01
To translate and validate a Spanish version of the "Questionnaire on the treatment of approximal and occlusal caries" as a method of collecting information about treatment decisions on caries management in Chilean primary health care services. The original questionnaire proposed by Espelid et al. was translated into Spanish using the forward-backward translation technique. Subsequently, validation of the Spanish version was undertaken. Data were collected from two separate samples; first, from 132 Spanish-speaking dentists recruited from primary health care services and second, from 21 individuals characterised as cariologists. Internal consistency was evaluated by the generation of Cronbach's alpha, test-retest reliability was evaluated by Cohen's kappa, convergent validity was evaluated by comparing the total scale scores to a global evaluation of treatment trends and discriminant validity was evaluated by investigating the differences in total scale scores between the Spanish-speaking dentist and cariologist samples. Cronbach's alpha indicated an internal consistency of 0.63 for the entire scale. Cohen's kappa correlation coefficient expressed a test-retest reliability of 0.83. Convergent validity determined a Pearson's correlation coefficient of 0.24 (p < 0.01). The comparison of proportions (chi-squared) indicated that discriminant validity was statistically significant (p < 0.01), using a one-tailed test. The Spanish version of the "Questionnaire on the treatment of approximal and occlusal caries" is a valid and reliable instrument for collecting information regarding treatment decisions in cariology. The clinical relevance of this study is to acquire a reliable instrument that allows for the determination of treatment decisions in Spanish-speaking dentists.
Harun, Norlida; Anderson, Robert A; Miller, Eleanor I
2009-01-01
An ELISA and a liquid chromatography-tandem mass spectrometry (LC-MS-MS) confirmation method were developed and validated for the identification and quantitation of ketamine and its major metabolite norketamine in urine samples. The Neogen ketamine microplate ELISA was optimized with respect to sample and enzyme conjugate volumes and the sample preincubation time before addition of the enzyme conjugate. The ELISA kit was validated to include an assessment of the dose-response curve, intra- and interday precision, limit of detection (LOD), and cross-reactivity. The sensitivity and specificity were calculated by comparison to the results from the validated LC-MS-MS confirmation method. An LC-MS-MS method was developed and validated with respect to LOD, lower limit of quantitation (LLOQ), linearity, recovery, intra- and interday precision, and matrix effects. The ELISA dose-response curve was a typical S-shaped binding curve, with a linear portion of the graph observed between 25 and 500 ng/mL for ketamine. The cross-reactivity of 200 ng/mL norketamine to ketamine was 2.1%, and no cross-reactivity was detected with 13 common drugs tested at 10,000 ng/mL. The ELISA LOD was calculated to be 5 ng/mL. Both intra- (n = 10) and interday (n = 50) precisions were below 5.0% at 25 ng/mL. The LOD for ketamine and norketamine was calculated statistically to be 0.6 ng/mL. The LLOQ values were also calculated statistically and were 1.9 ng/mL and 2.1 ng/mL for ketamine and norketamine, respectively. The test linearity was 0-1200 ng/mL with correlation coefficient (R(2)) > 0.99 for both analytes. Recoveries at 50, 500, and 1000 ng/mL ranged from 97.9% to 113.3%. Intra- (n = 5) and interday (n = 25) precisions between extracts for ketamine and norketamine were excellent (< 10%). Matrix effects analysis showed an average ion suppression of 5.7% for ketamine and an average ion enhancement of 13.0% for norketamine for urine samples collected from six individuals.
A comparison of ELISA and LC-MS-MS results demonstrated a sensitivity, specificity, and efficiency of 100%. These results indicated that a cutoff value of 25 ng/mL ketamine in the ELISA screen is particularly suitable and reliable for urine testing in a forensic toxicology setting. Furthermore, both ketamine and norketamine were detected in all 34 urine samples collected from individuals socializing in pubs by the Royal Malaysian Police. Ketamine concentrations detected by LC-MS-MS ranged from 22 to 31,670 ng/mL, and norketamine concentrations ranged from 25 to 10,990 ng/mL. The concentrations of ketamine and norketamine detected in the samples are most likely indicative of ketamine abuse.
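The abstract says the LOD and LLOQ were "calculated statistically" without giving the formula; one common convention (an assumption here, ICH-style) derives them from a calibration line as 3.3σ/S and 10σ/S, where σ is the residual standard deviation and S the slope. A sketch with made-up calibration points, not the study's values:

```python
import numpy as np

def lod_lloq(conc, response):
    """ICH-style detection limits from a calibration line:
    LOD = 3.3*sigma/slope, LLOQ = 10*sigma/slope,
    where sigma is the SD of the regression residuals."""
    slope, intercept = np.polyfit(conc, response, 1)
    residuals = response - (slope * conc + intercept)
    sigma = residuals.std(ddof=2)  # ddof=2: two fitted parameters
    return 3.3 * sigma / slope, 10 * sigma / slope

# Illustrative calibration data in ng/mL (hypothetical, not the study's)
conc = np.array([0.0, 5.0, 10.0, 25.0, 50.0, 100.0])
resp = np.array([0.1, 5.2, 9.8, 25.5, 49.6, 100.4])
lod, lloq = lod_lloq(conc, resp)
```

By construction the LLOQ is always 10/3.3 times the LOD under this convention, which is consistent with the roughly threefold gap between the 0.6 ng/mL LOD and ~2 ng/mL LLOQ values reported above.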
Plank, David W; Szpylka, John; Sapirstein, Harry; Woollard, David; Zapf, Charles M; Lee, Vong; Chen, C Y Oliver; Liu, Rui Hai; Tsao, Rong; Düsterloh, André; Baugh, Steve
2012-01-01
A colorimetric method for the determination of total antioxidant activity in a variety of foods and beverages was validated in both a single-laboratory validation and a collaborative laboratory validation study. The procedure involved extraction of the antioxidants directly into a methanol-water solution containing a known amount of 2,2'-diphenyl-1-picrylhydrazyl (DPPH), thus promoting the rapid reaction of extracted materials with DPPH. The reaction was monitored by spectrophotometric measurement of the absorbance loss at 517 nm. Antioxidant activity was quantified relative to a dilution series of vitamin E analog standards (Trolox), which were analyzed in parallel with the food and beverage samples. The antioxidant activities of the samples ranged from 131 to 131 000 micromole Trolox equivalents/100 g. Statistical analysis of the results showed that nine of the 11 matrixes gave acceptable HorRat values, indicating that the method performed well in these cases. The acceptable matrixes include pomegranate juice, blueberry juice, carrot juice, green tea, wine, rosemary spice, ready-to-eat cereal, and yogurt. Two samples failed the HorRat test: the first was an almond milk that had an antioxidant level below the practical LOQ for the method; the second was a sample of canola oil with added omega-3 fatty acid that was immiscible in the reaction medium.
ERIC Educational Resources Information Center
Nolan, Meaghan M.; Beran, Tanya; Hecker, Kent G.
2012-01-01
Students with positive attitudes toward statistics are likely to show strong academic performance in statistics courses. Multiple surveys measuring students' attitudes toward statistics exist; however, a comparison of the validity and reliability of interpretations based on their scores is needed. A systematic review of relevant electronic…
Alay, Asli; Usta, Taner A; Ozay, Pinar; Karadugan, Ozgur; Ates, Ugur
2014-05-01
The objective of this study was to compare classical blind endometrial tissue sampling with hysteroscopic biopsy sampling following methylene blue dyeing in premenopausal and postmenopausal patients with abnormal uterine bleeding. A prospective case-control study was carried out in the Office Hysteroscopy Unit. Fifty-four patients with complaints of abnormal uterine bleeding were evaluated. Data of 38 patients were included in the statistical analysis. Three groups were compared by examining samples obtained through hysteroscopic biopsy before and after methylene blue dyeing, and classical blind endometrial tissue sampling. First, the uterine cavity was evaluated with office hysteroscopy. Methylene blue dye was administered through the hysteroscopic inlet. Tissue samples were obtained from stained and non-stained areas. Blind endometrial sampling was performed in the same patients immediately after the hysteroscopy procedure. The results of hysteroscopic biopsy from methylene blue-stained and non-stained areas and blind biopsy were compared. No statistically significant differences were found in the comparison of biopsy samples obtained from methylene blue-stained areas, non-stained areas, and blind biopsy (P > 0.05). We suggest that chromohysteroscopy is not superior to blind endometrial sampling in cases of abnormal uterine bleeding. Further studies with greater sample sizes should be performed to assess the validity of routine use of endometrial dyeing. © 2014 The Authors. Journal of Obstetrics and Gynaecology Research © 2014 Japan Society of Obstetrics and Gynecology.
Schärer, Lars O; Krienke, Ute J; Graf, Sandra-Mareike; Meltzer, Katharina; Langosch, Jens M
2015-03-14
Long-term monitoring in bipolar affective disorders constitutes an important therapeutic and preventive method. The present study examines the validity of the Personal Life-Chart App (PLC App) in both German and English. This App is based on the National Institute of Mental Health's Life-Chart Method, the de facto standard for long-term monitoring in the treatment of bipolar disorders. Methods have largely been replicated from 2 previous Life-Chart studies. The participants documented Life-Charts with the PLC App on a daily basis. Clinicians assessed manic and depressive symptoms in clinical interviews using the Inventory of Depressive Symptomatology, clinician-rated (IDS-C) and the Young Mania Rating Scale (YMRS) on a monthly basis on average. Spearman correlations of the total scores of the IDS-C and YMRS were calculated with both the Life-Chart functional impairment rating and the mood rating documented with the PLC App. 44 subjects used the PLC App in German and 10 subjects used the PLC App in English. 118 clinical interviews from the German sub-sample and 97 from the English sub-sample were analysed separately. The results in both sub-samples are similar to previous Life-Chart validation studies. Again, statistically significant, high correlations were found between the Life-Chart function rating assigned through the PLC App and well-established observer-rated methods. Again, correlations were weaker for the Life-Chart mood rating than for the Life-Chart functional impairment rating. No relevant correlation was found between the Life-Chart mood rating and the YMRS in the German sub-sample. This study gives further evidence that the Life-Chart method is a valid tool for the recognition of both manic and depressive episodes. Documenting Life-Charts with the PLC App (English and German) does not seem to impair the validity of patient ratings.
Near infrared spectroscopy for prediction of antioxidant compounds in the honey.
Escuredo, Olga; Seijo, M Carmen; Salvador, Javier; González-Martín, M Inmaculada
2013-12-15
The selection of antioxidant variables in honey is considered here for the first time using the near infrared (NIR) spectroscopic technique. A total of 60 honey samples were used to develop the calibration models using the modified partial least squares (MPLS) regression method, and 15 samples were used for external validation. Calibration models on the honey matrix for the estimation of phenols, flavonoids, vitamin C, antioxidant capacity (DPPH), oxidation index and copper using near infrared (NIR) spectroscopy were satisfactorily obtained. These models were optimised by cross-validation, and the best model was evaluated according to the multiple correlation coefficient (RSQ), standard error of cross-validation (SECV), ratio performance deviation (RPD) and root mean standard error (RMSE) in the prediction set. These statistics suggested that the equations developed could be used for rapid determination of antioxidant compounds in honey. This work shows that near infrared spectroscopy can be considered a rapid tool for the nondestructive measurement of antioxidant constituents such as phenols, flavonoids, vitamin C and copper, as well as the antioxidant capacity of honey. Copyright © 2013 Elsevier Ltd. All rights reserved.
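The validation statistics named above (RSQ, RMSE and RPD) are straightforward to compute once a calibration model has produced predictions for an external validation set; a sketch with made-up phenol values (not the study's data):

```python
import numpy as np

def calibration_stats(reference, predicted):
    """Validation statistics commonly reported for NIR calibrations:
    RSQ (squared correlation), RMSE of prediction, and
    RPD = SD(reference) / RMSE (ratio of performance to deviation)."""
    reference = np.asarray(reference, float)
    predicted = np.asarray(predicted, float)
    rmse = np.sqrt(np.mean((reference - predicted) ** 2))
    rsq = np.corrcoef(reference, predicted)[0, 1] ** 2
    rpd = reference.std(ddof=1) / rmse
    return {"RSQ": rsq, "RMSE": rmse, "RPD": rpd}

# Illustrative reference phenol contents for a 15-sample validation set
ref = np.array([45, 60, 52, 70, 38, 65, 58, 49, 72, 41, 55, 63, 47, 68, 50], float)
pred = ref + np.array([2, -3, 1, 4, -2, 3, -1, 2, -4, 1, -2, 3, -1, 2, -2], float)
stats = calibration_stats(ref, pred)
```

An RPD above roughly 2 to 3 is conventionally taken as evidence that a calibration is usable for quantitative prediction rather than only rough screening.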
Silverstein, Michael J; Faraone, Stephen V; Alperin, Samuel; Leon, Terry L; Biederman, Joseph; Spencer, Thomas J; Adler, Lenard A
2018-02-01
The aim of this study is to validate the Adult ADHD Self-Report Scale (ASRS) and Adult ADHD Investigator Symptom Rating Scale (AISRS) expanded versions, including executive function deficits (EFDs) and emotional dyscontrol (EC) items, and to present ASRS and AISRS pilot normative data. Two patient samples (referred and primary care physician [PCP] controls) were pooled together for these analyses. The final analysis included 297 respondents, 171 with adult ADHD. Cronbach's alphas were high for all sections of the scales. Examining histograms of ASRS 31-item and AISRS 18-item total scores for ADHD controls, 95% cutoff scores were 70 and 23, respectively; histograms for the pilot normative sample suggest cutoffs of 82 and 26, respectively. (a) The ASRS- and AISRS-expanded versions have high validity in the assessment of the core 18 adult ADHD Diagnostic and Statistical Manual of Mental Disorders (DSM) symptoms and of EFD and EC symptoms. (b) ASRS (31-item) scores from 70 to 82 and AISRS (18-item) scores from 23 to 26 suggest a high likelihood of adult ADHD.
Gaudin, Valerie; Juhel-Gaugain, Murielle; Morétain, Jean-Pierre; Sanders, Pascal
2008-12-01
Premi Test contains viable spores of a strain of Bacillus stearothermophilus which is sensitive to antimicrobial residues, such as beta-lactams, tetracyclines, macrolides and sulphonamides. The growth of the strain is inhibited by the presence of antimicrobial residues in muscle tissue samples. Premi Test was validated according to AFNOR rules (French Association for Normalisation). The AFNOR validation was based on the comparison of reference methods (French Official method, i.e. four plate test (FPT) and the STAR protocol (five plate test)) with the alternative method (Premi Test). A preliminary study was conducted in an expert laboratory (Community Reference Laboratory, CRL) on both spiked and incurred samples (field samples). Several method performance criteria (sensitivity, specificity, relative accuracy) were estimated and are discussed, in addition to detection capabilities. Adequate agreement was found between the alternative method and the reference methods. However, Premi Test was more sensitive to beta-lactams and sulphonamides than the FPT. Subsequently, a collaborative study with 11 laboratories was organised by the CRL. Blank and spiked meat juice samples were sent to participants. The expert laboratory (CRL) statistically analysed the results. It was concluded that Premi Test could be used for the routine determination of antimicrobial residues in muscle of different animal origin with acceptable analytical performance. The detection capabilities of Premi Test for beta-lactams (amoxicillin, ceftiofur), one macrolide (tylosin) and tetracycline were at the level of the respective maximum residue limits (MRL) in muscle samples or even lower.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hertzler, C.L.; Poloski, J.P.; Bates, R.A.
1988-01-01
The Compliance Program Data Management System (DMS) developed at the Idaho National Engineering Laboratory (INEL) validates and maintains the integrity of data collected to support the Consent Order and Compliance Agreement (COCA) between the INEL and the Environmental Protection Agency (EPA). The system uses dBase III Plus programs and dBase III Plus in an interactive mode to enter, store, validate, manage, and retrieve analytical information provided on EPA Contract Laboratory Program (CLP) forms and CLP forms modified to accommodate 40 CFR 264 Appendix IX constituent analyses. Data analysis and presentation is performed utilizing SAS, a statistical analysis software program. Archiving of data and results is performed at appropriate stages of data management. The DMS is useful for sampling and analysis programs where adherence to EPA CLP protocol, along with maintenance and retrieval of waste site investigation sampling results, is desired or requested. 3 refs.
Reproducible and Verifiable Equations of State Using Microfabricated Materials
NASA Astrophysics Data System (ADS)
Martin, J. F.; Pigott, J. S.; Panero, W. R.
2017-12-01
Accurate interpretation of observable geophysical data, relevant to the structure, composition, and evolution of planetary interiors, requires precise determination of appropriate equations of state. We present the synthesis of controlled-geometry nanofabricated samples and insulation layers for the laser-heated diamond anvil cell. We present electron-gun evaporation, sputter deposition, and photolithography methods to mass-produce Pt/SiO2/Fe/SiO2 stacks and MgO insulating disks to be used in LHDAC experiments to reduce uncertainties in equation of state measurements due to large temperature gradients. We present a reanalysis of published iron PVT data to establish a statistically-valid extrapolation of the equation of state to inner core conditions with quantified uncertainties, addressing the complication of covariance in equation of state parameters. We use this reanalysis, together with the synthesized samples, to propose a scheme for measurement and validation of high-precision equations of state relevant to the Earth and super-Earth exoplanets.
Shortening the Xerostomia Inventory
Thomson, William Murray; van der Putten, Gert-Jan; de Baat, Cees; Ikebe, Kazunori; Matsuda, Ken-ichi; Enoki, Kaori; Hopcraft, Matthew; Ling, Guo Y
2011-01-01
Objectives To determine the validity and properties of the Summated Xerostomia Inventory-Dutch Version in samples from Australia, The Netherlands, Japan and New Zealand. Study design Six cross-sectional samples of older people from The Netherlands (N = 50), Australia (N = 637 and N = 245), Japan (N = 401) and New Zealand (N = 167 and N = 86). Data were analysed using the Summated Xerostomia Inventory-Dutch Version. Results Almost all data-sets revealed a single extracted factor which explained about half of the variance, with Cronbach’s alpha values of at least 0.70. When mean scale scores were plotted against a “gold standard” xerostomia question, statistically significant gradients were observed, with the highest score seen in those who always had dry mouth, and the lowest in those who never had it. Conclusion The Summated Xerostomia Inventory-Dutch Version is valid for measuring xerostomia symptoms in clinical and epidemiological research. PMID:21684773
Linking Associations of Rare Low-Abundance Species to Their Environments by Association Networks
Karpinets, Tatiana V.; Gopalakrishnan, Vancheswaran; Wargo, Jennifer; ...
2018-03-07
Studies of microbial communities by targeted sequencing of rRNA genes lead to recovering numerous rare low-abundance taxa with unknown biological roles. We propose to study associations of such rare organisms with their environments by a computational framework based on transformation of the data into qualitative variables. Namely, we analyze the sparse table of putative species or OTUs (operational taxonomic units) and samples generated in such studies, also known as an OTU table, by collecting statistics on co-occurrences of the species and on shared species richness across samples. Based on the statistics we built two association networks, of the rare putative species and of the samples respectively, using a known computational technique, Association networks (Anets), developed for analysis of qualitative data. Clusters of samples and clusters of OTUs are then integrated and combined with metadata of the study to produce a map of associated putative species in their environments. We tested and validated the framework on two types of microbiomes, of human body sites and that of the Populus tree root systems. We show that in both studies the associations of OTUs can separate samples according to environmental or physiological characteristics of the studied systems.
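The Anets technique itself is not detailed in the abstract, but the co-occurrence statistics it starts from can be sketched on a toy presence/absence OTU table (hypothetical OTUs and samples):

```python
from itertools import combinations

def cooccurrence_edges(otu_table, min_shared=2):
    """Edges between OTUs that are present together in at least
    `min_shared` samples. otu_table: dict OTU -> set of sample ids."""
    edges = {}
    for a, b in combinations(sorted(otu_table), 2):
        shared = len(otu_table[a] & otu_table[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges

# Toy presence/absence table: OTU -> samples where it was detected
table = {
    "otu1": {"s1", "s2", "s3"},
    "otu2": {"s2", "s3", "s4"},
    "otu3": {"s5"},
}
edges = cooccurrence_edges(table)  # only otu1-otu2 share >= 2 samples
```

The same counting, transposed, yields the shared-species-richness statistic between pairs of samples that underpins the second (sample) network described above.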
Psychometrics of chronic liver disease questionnaire in Chinese chronic hepatitis B patients
Zhou, Kai-Na; Zhang, Min; Wu, Qian; Ji, Zhen-Hao; Zhang, Xiao-Mei; Zhuang, Gui-Hua
2013-01-01
AIM: To evaluate the psychometrics of the Chinese (mainland) chronic liver disease questionnaire (CLDQ) in patients with chronic hepatitis B (CHB). METHODS: A cross-sectional sample of 460 Chinese patients with CHB was selected from the Outpatient Department of the Eighth Hospital of Xi’an, including CHB without cirrhosis (n = 323) and CHB-related cirrhosis (n = 137). The psychometric evaluation covered reliability, validity and sensitivity. Internal consistency reliability was measured using Cronbach’s α. Convergent and discriminant validity was evaluated by item-scale correlation. Factorial validity was explored by principal component analysis with varimax rotation. Sensitivity was assessed using Cohen’s effect size (ES) and independent-sample t tests between the CHB and CHB-related cirrhosis groups and between alanine aminotransferase (ALT) normal and abnormal groups after stratifying by disease (CHB and CHB-related cirrhosis). RESULTS: Internal consistency reliability of the CLDQ was 0.83 (range: 0.65-0.90). Most of the hypothesized item-scale correlations were 0.40 or higher, and all such hypothesized correlations were higher than the alternative ones, indicating satisfactory convergent and discriminant validity. Six factors were extracted after varimax rotation from the 29 items of the CLDQ. Eligible Cohen’s ES values with statistically significant independent-sample t tests were found for the overall CLDQ and the abdominal, systematic and activity scales (CHB vs CHB-related cirrhosis), and for the overall CLDQ and the abdominal scale in the stratification of patients with CHB (ALT normal vs abnormal). CONCLUSION: The CLDQ has acceptable reliability, validity and sensitivity in Chinese (mainland) patients with CHB. PMID:23801844
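Internal consistency of the kind reported for the CLDQ is conventionally computed as Cronbach's α from an item-score matrix. A minimal sketch (the function name and the respondents-by-items array `items` are illustrative, not taken from the study):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)
```

α approaches 1 as items become perfectly correlated and falls toward 0 for unrelated items; values around 0.83, as reported above, are usually read as good internal consistency.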
Romaniuk, Madeline; Khawaja, Nigar G
2013-09-25
The 30-item USDI is a self-report measure that assesses depressive symptoms among university students. It consists of three correlated factors: lethargy, cognitive-emotional and academic motivation. The current research used confirmatory factor analysis to assess construct validity and determine whether the original factor structure would be replicated in a different sample. Psychometric properties were also examined. Participants were 1148 students (mean age 22.84 years, SD = 6.85) across all faculties of a large Australian metropolitan university. Students completed a questionnaire comprising the USDI, the depression anxiety stress scale (DASS) and the Life Satisfaction Scale (LSS). The three-correlated-factor model was shown to be an acceptable fit to the data, indicating sound construct validity. Internal consistency of the scale was also demonstrated to be sound, with high Cronbach alpha values. Temporal stability of the scale was shown to be strong through test-retest analysis. Finally, concurrent and discriminant validity was examined with correlations between the USDI and the DASS subscales as well as the LSS, with sound results further supporting the construct validity of the scale. Cut-off points were also developed to aid interpretation of total scores. Response rates are unclear. In addition, the representativeness of the sample could potentially be improved through targeted recruitment (i.e. reviewing the online sample statistics during data collection, examining representativeness trends and addressing particular faculties within the university that were underrepresented). The USDI provides a valid and reliable method of assessing depressive symptoms among university students. © 2013 Elsevier B.V. All rights reserved.
Gerber, Markus; Lang, Christin; Lemola, Sakari; Colledge, Flora; Kalak, Nadeem; Holsboer-Trachsler, Edith; Pühse, Uwe; Brand, Serge
2016-05-31
A variety of objective and subjective methods exist to assess insomnia. The Insomnia Severity Index (ISI) was developed to provide a brief self-report instrument useful to assess people's perception of sleep complaints. The ISI was developed in English, and has been translated into several languages including German. Surprisingly, the psychometric properties of the German version have not been evaluated, although the ISI is often used with German-speaking populations. The psychometric properties of the ISI are tested in three independent samples: 1475 adolescents, 862 university students, and 533 police and emergency response service officers. In all three studies, participants provide information about insomnia (ISI), sleep quality (Pittsburgh Sleep Quality Index), and psychological functioning (diverse instruments). Descriptive statistics, gender differences, homogeneity and internal consistency, convergent validity, and factorial validity (including measurement invariance across genders) are examined in each sample. The findings show that the German version of the ISI has generally acceptable psychometric properties and sufficient concurrent validity. Confirmatory factor analyses show that a 1-factor solution achieves good model fit. Furthermore, measurement invariance across gender is supported in all three samples. While the ISI has been widely used in German-speaking countries, this study is the first to provide empirical evidence that the German version of this instrument has good psychometric properties and satisfactory convergent and factorial validity across various age groups and both men and women. Thus, the German version of the ISI can be recommended as a brief screening measure in German-speaking populations.
ERIC Educational Resources Information Center
Karadag, Engin; Caliskan, Nihat; Yesil, Rustu
2008-01-01
In this research, it is aimed to develop a scale to observe the body language which is used during an argument. A sample group of 266 teacher candidates study at the departments of Class, Turkish or Social Sciences at the Faculty of Education was used in this study. A logical and statistical approach was pursued during the development of scale. An…
Uncertainties in Coastal Ocean Color Products: Impacts of Spatial Sampling
NASA Technical Reports Server (NTRS)
Pahlevan, Nima; Sarkar, Sudipta; Franz, Bryan A.
2016-01-01
With increasing demands for ocean color (OC) products with improved accuracy and well characterized, per-retrieval uncertainty budgets, it is vital to decompose overall estimated errors into their primary components. Amongst various contributing elements (e.g., instrument calibration, atmospheric correction, inversion algorithms) in the uncertainty of an OC observation, less attention has been paid to uncertainties associated with spatial sampling. In this paper, we simulate MODIS (aboard both Aqua and Terra) and VIIRS OC products using 30 m resolution OC products derived from the Operational Land Imager (OLI) aboard Landsat-8, to examine impacts of spatial sampling on both cross-sensor product intercomparisons and in-situ validations of R(sub rs) products in coastal waters. Various OLI OC products representing different productivity levels and in-water spatial features were scanned for one full orbital-repeat cycle of each ocean color satellite. While some view-angle dependent differences in simulated Aqua-MODIS and VIIRS were observed, the average uncertainties (absolute) in product intercomparisons (due to differences in spatial sampling) at regional scales are found to be 1.8%, 1.9%, 2.4%, 4.3%, 2.7%, 1.8%, and 4% for the R(sub rs)(443), R(sub rs)(482), R(sub rs)(561), R(sub rs)(655), Chla, K(sub d)(482), and b(sub bp)(655) products, respectively. It is also found that, depending on in-water spatial variability and the sensor's footprint size, the errors for an in-situ validation station in coastal areas can reach as high as +/- 18%. 
We conclude that a) expected biases induced by spatial sampling in product intercomparisons are mitigated when products are averaged over at least 7 km × 7 km areas, b) VIIRS observations, with improved consistency in cross-track spatial sampling, yield more precise calibration/validation statistics than those of MODIS, and c) use of a single pixel centered on an in-situ coastal station provides an optimal sampling size for validation efforts. These findings have implications for enhancing our understanding of uncertainties in ocean color retrievals and for the planning of future ocean color missions and the associated calibration/validation exercises.
Authorization of Animal Experiments Is Based on Confidence Rather than Evidence of Scientific Rigor
Vogt, Lucile; Reichlin, Thomas S; Nathues, Christina; Würbel, Hanno
2016-01-01
Accumulating evidence indicates high risk of bias in preclinical animal research, questioning the scientific validity and reproducibility of published research findings. Systematic reviews found low rates of reporting of measures against risks of bias in the published literature (e.g., randomization, blinding, sample size calculation) and a correlation between low reporting rates and inflated treatment effects. That most animal research undergoes peer review or ethical review would offer the possibility to detect risks of bias at an earlier stage, before the research has been conducted. For example, in Switzerland, animal experiments are licensed based on a detailed description of the study protocol and a harm–benefit analysis. We therefore screened applications for animal experiments submitted to Swiss authorities (n = 1,277) for the rates at which the use of seven basic measures against bias (allocation concealment, blinding, randomization, sample size calculation, inclusion/exclusion criteria, primary outcome variable, and statistical analysis plan) were described and compared them with the reporting rates of the same measures in a representative sub-sample of publications (n = 50) resulting from studies described in these applications. Measures against bias were described at very low rates, ranging on average from 2.4% for statistical analysis plan to 19% for primary outcome variable in applications for animal experiments, and from 0.0% for sample size calculation to 34% for statistical analysis plan in publications from these experiments. Calculating an internal validity score (IVS) based on the proportion of the seven measures against bias, we found a weak positive correlation between the IVS of applications and that of publications (Spearman’s rho = 0.34, p = 0.014), indicating that the rates of description of these measures in applications partly predict their rates of reporting in publications. 
These results indicate that the authorities licensing animal experiments are lacking important information about experimental conduct that determines the scientific validity of the findings, which may be critical for the weight attributed to the benefit of the research in the harm–benefit analysis. Similar to manuscripts getting accepted for publication despite poor reporting of measures against bias, applications for animal experiments may often be approved based on implicit confidence rather than explicit evidence of scientific rigor. Our findings shed serious doubt on the current authorization procedure for animal experiments, as well as the peer-review process for scientific publications, which in the long run may undermine the credibility of research. Developing existing authorization procedures that are already in place in many countries towards a preregistration system for animal research is one promising way to reform the system. This would not only benefit the scientific validity of findings from animal experiments but also help to avoid unnecessary harm to animals for inconclusive research. PMID:27911892
Khan, Asaduzzaman; Chien, Chi-Wen; Bagraith, Karl S
2015-04-01
To investigate whether using a parametric statistic to compare groups leads to different conclusions when using summative scores from rating scales versus their corresponding Rasch-based measures. A Monte Carlo simulation study was designed to examine between-group differences in change scores derived from summative rating-scale scores and from their corresponding Rasch-based measures, using one-way analysis of variance. The degree of inconsistency between the two scoring approaches (i.e. summative and Rasch-based) was examined using varying sample sizes, scale difficulties and person ability conditions. The simulation revealed scaling artefacts that can arise from using summative scores rather than Rasch-based measures to determine changes between groups. The group differences in change scores were statistically significant for summative scores under all test conditions and sample size scenarios; however, none of the group differences were significant when using the corresponding Rasch-based measures. This study raises questions about the validity of inferences about group differences based on summative score changes in parametric analyses. Moreover, it provides a rationale for the use of Rasch-based measures, which allow valid parametric analyses of rating scale data.
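The between-group comparison described above rests on the one-way ANOVA F statistic. A self-contained sketch of that computation (a generic implementation, not the authors' simulation code):

```python
import numpy as np

def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of 1-D samples."""
    grand = np.concatenate(groups)
    k, n = len(groups), grand.size
    # between-group and within-group sums of squares
    ss_between = sum(g.size * (g.mean() - grand.mean()) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A large F relative to the F(k-1, n-k) distribution indicates that group means differ more than within-group noise would explain; the study's point is that this conclusion can flip depending on whether summative or Rasch-based change scores are fed in.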
NASA Astrophysics Data System (ADS)
Yoo, Donghoon; Lee, Joohyun; Lee, Byeongchan; Kwon, Suyong; Koo, Junemo
2018-02-01
The transient hot-wire method (THWM) was developed to measure the absolute thermal conductivity of gases, liquids, melts, and solids with low uncertainty, and the majority of nanofluid researchers have used it to measure the thermal conductivity of test fluids. Several reasons have been suggested for the discrepancies in these types of measurements, including nanofluid generation, nanofluid stability, and measurement challenges. Details of the transient hot-wire method, such as the test cell size, the temperature coefficient of resistance (TCR) and the sampling number, are investigated here to improve the accuracy and consistency of measurements across different researchers. It was observed that smaller test apparatuses perform better because they delay the onset of natural convection. TCR values of a coated platinum wire were measured and statistically analyzed to reduce the uncertainty in thermal conductivity measurements. For validation, the thermal conductivity of ethylene glycol (EG) and water was measured and analyzed in the temperature range between 280 and 310 K. Furthermore, a detailed statistical analysis was conducted for these measurements, and the results established the minimum number of samples required to achieve the desired resolution and precision. It is further proposed that researchers fully report the information related to their measurements, to allow validation and to avoid inconsistent nanofluid data in the future.
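In the idealized THWM model, the wire's temperature rise grows linearly in ln(t) with slope q/(4πk), so conductivity follows from a straight-line fit. A sketch under that idealization (variable names are illustrative, not from the paper):

```python
import numpy as np

def thwm_conductivity(t, dT, q):
    """Thermal conductivity from a transient hot-wire record.

    t  : sample times in s (after the early transient)
    dT : wire temperature rise in K
    q  : heat input per unit wire length in W/m
    """
    slope = np.polyfit(np.log(t), dT, 1)[0]   # K per unit of ln(t)
    return q / (4.0 * np.pi * slope)
```

Departure from linearity at late times signals the onset of natural convection, which is why the smaller test cells noted above (which delay that onset) gave better results.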
An experimental validation of a statistical-based damage detection approach.
DOT National Transportation Integrated Search
2011-01-01
In this work, a previously-developed, statistical-based, damage-detection approach was validated for its ability to : autonomously detect damage in bridges. The damage-detection approach uses statistical differences in the actual and : predicted beha...
Evidence-based dentistry: analysis of dental anxiety scales for children.
Al-Namankany, A; de Souza, M; Ashley, P
2012-03-09
To review paediatric dental anxiety measures (DAMs) and assess the statistical methods used for their validation and the clinical implications. A search of four computerised databases between 1960 and January 2011 for DAMs, using pre-specified search terms, to assess the methods of validation, including reliability as intra-observer agreement ('repeatability' or 'stability') and inter-observer agreement ('reproducibility'), and all types of validity. Fourteen paediatric DAMs were validated predominantly in schools rather than in the clinical setting, while five of the DAMs were not validated at all. The DAMs that were validated were validated against other paediatric DAMs that may not themselves have been validated previously. Reliability was not assessed in four of the DAMs; in the validation studies that did assess it, reliability was usually 'good' or 'acceptable'. None of the current DAMs used a formal sample-size technique. Diversity was seen between the studies, ranging from a few simple pictograms to lists of questions reported by either the individual or an observer. To date there is no scale that can be considered a gold standard, and there is a need to further develop an anxiety scale with a cognitive component for children and adolescents.
Survey of statistical techniques used in validation studies of air pollution prediction models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bornstein, R D; Anderson, S F
1979-03-01
Statistical techniques used by meteorologists to validate predictions made by air pollution models are surveyed. Techniques are divided into the following three groups: graphical, tabular, and summary statistics. Some of the practical problems associated with verification are also discussed. Characteristics desired in any validation program are listed and a suggested combination of techniques that possesses many of these characteristics is presented.
Sakunpak, Apirak; Suksaeree, Jirapornchai; Monton, Chaowalit; Pathompak, Pathamaporn; Kraisintu, Krisana
2014-02-01
To develop and validate an image analysis method for quantitative analysis of γ-oryzanol in cold pressed rice bran oil. TLC-densitometric and TLC-image analysis methods were developed, validated, and used for quantitative analysis of γ-oryzanol in cold pressed rice bran oil. The results obtained by these two different quantification methods were compared by paired t-test. Both assays provided good linearity, accuracy, reproducibility and selectivity for determination of γ-oryzanol. The TLC-densitometric and TLC-image analysis methods provided a similar reproducibility, accuracy and selectivity for the quantitative determination of γ-oryzanol in cold pressed rice bran oil. A statistical comparison of the quantitative determinations of γ-oryzanol in samples did not show any statistically significant difference between TLC-densitometric and TLC-image analysis methods. As both methods were found to be equal, they therefore can be used for the determination of γ-oryzanol in cold pressed rice bran oil.
NASA Astrophysics Data System (ADS)
da Silva Oliveira, C. I.; Martinez-Martinez, D.; Al-Rjoub, A.; Rebouta, L.; Menezes, R.; Cunha, L.
2018-04-01
In this paper, we present a statistical method for evaluating the degree of transparency of a thin film. To do so, the color coordinates are measured on different substrates and their standard deviation is evaluated. When the standard deviation is low, the color depends on the film and not on the substrate, and intrinsic colors are obtained. In contrast, transparent films lead to high standard deviations, since the color coordinates depend on the substrate. Between both extremes, colored films with a certain degree of transparency can be found. This method allows an objective and simple evaluation of the transparency of any film, improving on subjective visual inspection and avoiding the thickness problems associated with evaluation by optical spectroscopy. Zirconium oxynitride films deposited on three different substrates (Si, steel and glass) are used to test the validity of the method; the results have been validated with optical spectroscopy and agree with the visual impression of the samples.
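The method reduces to computing the spread of one film's color coordinates across substrates. A minimal sketch (the CIELAB representation and the function name are assumptions for illustration, not the authors' code):

```python
import numpy as np

def transparency_score(color_by_substrate):
    """Mean standard deviation of the colour coordinates of one film
    measured on several substrates: a low score means the colour is
    intrinsic to the film (opaque); a high score means the substrate
    shows through (transparent)."""
    coords = np.asarray(color_by_substrate, dtype=float)  # (n_substrates, 3)
    return coords.std(axis=0, ddof=1).mean()
```

An opaque film gives nearly identical coordinate rows and a score near zero, while a transparent film on Si, steel and glass gives widely spread rows and a large score.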
Trutschel, Diana; Palm, Rebecca; Holle, Bernhard; Simon, Michael
2017-11-01
Because not every scientific question on effectiveness can be answered with randomised controlled trials, research methods that minimise bias in observational studies are required. Two major concerns influence the internal validity of effect estimates: selection bias and clustering. Hence, to reduce the bias of the effect estimates, more sophisticated statistical methods are needed. The aim is to introduce statistical approaches such as propensity score matching and mixed models into representative real-world analysis, and to present their implementation in the statistical software R so that the results can be reproduced. We perform a two-level analytic strategy to address the problems of bias and clustering: (i) generalised models with different abilities to adjust for dependencies are used to analyse binary data, and (ii) genetic matching and covariate adjustment are used to adjust for selection bias. Hence, we analyse the data from two population samples: the sample produced by the matching method and the full sample. The different analysis methods in this article produce different results but still point in the same direction. In our example, the estimated probability of receiving a case conference is higher in the treatment group than in the control group. Both strategies, genetic matching and covariate adjustment, have their limitations but complement each other to provide a fuller picture. The statistical approaches were feasible for reducing bias but were nevertheless limited by the sample used. For each study and obtained sample, the pros and cons of the different methods have to be weighed. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
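The study uses genetic matching; a simpler relative that illustrates the matching step is 1:1 nearest-neighbour matching on pre-estimated propensity scores. This sketch is not the authors' method, and the score vectors are hypothetical:

```python
def nn_match(ps_treated, ps_control):
    """1:1 nearest-neighbour matching on propensity scores, without
    replacement. Returns, for each treated unit in order, the index
    of its matched control unit."""
    available = list(range(len(ps_control)))
    matches = []
    for p in ps_treated:
        # pick the still-unmatched control closest in propensity score
        j = min(available, key=lambda i: abs(ps_control[i] - p))
        matches.append(j)
        available.remove(j)
    return matches
```

For example, `nn_match([0.3, 0.7], [0.1, 0.69, 0.31])` pairs the first treated unit with control 2 and the second with control 1; outcomes are then compared within the matched sample, which is exactly the "sample produced by the matching method" contrasted with the full sample above.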
Genetic Programming as Alternative for Predicting Development Effort of Individual Software Projects
Chavoya, Arturo; Lopez-Martin, Cuauhtemoc; Andalon-Garcia, Irma R.; Meda-Campaña, M. E.
2012-01-01
Statistical and genetic programming techniques have been used to predict the software development effort of large software projects. In this paper, a genetic programming model was used for predicting the effort required in individually developed projects. Accuracy obtained from a genetic programming model was compared against one generated from the application of a statistical regression model. A sample of 219 projects developed by 71 practitioners was used for generating the two models, whereas another sample of 130 projects developed by 38 practitioners was used for validating them. The models used two kinds of lines of code as well as programming language experience as independent variables. Accuracy results from the model obtained with genetic programming suggest that it could be used to predict the software development effort of individual projects when these projects have been developed in a disciplined manner within a development-controlled environment. PMID:23226305
Pezzotti, Giuseppe; Zhu, Wenliang; Boffelli, Marco; Adachi, Tetsuya; Ichioka, Hiroaki; Yamamoto, Toshiro; Marunaka, Yoshinori; Kanamura, Narisato
2015-05-01
Raman spectroscopy has been applied quantitatively to the analysis of local crystallographic orientation in both single-crystal hydroxyapatite and human teeth. Raman selection rules for all the vibrational modes of the hexagonal structure were expanded into explicit functions of Euler angles in space and six Raman tensor elements (RTE). A theoretical treatment has also been put forward according to the orientation distribution function (ODF) formalism, which allows one to resolve the statistical orientation patterns of the nm-sized hydroxyapatite crystallites comprised in the Raman microprobe. Closed-form solutions could be obtained for the Euler angles and their statistical distributions resolved with respect to the direction of the average texture axis. Polarized Raman spectra from single-crystalline hydroxyapatite and textured polycrystalline (tooth enamel) samples were compared, and a validation of the proposed Raman method was obtained by confirming the agreement between RTE values obtained from different samples.
Unbiased estimation of oceanic mean rainfall from satellite borne radiometer measurements
NASA Technical Reports Server (NTRS)
Mittal, M. C.
1981-01-01
The statistical properties of radar-derived rainfall obtained during the GARP Atlantic Tropical Experiment (GATE) are used to derive quantitative estimates of the spatial and temporal sampling errors associated with estimating rainfall from brightness temperature measurements such as would be obtained from a satellite-borne microwave radiometer employing a practical-size antenna aperture. A basis for a method of correcting the so-called beam-filling problem, i.e., the effect of nonuniformity of rainfall over the radiometer beamwidth, is provided. The method presented employs the statistical properties of the observations themselves, without need for physical assumptions beyond those associated with the radiative transfer model. The simulation results presented offer a validation of the estimated accuracy that can be achieved, and the graphs included permit evaluation of the effect of antenna resolution on both the temporal and spatial sampling errors.
Challenges of Big Data Analysis.
Fan, Jianqing; Han, Fang; Liu, Han
2014-06-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.
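The spurious-correlation point is easy to reproduce: among many pure-noise variables, the largest sample correlation with an unrelated target grows with dimensionality even though every true correlation is zero. A small simulation (the sizes are chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 5000
y = rng.normal(size=n)            # target variable
X = rng.normal(size=(n, p))       # p noise variables, independent of y

# Pearson correlation of y with every column of X
xc = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
yc = (y - y.mean()) / y.std(ddof=1)
r = np.abs(xc.T @ yc) / (n - 1)

print(f"largest |corr| among {p} pure-noise variables: {r.max():.2f}")
```

With only 50 samples, the largest of 5,000 null correlations is typically well above 0.5, large enough to masquerade as a real effect, which is exactly the noise-accumulation hazard the abstract describes.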
Olsen, L R; Jensen, D V; Noerholm, V; Martiny, K; Bech, P
2003-02-01
We have developed the Major Depression Inventory (MDI), consisting of 10 items, covering the DSM-IV as well as the ICD-10 symptoms of depressive illness. We aimed to evaluate this as a scale measuring severity of depressive states with reference to both internal and external validity. Patients representing the score range from no depression to marked depression on the Hamilton Depression Scale (HAM-D) completed the MDI. Both classical and modern psychometric methods were applied for the evaluation of validity, including the Rasch analysis. In total, 91 patients were included. The results showed that the MDI had an adequate internal validity in being a unidimensional scale (the total score an appropriate or sufficient statistic). The external validity of the MDI was also confirmed as the total score of the MDI correlated significantly with the HAM-D (Pearson's coefficient 0.86, P < or = 0.01, Spearman 0.80, P < or = 0.01). When used in a sample of patients with different states of depression the MDI has an adequate internal and external validity.
Assessing the significance of pedobarographic signals using random field theory.
Pataky, Todd C
2008-08-07
Traditional pedobarographic statistical analyses are conducted over discrete regions. Recent studies have demonstrated that regionalization can corrupt pedobarographic field data through conflation when arbitrary dividing lines inappropriately delineate smooth field processes. An alternative is to register images such that homologous structures optimally overlap and then conduct statistical tests at each pixel to generate statistical parametric maps (SPMs). The significance of SPM processes may be assessed within the framework of random field theory (RFT). RFT is ideally suited to pedobarographic image analysis because its fundamental data unit is a lattice sampling of a smooth and continuous spatial field. To correct for the vast number of multiple comparisons inherent in such data, recent pedobarographic studies have employed a Bonferroni correction to retain a constant family-wise error rate. This approach unfortunately neglects the spatial correlation of neighbouring pixels, so provides an overly conservative (albeit valid) statistical threshold. RFT generally relaxes the threshold depending on field smoothness and on the geometry of the search area, but it also provides a framework for assigning p values to suprathreshold clusters based on their spatial extent. The current paper provides an overview of basic RFT concepts and uses simulated and experimental data to validate both RFT-relevant field smoothness estimations and RFT predictions regarding the topological characteristics of random pedobarographic fields. Finally, previously published experimental data are re-analysed using RFT inference procedures to demonstrate how RFT yields easily understandable statistical results that may be incorporated into routine clinical and laboratory analyses.
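As an illustrative aside, the contrast the abstract draws between Bonferroni and RFT thresholds can be sketched numerically. The sketch below uses the standard expected-Euler-characteristic approximation for a smooth 2D Gaussian field, E[EC] ≈ R · (4 ln 2)(2π)^(-3/2) · z · exp(−z²/2), where R is the resel count; the image size (10,000 pixels) and smoothness (FWHM = 10 px, hence R ≈ 100) are hypothetical numbers chosen for the example, not values from the paper.

```python
import math
from statistics import NormalDist

def bonferroni_z(alpha, n_pixels):
    """z-threshold controlling the family-wise error rate by Bonferroni
    correction over n_pixels independent pixel-level tests."""
    return NormalDist().inv_cdf(1 - alpha / n_pixels)

def rft_z(alpha, resels, lo=1.0, hi=10.0):
    """z-threshold at which the expected Euler characteristic of a smooth
    2D Gaussian random field equals alpha (2D approximation)."""
    def expected_ec(z):
        return resels * 4 * math.log(2) * (2 * math.pi) ** -1.5 * z * math.exp(-z * z / 2)
    # expected_ec is strictly decreasing for z > 1, so bisect for the root.
    for _ in range(80):
        mid = (lo + hi) / 2
        if expected_ec(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical plantar-pressure image: 10,000 pixels, smoothness FWHM = 10 px,
# so roughly 10,000 / 10**2 = 100 resels.
z_bonf = bonferroni_z(0.05, 10_000)
z_rft = rft_z(0.05, 100)
print(f"Bonferroni z = {z_bonf:.2f}, RFT z = {z_rft:.2f}")
```

For a field this smooth, the RFT threshold comes out below the Bonferroni one, illustrating the abstract's point that Bonferroni is valid but overly conservative when neighbouring pixels are correlated.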
The Performance Analysis Based on SAR Sample Covariance Matrix
Erten, Esra
2012-01-01
Multi-channel systems appear in several fields of application in science. In the Synthetic Aperture Radar (SAR) context, multi-channel systems may refer to different domains, such as multi-polarization, multi-interferometric or multi-temporal data, or even a combination of them. Due to the inherent speckle phenomenon present in SAR images, a statistical description of the data is almost mandatory for its utilization. The complex images acquired over natural media present in general zero-mean circular Gaussian characteristics. In this case, second-order statistics such as the multi-channel covariance matrix fully describe the data. For practical situations, however, the covariance matrix has to be estimated using a limited number of samples, and this sample covariance matrix follows the complex Wishart distribution. In this context, the eigendecomposition of the multi-channel covariance matrix has been shown, in different application areas, to be highly relevant to the physical properties of the imaged scene. Specifically, the maximum eigenvalue of the covariance matrix has frequently been used in applications such as target or change detection, estimation of the dominant scattering mechanism in polarimetric data, and moving target indication. In this paper, the statistical behavior of the maximum eigenvalue derived from the eigendecomposition of the sample multi-channel covariance matrix of multi-channel SAR images is simplified for the SAR community. Validation is performed against simulated data, and examples of estimation and detection problems using the analytical expressions are given as well. PMID:22736976
A risk score for in-hospital death in patients admitted with ischemic or hemorrhagic stroke.
Smith, Eric E; Shobha, Nandavar; Dai, David; Olson, DaiWai M; Reeves, Mathew J; Saver, Jeffrey L; Hernandez, Adrian F; Peterson, Eric D; Fonarow, Gregg C; Schwamm, Lee H
2013-01-28
We aimed to derive and validate a single risk score for predicting death from ischemic stroke (IS), intracerebral hemorrhage (ICH), and subarachnoid hemorrhage (SAH). Data from 333 865 stroke patients (IS, 82.4%; ICH, 11.2%; SAH, 2.6%; uncertain type, 3.8%) in the Get With The Guidelines-Stroke database were used. In-hospital mortality varied greatly according to stroke type (IS, 5.5%; ICH, 27.2%; SAH, 25.1%; unknown type, 6.0%; P<0.001). The patients were randomly divided into derivation (60%) and validation (40%) samples. Logistic regression was used to determine the independent predictors of mortality and to assign point scores for a prediction model in the overall population and in the subset with the National Institutes of Health Stroke Scale (NIHSS) recorded (37.1%). The c statistic, a measure of how well the models discriminate the risk of death, was 0.78 in the overall validation sample and 0.86 in the model including NIHSS. The model with NIHSS performed nearly as well in each stroke type as in the overall model including all types (c statistics for IS alone, 0.85; for ICH alone, 0.83; for SAH alone, 0.83; uncertain type alone, 0.86). The calibration of the model was excellent, as demonstrated by plots of observed versus predicted mortality. A single prediction score for all stroke types can be used to predict risk of in-hospital death following stroke admission. Incorporation of NIHSS information substantially improves this predictive accuracy.
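The c statistic reported in the abstract above has a simple interpretation: the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. A minimal sketch of that pairwise-concordance definition, on toy data (the risks and outcomes below are invented for illustration, not from the study):

```python
def c_statistic(risks, outcomes):
    """Concordance (c) statistic: fraction of (event, non-event) pairs in
    which the event patient received the higher predicted risk; ties count 1/2."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

# Toy example: hypothetical predicted in-hospital mortality risks.
risks = [0.10, 0.40, 0.35, 0.80]
died = [0, 0, 1, 1]
print(c_statistic(risks, died))  # 3 of the 4 (event, non-event) pairs are concordant -> 0.75
```

A value of 0.5 means no discrimination and 1.0 means perfect discrimination, which is why adding NIHSS (raising the c statistic from 0.78 to 0.86) represents a substantial improvement.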
Molaeinezhad, Mitra; Roudsari, Robab Latifnejad; Yousefy, Alireza; Salehi, Mehrdad; Khoei, Effat Merghati
2014-04-01
Vaginismus is considered one of the most common female psychosexual dysfunctions. Although the importance of using a multidisciplinary approach for the assessment of vaginal penetration disorder is emphasized, the paucity of instruments for this purpose is clear. We designed a study to develop and investigate the psychometric properties of a multidimensional vaginal penetration disorder questionnaire (MVPDQ), thereby assisting specialists in the clinical assessment of women with lifelong vaginismus (LLV). The MVPDQ was developed using the findings of a thematic qualitative study conducted with 20 unconsummated couples in a former study, followed by an extensive literature review. Then, in a cross-sectional design, a consecutive sample of 214 women diagnosed with LLV based on Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV-TR criteria completed the MVPDQ and additional questions regarding their demographic and sexual history. Validity and reliability were tested by exploratory factor analysis and Cronbach's alpha coefficient via Statistical Package for the Social Sciences (SPSS) version 16. After exploratory factor analysis, the MVPDQ emerged with 72 items and 9 dimensions: catastrophic cognitions and tightening, helplessness, marital adjustment, hypervigilance, avoidance, penetration motivation, sexual information, genital incompatibility, and optimism. The subscales of the MVPDQ showed significant reliability, varying between 0.70 and 0.87, and the test-retest results were satisfactory. The present study shows that the MVPDQ is a valid and reliable self-report questionnaire for the clinical assessment of women complaining of LLV. This instrument may assist specialists to make a clinical judgment and plan appropriately for clinical management.
Implementation of unsteady sampling procedures for the parallel direct simulation Monte Carlo method
NASA Astrophysics Data System (ADS)
Cave, H. M.; Tseng, K.-C.; Wu, J.-S.; Jermy, M. C.; Huang, J.-C.; Krumdieck, S. P.
2008-06-01
An unsteady sampling routine for a general parallel direct simulation Monte Carlo method called PDSC is introduced, allowing the simulation of time-dependent flow problems in the near-continuum range. A post-processing procedure called the DSMC rapid ensemble averaging method (DREAM) is developed to improve the statistical scatter in the results while minimising both memory and simulation time. This method builds an ensemble average of repeated runs over a small number of sampling intervals prior to the sampling point of interest by restarting the flow using either a Maxwellian distribution based on macroscopic properties, for near-equilibrium flows (DREAM-I), or the instantaneous particle data output by the original unsteady sampling of PDSC, for strongly non-equilibrium flows (DREAM-II). The method is validated by simulating shock tube flow and the development of simple Couette flow. Unsteady PDSC is found to accurately predict the flow field in both cases with significantly reduced run-times over single-processor code, and DREAM greatly reduces the statistical scatter in the results while maintaining accurate particle velocity distributions. Simulations are then conducted of two applications involving the interaction of shocks over wedges. The results of these simulations are compared to experimental data and simulations from the literature where these are available. In general, it was found that 10 ensembled runs of DREAM processing could reduce the statistical uncertainty in the raw PDSC data by 2.5-3.3 times, based on the limited number of cases in the present study.
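The 2.5-3.3x scatter reduction quoted for 10 ensembled runs is consistent with the usual 1/sqrt(N) behaviour of averaging independent noisy samples (sqrt(10) ≈ 3.16). A minimal sketch of that expectation, using invented numbers (the noise level and cell count below are illustrative, not DSMC data):

```python
import random

random.seed(0)

TRUE_VALUE = 1.0   # hypothetical macroscopic property in a DSMC cell
NOISE = 0.2        # statistical scatter of a single unsteady sample
N_RUNS = 10        # ensemble size, as in the DREAM study
N_CELLS = 2000     # independent estimates used to measure the scatter

def std(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

# Scatter of single-run estimates vs. of 10-run ensemble averages.
single = [TRUE_VALUE + random.gauss(0, NOISE) for _ in range(N_CELLS)]
ensembled = [TRUE_VALUE + sum(random.gauss(0, NOISE) for _ in range(N_RUNS)) / N_RUNS
             for _ in range(N_CELLS)]

reduction = std(single) / std(ensembled)
print(f"scatter reduced by a factor of {reduction:.2f}")  # ~ sqrt(10) ≈ 3.16
```

That the paper reports slightly less than sqrt(10) in some cases is plausible if the repeated DREAM runs are not fully statistically independent.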
Tarescavage, Anthony M; Alosco, Michael L; Ben-Porath, Yossef S; Wood, Arcangela; Luna-Jones, Lynn
2015-04-01
We investigated the internal structure comparability of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) scores derived from the MMPI-2 and MMPI-2-RF booklets in a sample of 320 criminal defendants (229 males and 54 females). After exclusion of invalid protocols, the final sample consisted of 96 defendants who were administered the MMPI-2-RF booklet and 83 who completed the MMPI-2. No statistically significant differences in MMPI-2-RF invalidity rates were observed between the two forms. Individuals in the final sample who completed the MMPI-2-RF did not statistically differ on demographics or referral question from those who were administered the MMPI-2 booklet. Independent t tests showed no statistically significant differences between MMPI-2-RF scores generated with the MMPI-2 and MMPI-2-RF booklets on the test's substantive scales. Statistically significant small differences were observed on the revised Variable Response Inconsistency (VRIN-r) and True Response Inconsistency (TRIN-r) scales. Cronbach's alpha and standard errors of measurement were approximately equal between the booklets for all MMPI-2-RF scales. Finally, MMPI-2-RF intercorrelations produced from the two forms yielded mostly small and a few medium differences, indicating that discriminant validity and test structure are maintained. Overall, our findings reflect the internal structure comparability of MMPI-2-RF scale scores generated from MMPI-2 and MMPI-2-RF booklets. Implications of these results and limitations of these findings are discussed. © The Author(s) 2014.
Postcraniometric sex and ancestry estimation in South Africa: a validation study.
Liebenberg, Leandi; Krüger, Gabriele C; L'Abbé, Ericka N; Stull, Kyra E
2018-05-24
With the acceptance of the Daubert criteria as the standards for best practice in forensic anthropological research, more emphasis is being placed on the validation of published methods. Methods, both traditional and novel, need to be validated, adjusted, and refined for optimal performance within forensic anthropological analyses. Recently, a custom postcranial database of modern South Africans was created for use in Fordisc 3.1. Classification accuracies of up to 85% for ancestry estimation and 98% for sex estimation were achieved using a multivariate approach. To measure the external validity and report more realistic performance statistics, an independent sample was tested. The postcrania from 180 black, white, and colored South Africans were measured and classified using the custom postcranial database. A decrease in accuracy was observed for both ancestry estimation (79%) and sex estimation (95%) of the validation sample. When incorporating both sex and ancestry simultaneously, the method achieved 70% accuracy, and 79% accuracy when sex-specific ancestry analyses were run. Classification matrices revealed that postcrania were more likely to misclassify as a result of ancestry rather than sex. While both sex and ancestry influence the size of an individual, sex differences are more marked in the postcranial skeleton and are therefore easier to identify. The external validity of the postcranial database was verified and therefore shown to be a useful tool for forensic casework in South Africa. While the classification rates were slightly lower than the original method, this is expected when a method is generalized.
Monacis, Lucia; Palo, Valeria de; Griffiths, Mark D; Sinatra, Maria
2016-12-01
Background and aims The inclusion of Internet Gaming Disorder (IGD) in Section III of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders has increased the interest of researchers in the development of new standardized psychometric tools for the assessment of such a disorder. To date, the nine-item Internet Gaming Disorder Scale - Short-Form (IGDS9-SF) has only been validated in English, Portuguese, and Slovenian languages. Therefore, the aim of this investigation was to examine the psychometric properties of the IGDS9-SF in an Italian-speaking sample. Methods A total of 757 participants were recruited to the present study. Confirmatory factor analysis and multi-group analyses were applied to assess the construct validity. Reliability analyses comprised the average variance extracted, the standard error of measurement, and the factor determinacy coefficient. Convergent and criterion validities were established through the associations with other related constructs. The receiver operating characteristic curve analysis was used to determine an empirical cut-off point. Results Findings confirmed the single-factor structure of the instrument, its measurement invariance at the configural level, and the convergent and criterion validities. Satisfactory levels of reliability and a cut-off point of 21 were obtained. Discussion and conclusions The present study provides validity evidence for the use of the Italian version of the IGDS9-SF and may foster research into gaming addiction in the Italian context.
Systematic Development and Validation of a Theory-Based Questionnaire to Assess Toddler Feeding12
Hurley, Kristen M.; Pepper, M. Reese; Candelaria, Margo; Wang, Yan; Caulfield, Laura E.; Latta, Laura; Hager, Erin R.; Black, Maureen M.
2013-01-01
This paper describes the development and validation of a 27-item caregiver-reported questionnaire on toddler feeding. The development of the Toddler Feeding Behavior Questionnaire was based on a theory of interactive feeding that incorporates caregivers’ responses to concerns about their children’s dietary intake, appetite, size, and behaviors rather than relying exclusively on caregiver actions. Content validity included review by an expert panel (n = 7) and testing in a pilot sample (n = 105) of low-income mothers of toddlers. Construct validity and reliability were assessed among a second sample of low-income mothers of predominately African-American (70%) toddlers aged 12–32 mo (n = 297) participating in the baseline evaluation of a toddler overweight prevention study. Internal consistency (Cronbach’s α: 0.64–0.87) and test-retest (0.57–0.88) reliability were acceptable for most constructs. Exploratory and confirmatory factor analyses revealed 5 theoretically derived constructs of feeding: responsive, forceful/pressuring, restrictive, indulgent, and uninvolved (root mean square error of approximation = 0.047, comparative fit index = 0.90, standardized root mean square residual = 0.06). Statistically significant (P < 0.05) convergent validity results further validated the scale, confirming established relations between feeding behaviors, toddler overweight status, perceived toddler fussiness, and maternal mental health. The Toddler Feeding Behavior Questionnaire adds to the field by providing a brief instrument that can be administered in 5 min to examine how caregiver-reported feeding behaviors relate to toddler health and behavior. PMID:24068792
Statistical validation of normal tissue complication probability models.
Xu, Cheng-Jian; van der Schaaf, Arjen; Van't Veld, Aart A; Langendijk, Johannes A; Schilstra, Cornelis
2012-09-01
To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Repeated double cross-validation revealed the uncertainty and instability of the NTCP models, and permutation testing showed that the statistical significance of model performance can be assessed. Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use. Copyright © 2012 Elsevier Inc. All rights reserved.
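The permutation-test idea in the abstract above can be sketched in a few lines: shuffle the outcome labels to destroy any real association, recompute the performance metric each time, and take the p-value as the fraction of shuffles that match or beat the observed performance. The data below are an invented toy stand-in for NTCP model scores, not values from the study.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via pairwise concordance (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_p_value(scores, labels, n_perm=500, seed=1):
    """P-value for the observed AUC under the null hypothesis that model
    scores and outcomes are unrelated, obtained by shuffling the labels."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auc(scores, shuffled) >= observed:
            hits += 1
    # +1 in numerator and denominator: the observed labelling counts as one permutation.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: model scores that clearly separate complication (1)
# from no-complication (0) cases.
scores = [0.1, 0.15, 0.2, 0.25, 0.3, 0.6, 0.7, 0.75, 0.8, 0.9]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
p = permutation_p_value(scores, labels)
print(f"permutation p-value: {p:.3f}")
```

A small p-value indicates that the model's discrimination is unlikely to arise from an uninformative model on these data, which is exactly the safeguard the authors recommend before clinical use.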
Esteban, Santiago; Rodríguez Tablado, Manuel; Peper, Francisco; Mahumud, Yamila S; Ricci, Ricardo I; Kopitowski, Karin; Terrasa, Sergio
2017-01-01
Precision medicine requires extremely large samples. Electronic health records (EHR) are thought to be a cost-effective source of data for that purpose. Phenotyping algorithms help reduce classification errors, making EHRs a more reliable source of information for research. Four algorithm development strategies for classifying patients according to their diabetes status (diabetic; non-diabetic; inconclusive) were tested: one codes-only algorithm, one Boolean algorithm, four statistical learning algorithms, and six stacked generalization meta-learners. The best-performing algorithms within each strategy were tested on the validation set. The stacked generalization algorithm yielded the highest Kappa coefficient in the validation set (0.95, 95% CI 0.91-0.98). The implementation of these algorithms allows data from thousands of patients to be exploited accurately, greatly reducing the cost of constructing retrospective cohorts for research.
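Stacked generalization, the winning strategy in the abstract above, trains a meta-learner on the out-of-fold predictions of several base learners. A minimal sketch using scikit-learn's StackingClassifier; the synthetic features are a stand-in for EHR-derived variables (codes, labs, medications) and do not reflect the study's actual feature set or models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for EHR features and a binary diabetes label.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

# Stacked generalization: cross-validated predictions of the base learners
# become the input features of a logistic-regression meta-learner.
stack = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("tree", DecisionTreeClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5)
stack.fit(X_train, y_train)
print(f"validation accuracy: {stack.score(X_val, y_val):.2f}")
```

The appeal of stacking here is that heterogeneous signals (diagnosis codes vs. lab values) are often best captured by different model families, and the meta-learner learns how much to trust each one.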
Brunault, Paul; Ballon, Nicolas; Gaillard, Philippe; Réveillère, Christian; Courtois, Robert
2014-05-01
The concept of food addiction has recently been proposed by applying the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision, criteria for substance dependence to eating behaviour. Food addiction has received increased attention given that it may play a role in binge eating, eating disorders, and the recent increase in obesity prevalence. Currently, there is no psychometrically sound tool for assessing food addiction in French. Our study aimed to test the psychometric properties of a French version of the Yale Food Addiction Scale (YFAS) by establishing its factor structure and construct validity in a nonclinical population. A total of 553 participants were assessed for food addiction (French version of the YFAS) and binge eating behaviour (Bulimic Investigatory Test Edinburgh and Binge Eating Scale). We tested the scale's factor structure (factor analysis for dichotomous data based on tetrachoric correlation coefficients), internal consistency, and construct validity with measures of binge eating. Our results supported a 1-factor structure, which accounted for 54.1% of the variance. This tool had adequate reliability and high construct validity with measures of binge eating in this population, both in its diagnosis and symptom count version. A 2-factor structure explained an additional 9.1% of the variance, and could differentiate between patients with high, compared with low, levels of insight regarding addiction symptoms. In our study, we validated a psychometrically sound French version of the YFAS, both in its symptom count and diagnostic version. Future studies should validate this tool in clinical samples.
The stroke impairment assessment set: its internal consistency and predictive validity.
Tsuji, T; Liu, M; Sonoda, S; Domen, K; Chino, N
2000-07-01
To study the scale quality and predictive validity of the Stroke Impairment Assessment Set (SIAS) developed for stroke outcome research. Rasch analysis of the SIAS; stepwise multiple regression analysis to predict discharge functional independence measure (FIM) raw scores from demographic data, the SIAS scores, and the admission FIM scores; cross-validation of the prediction rule. Tertiary rehabilitation center in Japan. One hundred ninety stroke inpatients for the study of the scale quality and the predictive validity; a second sample of 116 stroke inpatients for the cross-validation study. Mean square fit statistics to study the degree of fit to the unidimensional model; logits to express item difficulties; discharge FIM scores for the study of predictive validity. The degree of misfit was acceptable except for the shoulder range of motion (ROM), pain, visuospatial function, and speech items; and the SIAS items could be arranged on a common unidimensional scale. The difficulty patterns were identical at admission and at discharge except for the deep tendon reflexes, ROM, and pain items. They were also similar for the right- and left-sided brain lesion groups except for the speech and visuospatial items. For the prediction of the discharge FIM scores, the independent variables selected were age, the SIAS total scores, and the admission FIM scores; and the adjusted R2 was .64 (p < .0001). Stability of the predictive equation was confirmed in the cross-validation sample (R2 = .68, p < .001). The unidimensionality of the SIAS was confirmed, and the SIAS total scores proved useful for stroke outcome prediction.
Husbands, Adrian; Mathieson, Alistair; Dowell, Jonathan; Cleland, Jennifer; MacKenzie, Rhoda
2014-04-23
The UK Clinical Aptitude Test (UKCAT) was designed to address issues identified with traditional methods of selection. This study aims to examine the predictive validity of the UKCAT and compare this to traditional selection methods in the senior years of medical school. This was a follow-up study of two cohorts of students from two medical schools who had previously taken part in a study examining the predictive validity of the UKCAT in first year. The sample consisted of 4th and 5th Year students who commenced their studies at the University of Aberdeen or University of Dundee medical schools in 2007. Data collected were: demographics (gender and age group), UKCAT scores; Universities and Colleges Admissions Service (UCAS) form scores; admission interview scores; Year 4 and 5 degree examination scores. Pearson's correlations were used to examine the relationships between admissions variables, examination scores, gender and age group, and to select variables for multiple linear regression analysis to predict examination scores. Ninety-nine and 89 students at Aberdeen medical school from Years 4 and 5 respectively, and 51 Year 4 students in Dundee, were included in the analysis. Neither UCAS form nor interview scores were statistically significant predictors of examination performance. Conversely, the UKCAT yielded statistically significant validity coefficients between .24 and .36 in four of five assessments investigated. Multiple regression analysis showed the UKCAT made a statistically significant unique contribution to variance in examination performance in the senior years. Results suggest the UKCAT appears to predict performance better in the later years of medical school compared to earlier years and provides modest supportive evidence for the UKCAT's role in student selection within these institutions. Further research is needed to assess the predictive validity of the UKCAT against professional and behavioural outcomes as the cohort commences working life.
Siliquini, R; Saulle, R; Rabacchi, G; Bert, F; Massimi, A; Bulzomì, V; Boccia, A; La Torre, G
2012-01-01
The objective of this pilot study was to evaluate the reliability and validity of a web-based questionnaire for pregnant women as a tool to examine prevalence, knowledge, and attitudes about internet use for health-related purposes in a sample of Italian pregnant women. The questionnaire was composed of 9 sections, for a total of 73 items. Reliability was tested and content validity was evaluated using Cronbach's alpha to check internal consistency. Statistical analysis was performed with SPSS 13.0. The questionnaire was administered to 56 pregnant women. The highest value of Cronbach's alpha was obtained on 61 items (alpha = 0.786; all 73 items: alpha = 0.579). A high proportion of the pregnant women used the internet in general (87.5%), and 92.1% reported using the internet to acquire information about pregnancy (p < 0.0001). The questionnaire showed good reliability in the pilot study and performed well in terms of internal consistency and validity. Given the high prevalence of pregnant women who use the internet to search for information about their pregnancy, professional healthcare workers should give advice regarding official websites where they can retrieve safe, evidence-based information.
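Cronbach's alpha, the internal-consistency statistic used throughout the questionnaire-validation abstracts above, is k/(k-1) · (1 − Σ item variances / variance of totals) for k items. A minimal sketch on invented toy responses (population variances; real analyses typically use many respondents and SPSS or similar):

```python
def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns, one list per item,
    all of the same length (one score per respondent)."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent total score
    return k / (k - 1) * (1 - sum(var(item) for item in items) / var(totals))

# Hypothetical toy responses: 3 respondents x 2 items.
perfectly_consistent = [[1, 2, 3], [1, 2, 3]]
partly_consistent = [[1, 2, 3], [2, 1, 3]]
print(cronbach_alpha(perfectly_consistent))  # 1.0
print(cronbach_alpha(partly_consistent))     # ≈ 0.667
```

Dropping weakly correlated items shrinks the item-variance term relative to the total-score variance, which is why alpha rose from 0.579 (73 items) to 0.786 (61 items) in the pilot study.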
John B. Loomis; Hung Le Trong; Armando González-Cabán
2009-01-01
We estimate a marginal benefit function for using prescribed burning and mechanical fuel reduction programs to reduce acres burned by wildfire in three states. Since each state had different acre reductions, a statistically significant coefficient on the reduction in acres burned is also a split sample scope test frequently used as an indicator of the internal validity...
Parsons, Nick R; Price, Charlotte L; Hiskens, Richard; Achten, Juul; Costa, Matthew L
2012-04-25
The application of statistics in reported research in trauma and orthopaedic surgery has become ever more important and complex. Despite the extensive use of statistical analysis, it is still a subject which is often not conceptually well understood, resulting in clear methodological flaws and inadequate reporting in many papers. A detailed statistical survey sampled 100 representative orthopaedic papers using a validated questionnaire that assessed the quality of the trial design and statistical analysis methods. The survey found evidence of failings in study design, statistical methodology and presentation of the results. Overall, in 17% (95% confidence interval; 10-26%) of the studies investigated the conclusions were not clearly justified by the results, in 39% (30-49%) of studies a different analysis should have been undertaken and in 17% (10-26%) a different analysis could have made a difference to the overall conclusions. It is only by an improved dialogue between statistician, clinician, reviewer and journal editor that the failings in design methodology and analysis highlighted by this survey can be addressed.
Alladio, Eugenio; Caruso, Roberto; Gerace, Enrico; Amante, Eleonora; Salomone, Alberto; Vincenti, Marco
2016-05-30
The Technical Document TD2014EAAS was drafted by the World Anti-Doping Agency (WADA) in order to fight the spread of endogenous anabolic androgenic steroid (EAAS) misuse in several sport disciplines. In particular, adoption of the so-called Athlete Biological Passport (ABP) - Steroidal Module allowed control laboratories to identify anomalous EAAS concentrations within athletes' physiological urinary steroidal profiles. Gas chromatography (GC) combined with mass spectrometry (MS), indicated by WADA as an appropriate technique to detect urinary EAAS, was utilized in the present study to develop and fully validate an analytical method for the determination of all EAAS markers specified in TD2014EAAS, plus two further markers hypothetically useful to reveal microbial degradation of the sample. In particular, testosterone, epitestosterone, androsterone, etiocholanolone, 5α-androstane-3α,17β-diol, 5β-androstane-3α,17β-diol, dehydroepiandrosterone, and 5α-dihydrotestosterone were included in the analytical method. Afterwards, the multi-parametric feature of the ABP profile was exploited to develop a robust approach for the detection of EAAS misuse, based on multivariate statistical analysis. In particular, Principal Component Analysis (PCA) was combined with Hotelling T(2) tests to explore the EAAS data obtained from 60 sequential urine samples collected from six volunteers, in comparison with a reference population of single urine samples collected from 96 volunteers. The new approach proved capable of identifying anomalous results, including (i) the recognition of samples extraneous to each of the individual urine series and (ii) the discrimination of urine samples collected from individuals to whom "endogenous" steroids had been administered from the rest of the sample population. The proof-of-concept results presented in this study will need further extension and validation on a population of sport professionals. Copyright © 2016 Elsevier B.V. 
All rights reserved.
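The PCA-plus-Hotelling-T² screening described above can be sketched generically: project each sample into the principal-component space of the reference population and flag samples whose T² statistic is large. This is a toy reconstruction on synthetic data, not the laboratory's validated pipeline; the data, correlation structure, and decision threshold are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "steroid profile" data: 96 reference samples, 4 correlated markers
X = rng.normal(size=(96, 4))
X[:, 1] += 0.8 * X[:, 0]            # induce correlation between two markers

# Centre/scale the reference population, then PCA via SVD
mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
Z = (X - mu) / sd
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
eigvals = S**2 / (len(Z) - 1)       # variance captured by each component

def hotelling_t2(sample: np.ndarray) -> float:
    """Hotelling T^2 of one sample in the reference PCA space."""
    scores = ((sample - mu) / sd) @ Vt.T
    return float((scores**2 / eigvals).sum())

typical = X[0]
anomalous = mu + 6 * sd             # grossly elevated profile
```

Samples with T² above a population-derived cutoff would be the "anomalous results" flagged for follow-up.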
Silva, Jani; Cerqueira, Fátima; Medeiros, Rui
2015-10-15
Assuming a possible association between Y chromosome (Yc)-DNA and sexually transmitted infection (STI) transmission rate, could Yc-DNA be related to an increased prevalence of Human Papillomavirus (HPV), Herpes Simplex Virus (HSV-1/2) and Chlamydia trachomatis (CT)? Could Yc-DNA be used to validate self-reported condom use and sexual behaviors? Cervicovaginal (CV) self-collected samples of 612 Portuguese women at childbearing age were tested for Yc, HPV, HSV-1/2 and CT by polymerase chain reaction (PCR). The prevalence of Yc, HPV, CT and HSV-2 was 4.9%, 17.6%, 11.6% and 2.8%, respectively. There was a statistically significant trend for increased Yc-DNA prevalence in HPV positive samples [odds ratio (OR) 2.35, 95% confidence interval (CI) 1.03-5.31] and oral contraceptive (OC) use (OR 4.73, 95% CI 1.09-20.44). A protective effect of condom use was observed in Yc-DNA detection (OR 0.40, 95% CI 0.18-0.89). No statistically significant difference was found between Yc-DNA, CT and HSV-2 infection. HPV infection risk increased with age (>20 years), young age at first sexual intercourse (FSI) (≤18 years), >1 lifetime sexual partner (LSP) and OC use. Risk factors for CT infection were young age (≤20 years) and young age at FSI (≤18 years). HSV-2 infection risk increased with age (>20 years) and >1 LSP. Considering the prevalence of HPV and CT in Yc positive samples, we hypothesize a current infection due to recent sexual activity. The study of Yc PCR may add information as (i) a predictor of STI transmission and (ii) an indicative biomarker to validate self-reported condom use. Copyright © 2015. Published by Elsevier Inc.
The development of the Pictorial Thai Quality of Life.
Phattharayuttawat, Sucheera; Ngamthipwatthana, Thienchai; Pitiyawaranun, Buncha
2005-11-01
"Quality of life" has become a main focus of interest in medicine. The Pictorial Thai Quality of Life (PTQL) was developed in order to measure the Thai mental illness both in a clinical setting and community. The purpose of this study was to develop the Pictorial Thai Quality of Life (PTQL), having adequate and sufficient construct validity, discriminant power, concurrent validity, and reliability. To develop the Pictorial Thai Quality of Life Test, two samples groups were used in the present study: (1) pilot study samples: 30 samples and (2) survey samples were 672 samples consisting of normal, and psychiatric patients. The developing tests items were collected from a review of the literature in which all the items were based on the WHO definition of Quality of Life. Then, experts judgment by the Delphi technique was used in the first stage. After that a pilot study was used to evaluate the testing administration, and wording of the tests items. The final stage was collected data from the survey samples. The results of the present study showed that the final test was composed 25 items. The construct validity of this test consists of six domains: Physical, Cognitive, Affective, Social Function, Economic and Self-Esteem. All the PTQL items have sufficient discriminant power It was found to be statistically significant different at the. 001 level between those people with mental disorders and normal people. There was a high level of concurrent validity association with WHOQOL-BREF, Pearson correlation coefficient and Area under ROC curve were 0.92 and 0.97 respectively. The reliability coefficients for the Alpha coefficients of the PTQL total test was 0.88. The values of the six scales were from 0.81 to 0:91. The present study was directed at developing an effective psychometric properties pictorial quality of life questionnaire. 
The result will be a more direct and meaningful application of an instrument to detect the mental health illness poor quality of life in Thai communities.
Heterogenic Solid Biofuel Sampling Methodology and Uncertainty Associated with Prompt Analysis
Pazó, Jose A.; Granada, Enrique; Saavedra, Ángeles; Patiño, David; Collazo, Joaquín
2010-01-01
Accurate determination of the properties of biomass is of particular interest in studies on biomass combustion or cofiring. The aim of this paper is to develop a methodology for prompt analysis of heterogeneous solid fuels with an acceptable degree of accuracy. Special care must be taken with the sampling procedure to achieve an acceptable degree of error and low statistical uncertainty. A sampling and error determination methodology for prompt analysis is presented and validated. Two approaches for the propagation of errors are also given and some comparisons are made in order to determine which may be better in this context. Results show in general low, acceptable levels of uncertainty, demonstrating that the samples obtained in the process are representative of the overall fuel composition. PMID:20559506
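The paper's two error-propagation approaches are not reproduced here, but the first-order (Gaussian) propagation they build on can be sketched for a weighted sum of measured components. The function, component names, and uncertainty values below are invented for illustration:

```python
import math

def propagate_sum(values, sigmas, weights=None):
    """Uncertainty of a weighted sum y = sum(w_i * x_i),
    assuming independent errors: sigma_y^2 = sum((w_i * sigma_i)^2)."""
    if weights is None:
        weights = [1.0] * len(values)
    y = sum(w * v for w, v in zip(weights, values))
    sigma_y = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, sigmas)))
    return y, sigma_y

# e.g. ash + volatiles + fixed carbon fractions (%) of a fuel sample
y, sy = propagate_sum([5.2, 71.0, 23.8], [0.3, 0.4, 0.5])
```

Note the independence assumption: correlated sampling errors, which heterogeneous fuels can easily produce, require covariance terms.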
How Many Is Enough?—Statistical Principles for Lexicostatistics
Zhang, Menghan; Gong, Tao
2016-01-01
Lexicostatistics has been applied in linguistics to inform phylogenetic relations among languages. There are two important yet not well-studied parameters in this approach: the conventional size of vocabulary list to collect potentially true cognates and the minimum matching instances required to confirm a recurrent sound correspondence. Here, we derive two statistical principles from stochastic theorems to quantify these parameters. These principles validate the practice of using the Swadesh 100- and 200-word lists to indicate degree of relatedness between languages, and enable a frequency-based, dynamic threshold to detect recurrent sound correspondences. Using statistical tests, we further evaluate the generality of the Swadesh 100-word list compared to the Swadesh 200-word list and other 100-word lists sampled randomly from the Swadesh 200-word list. All these provide mathematical support for applying lexicostatistics in historical and comparative linguistics. PMID:28018261
Assessment of Processes of Change for Weight Management in a UK Sample
Andrés, Ana; Saldaña, Carmina; Beeken, Rebecca J.
2015-01-01
Objective The present study aimed to validate the English version of the Processes of Change questionnaire in weight management (P-Weight). Methods Participants were 1,087 UK adults, including people enrolled in a behavioural weight management programme, university students and an opportunistic sample. The mean age of the sample was 34.80 (SD = 13.56) years, and 83% were women. BMI ranged from 18.51 to 55.36 (mean = 25.92, SD = 6.26) kg/m2. Participants completed both the stages and processes questionnaires in weight management (S-Weight and P-Weight), and subscales from the EDI-2 and EAT-40. A refined version of the P-Weight consisting of 32 items was obtained based on the item analysis. Results The internal structure of the scale fitted a four-factor model, and statistically significant correlations with external measures supported the convergent validity of the scale. Conclusion The adequate psychometric properties of the P-Weight English version suggest that it could be a useful tool to tailor weight management interventions. PMID:25765163
Aleixandre-Tudo, José Luis; Nieuwoudt, Helené; Aleixandre, José Luis; Du Toit, Wessel J
2015-02-04
The validation of ultraviolet-visible (UV-vis) spectroscopy combined with partial least-squares (PLS) regression to quantify red wine tannins is reported. The methylcellulose precipitable (MCP) tannin assay and the bovine serum albumin (BSA) tannin assay were used as reference methods. To take the high variability of wine tannins into account when the calibration models were built, a diverse data set was collected from samples of South African red wines that consisted of 18 different cultivars, from regions spanning the wine grape-growing areas of South Africa with their various sites, climates, and soils, ranging in vintage from 2000 to 2012. A total of 240 wine samples were analyzed, and these were divided into a calibration set (n = 120) and a validation set (n = 120) to evaluate the predictive ability of the models. To test the robustness of the PLS calibration models, the predictive ability of the classifying variables cultivar, vintage year, and experimental versus commercial wines was also tested. In general, the statistics obtained when BSA was used as a reference method were slightly better than those obtained with MCP. Despite this, the MCP tannin assay should also be considered a valid reference method for developing PLS calibrations. The best calibration statistics for the prediction of new samples were coefficient of correlation (R²val) = 0.89, root mean square error of prediction (RMSEP) = 0.16, and residual predictive deviation (RPD) = 3.49 for MCP, and R²val = 0.93, RMSEP = 0.08, and RPD = 4.07 for BSA, when only the UV region (260-310 nm) was selected, which also led to a faster analysis time. In addition, the difference in results obtained when the predictive ability of the classifying variables vintage, cultivar, or commercial versus experimental wines was studied suggests that tannin composition is highly affected by many factors. 
This study also discusses the correlations in tannin values between the methylcellulose and protein precipitation methods.
Kerig, Patricia K; Charak, Ruby; Chaplo, Shannon D; Bennett, Diana C; Armour, Cherie; Modrowski, Crosby A; McGee, Andrew B
2016-09-01
The inclusion of a dissociative subtype in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) criteria for the diagnosis of posttraumatic stress disorder (PTSD) has highlighted the need for valid and reliable measures of dissociative symptoms across developmental periods. The Adolescent Dissociative Experiences Scale (A-DES) is one of the few measures validated for young persons, but previous studies have yielded inconsistent results regarding its factor structure. Further, research to date on the A-DES has been based upon nonclinical samples of youth or those without a known history of trauma. To address these gaps in the literature, the present study investigated the factor structure and construct validity of the A-DES in a sample of highly trauma-exposed youth involved in the juvenile justice system. A sample of 784 youth (73.7% boys) recruited from a detention center completed self-report measures of trauma exposure and the A-DES, a subset of whom (n = 212) also completed a measure of PTSD symptoms. Confirmatory factor analyses revealed a best-fitting 3-factor structure comprised of depersonalization or derealization, amnesia, and loss of conscious control, with configural and metric invariance across gender. Logistic regression analyses indicated that the depersonalization or derealization factor effectively distinguished between those youth who did and did not likely meet criteria for a diagnosis of PTSD, as well as those with PTSD who did and did not likely meet criteria for the dissociative subtype. These results provide support for the multidimensionality of the construct of posttraumatic dissociation and contribute to the understanding of the dissociative subtype of PTSD among adolescents. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
Osborne, N J; Koplin, J J; Martin, P E; Gurrin, L C; Thiele, L; Tang, M L; Ponsonby, A-L; Dharmage, S C; Allen, K J
2010-10-01
The incidence of hospital admissions for food allergy-related anaphylaxis in Australia has increased, in line with world-wide trends. However, a valid measure of food allergy prevalence and risk factor data from a population-based study is still lacking. We describe the study design and methods used to recruit infants from a population for skin prick testing and oral food challenges, and use preliminary data to investigate the extent to which the study sample is representative of the target population. The study sampling frame comprised 12-month-old infants presenting for routine scheduled vaccination at immunization clinics in Melbourne, Australia. We compared demographic features of participating families to population summary statistics from the Victorian Perinatal census database, and administered a survey to those non-responders who chose not to participate in the study. The study design proved acceptable to the community, with good uptake (response rate 73.4%) and 2171 participants recruited. Demographic information on the study population mirrored the Victorian population, with most of the population parameters measured falling within our confidence intervals (CI). Use of a non-responder questionnaire revealed that a higher proportion of infants who declined to participate (non-responders) were already eating and tolerating peanut than those agreeing to participate (54.4%; 95% CI 50.8, 58.0 vs. 27.4%; 95% CI 25.5, 29.3 among participants). A high proportion of individuals approached in a community setting participated in a food allergy study. The study population differed from the eligible sample in relation to family history of allergy and prior consumption and tolerance of peanut, providing some insights into the internal validity of the sample. The study exhibited external validity on general demographics to all births in Victoria. © 2010 Blackwell Publishing Ltd.
Ding, Ding; Hofstetter, C Richard; Norman, Gregory J; Irvin, Veronica L; Chhay, Douglas; Hovell, Melbourne F
2011-02-01
Immigration involves challenges and distress, which affect health and well-being of immigrants. Koreans are a recent, fast-growing, but understudied group of immigrants in the USA, and no study has established or evaluated any immigration stress measure among this population. This study explores psychometric properties of Korean-translated Demands of Immigration (DI) Scale among first-generation female Korean immigrants in California. Analyses included evaluation of factor structure, reliability, validity, and descriptive statistics of subscales. A surname-driven sampling strategy was applied to randomly select a representative sample of adult female Korean immigrants in California. Telephone interviews were conducted by trained bilingual interviewers. Study sample included 555 first-generation female Korean immigrants who were interviewed in Korean language. The 22-item DI Scale was used to assess immigration stress in the study sample. Exploratory factor analysis suggested six correlated factors in the DI Scale: language barriers; sense of loss; not feeling at home; perceived discrimination; novelty; and occupation. Confirmatory factor analysis validated the factor structure. Language barriers accounted for the most variance of the DI Scale (29.11%). The DI Scale demonstrated good internal consistency reliability and construct validity. Evidence has been offered that the Korean-translated DI Scale is a reliable and valid measurement tool to examine immigration stress among Korean immigrants. The Korean-translated DI Scale has replicated factor structure obtained in other ethnicities, but addition of cultural-specific items is suggested for Korean immigrants. High levels of language and occupation-related stress warrant attention from researchers, social workers, and policy-makers. Findings from this study will inform future interventions to alleviate stress due to demands of immigration.
Cloke, Jonathan; Evans, Katharine; Crabtree, David; Hughes, Annette; Simpson, Helen; Holopainen, Jani; Wickstrand, Nina; Kauppinen, Mikko; Leon-Velarde, Carlos; Larson, Nathan; Dave, Keron
2014-01-01
The Thermo Scientific SureTect Listeria species Assay is a new real-time PCR assay for the detection of all species of Listeria in food and environmental samples. This validation study was conducted using the AOAC Research Institute (RI) Performance Tested Methods program to validate the SureTect Listeria species Assay in comparison to the reference method detailed in International Organization for Standardization 11290-1:1996 including amendment 1:2004 in a variety of foods plus plastic and stainless steel. The food matrixes validated were smoked salmon, processed cheese, fresh bagged spinach, cantaloupe, cooked prawns, cooked sliced turkey meat, cooked sliced ham, salami, pork frankfurters, and raw ground beef. All matrixes were tested by Thermo Fisher Scientific, Microbiology Division, Basingstoke, UK. In addition, three matrixes (pork frankfurters, fresh bagged spinach, and stainless steel surface samples) were analyzed independently as part of the AOAC-RI-controlled independent laboratory study by the University of Guelph, Canada. Using probability of detection statistical analysis, a significant difference in favour of the SureTect assay was demonstrated between the SureTect and reference method for high-level spiked samples of pork frankfurters, smoked salmon, cooked prawns, stainless steel, and low-level spiked samples of salami. For all other matrixes, no significant difference was seen between the two methods during the study. Inclusivity testing was conducted with 68 different isolates of Listeria species, all of which were detected by the SureTect Listeria species Assay. None of the 33 exclusivity isolates were detected by the SureTect Listeria species Assay. Ruggedness testing was conducted to evaluate the performance of the assay with specific method deviations outside of the recommended parameters open to variation, which demonstrated that the assay gave reliable performance. 
Accelerated stability testing was additionally conducted, validating the assay shelf life.
Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch
2017-06-06
An important aspect of chemoinformatics and materials informatics is the use of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets of noise. RANSAC could be used as a "one stop shop" algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development, and predictions for test set samples using an applicability domain. For "future" predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in materials informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
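The RANSAC idea the authors transfer from image processing, repeatedly fit on a minimal random subset and keep the model with the most inliers, can be sketched on simple linear data. This is a toy re-implementation with invented data, not the authors' QSAR pipeline:

```python
import numpy as np

def ransac_line(x, y, n_iter=100, threshold=1.0, seed=0):
    """Fit y = a*x + b robustly: sample point pairs, keep the candidate
    with the most inliers, then refit on those inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(x), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue                      # degenerate pair, skip
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        inliers = np.abs(y - (a * x + b)) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return np.polyfit(x[best_inliers], y[best_inliers], 1), best_inliers

x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0
y[:3] = 100.0                             # three gross outliers
coeffs, inliers = ransac_line(x, y)
```

An ordinary least-squares fit on the same data would be dragged badly toward the outliers; the consensus step is what makes the final fit, and hence the resulting model, robust.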
Less is more? Assessing the validity of the ICD-11 model of PTSD across multiple trauma samples
Hansen, Maj; Hyland, Philip; Armour, Cherie; Shevlin, Mark; Elklit, Ask
2015-01-01
Background In the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), the symptom profile of posttraumatic stress disorder (PTSD) was expanded to include 20 symptoms. An alternative model of PTSD is outlined in the proposed 11th edition of the International Classification of Diseases (ICD-11) that includes just six symptoms. Objectives and method The objectives of the current study are: 1) to independently investigate the fit of the ICD-11 model of PTSD, and three DSM-5-based models of PTSD, across seven different trauma samples (N=3,746) using confirmatory factor analysis; 2) to assess the concurrent validity of the ICD-11 model of PTSD; and 3) to determine if there are significant differences in diagnostic rates between the ICD-11 guidelines and the DSM-5 criteria. Results The ICD-11 model of PTSD was found to provide excellent model fit in six of the seven trauma samples, and tests of factorial invariance showed that the model performs equally well for males and females. DSM-5 models provided poor fit of the data. Concurrent validity was established as the ICD-11 PTSD factors were all moderately to strongly correlated with scores of depression, anxiety, dissociation, and aggression. Levels of association were similar for ICD-11 and DSM-5 suggesting that explanatory power is not affected due to the limited number of items included in the ICD-11 model. Diagnostic rates were significantly lower according to ICD-11 guidelines compared to the DSM-5 criteria. Conclusions The proposed factor structure of the ICD-11 model of PTSD appears valid across multiple trauma types, possesses good concurrent validity, and is more stringent in terms of diagnosis compared to the DSM-5 criteria. PMID:26450830
Validity of the CAGE questionnaire for men who have sex with men (MSM) in China.
Chen, Yen-Tyng; Ibragimov, Umedjon; Nehl, Eric J; Zheng, Tony; He, Na; Wong, Frank Y
2016-03-01
Detection of heavy drinking among men who have sex with men (MSM) is crucial for both intervention and treatment. The CAGE questionnaire is a popular screening instrument for alcohol use problems. However, the validity of the CAGE for Chinese MSM is unknown. Data were from three waves of cross-sectional assessments among general MSM (n=523) and men who sell sex to other men ("money boys" or MBs, n=486) in Shanghai, China. Specifically, participants were recruited using respondent-driven, community popular opinion leader, and venue-based sampling methods. The validity of the CAGE was examined for different cutoff scores and individual CAGE items using self-reported heavy drinking (≥14 drinks in the past week) as a criterion. In the full sample, 75 (7.4%) of the participants were classified as heavy drinkers: 32 (6.1%) of general MSM and 43 (8.9%) of MBs. The area under the curve statistic for the overall sample was 0.7 (95% CI: 0.36-0.77). Overall, the sensitivities (ranging from 18.7 to 66.7%), specificities (ranging from 67.5 to 95.8%), and positive predictive values (ranging from 14.1 to 26.4%) for different cutoff scores were inadequate using past-week heavy drinking as the criterion. The ability of the CAGE to discriminate heavy drinkers from non-heavy drinkers was limited. Our findings showed the inadequate validity of the CAGE as a screening instrument for current heavy drinking in Chinese MSM. Further research using a combination of validity criteria is needed to determine the applicability of the CAGE for this population. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
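The screening metrics reported above all follow from a 2x2 confusion table of screen result versus criterion. A generic sketch with invented counts, not the study's data:

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and positive predictive value
    from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "ppv": tp / (tp + fp),           # precision of a positive screen
    }

m = screening_metrics(tp=50, fp=150, fn=25, tn=775)
```

Note how a low-prevalence criterion (here 75 of 1000) keeps PPV low even at decent specificity, one reason a screen can look "inadequate" in a population like this.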
Design and validation of a model to predict early mortality in haemodialysis patients.
Mauri, Joan M; Clèries, Montse; Vela, Emili
2008-05-01
Mortality and morbidity rates are higher in patients receiving haemodialysis therapy than in the general population. Detection of risk factors related to early death in these patients could aid clinical and administrative decision making. The aims of this study were (1) to identify risk factors (comorbidity and variables specific to haemodialysis) associated with death in the first year following the start of haemodialysis and (2) to design and validate a prognostic model to quantify the probability of death for each patient. An analysis was carried out on all patients starting haemodialysis treatment in Catalonia during the period 1997-2003 (n = 5738). The data source was the Renal Registry of Catalonia, a mandatory population registry. Patients were randomly divided into two samples: 60% (n = 3455) of the total were used to develop the prognostic model and the remaining 40% (n = 2283) to validate it. Logistic regression analysis was used to construct the model. One-year mortality in the total study population was 16.5%. The predictive model included the following variables: age, sex, primary renal disease, grade of functional autonomy, chronic obstructive pulmonary disease, malignant processes, chronic liver disease, cardiovascular disease, initial vascular access and malnutrition. The analyses showed adequate calibration in both the development sample and the validation sample (Hosmer-Lemeshow statistic 0.97 and P = 0.49, respectively) as well as adequate discrimination (area under the ROC curve 0.78 in both cases). Risk factors implicated in mortality at one year following the start of haemodialysis have been determined and a prognostic model designed. The validated, easy-to-apply model quantifies individual patient risk attributable to various factors, some of them amenable to correction by directed interventions.
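The discrimination figure quoted above (area under the ROC curve, 0.78) can be computed without drawing a curve, via the rank (Mann-Whitney) identity: AUC is the probability that a randomly chosen death received a higher predicted risk than a randomly chosen survivor. A generic sketch on invented risk scores, not the registry data:

```python
def auc_rank(pos_scores, neg_scores):
    """AUC as P(score_pos > score_neg), ties counted as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# predicted 1-year death probabilities: 3 deaths vs 4 survivors (invented)
auc = auc_rank([0.9, 0.6, 0.4], [0.3, 0.5, 0.2, 0.1])
```

An AUC of 0.5 means the model ranks no better than chance; 1.0 means every death outranked every survivor.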
Crestani, Anelise Henrich; Moraes, Anaelena Bragança de; Souza, Ana Paula Ramos de
2017-08-10
To analyze the results of the validation of enunciative signs of language acquisition for children aged 3 to 12 months. The signs were built based on mechanisms of language acquisition in an enunciative perspective and on clinical experience with language disorders. The signs were submitted to judgment of clarity and relevance by a sample of six experts: doctors in linguistics with knowledge of psycholinguistics and the language clinic. In the validation of reliability, two judges/evaluators helped apply the instruments to videos of 20% of the total sample of mother-infant dyads, using the inter-evaluator method. The internal consistency method was applied to the total sample, which consisted of 94 mother-infant dyads for the contents of Phase 1 (3-6 months) and 61 mother-infant dyads for the contents of Phase 2 (7-12 months). The data were collected through analysis of mother-infant interaction based on filming of the dyads and application of the parameters to be validated according to the child's age. Data were organized in a spreadsheet and then transferred to statistical software for analysis. The judgments of clarity/relevance indicated no modifications to be made to the instruments. The reliability test showed almost perfect agreement between judges (0.8 ≤ kappa ≤ 1.0); only item 2 of Phase 1 showed substantial agreement (0.6 ≤ kappa ≤ 0.79). The internal consistency for Phase 1 gave alpha = 0.84, and for Phase 2, alpha = 0.74. This demonstrates the reliability of the instruments. The results suggest adequacy of the content validity of the instruments created for both age groups, demonstrating the relevance of the content of enunciative signs of language acquisition.
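The inter-evaluator agreement statistic used above is Cohen's kappa, which corrects raw agreement for the agreement two raters would reach by chance. A minimal two-rater sketch on invented labels, not the study's video codings:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[k] * cb[k] for k in ca) / n**2   # chance agreement
    return (observed - expected) / (1 - expected)

a = ["yes", "yes", "no", "no", "yes", "no"]
b = ["yes", "yes", "no", "yes", "yes", "no"]
kappa = cohens_kappa(a, b)
```

On the conventional scale cited in the abstract, 0.61-0.80 is "substantial" and 0.81-1.00 "almost perfect" agreement.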
Bias correction for selecting the minimal-error classifier from many machine learning models.
Ding, Ying; Tang, Shaowu; Liao, Serena G; Jia, Jia; Oesterreich, Steffi; Lin, Yan; Tseng, George C
2014-11-15
Supervised machine learning is commonly applied in genomic research to construct a classifier from the training data that is generalizable to predict independent testing data. When test datasets are not available, cross-validation is commonly used to estimate the error rate. Many machine learning methods are available, and it is well known that no universally best method exists in general. It has been a common practice to apply many machine learning methods and report the method that produces the smallest cross-validation error rate. Theoretically, such a procedure produces a selection bias. Consequently, many clinical studies with moderate sample sizes (e.g. n = 30-60) risk reporting a falsely small cross-validation error rate that could not be validated later in independent cohorts. In this article, we illustrated the probabilistic framework of the problem and explored the statistical and asymptotic properties. We proposed a new bias correction method based on learning curve fitting by inverse power law (IPL) and compared it with three existing methods: nested cross-validation, weighted mean correction and the Tibshirani-Tibshirani procedure. All methods were compared in simulation datasets, five moderate-size real datasets and two large breast cancer datasets. The results showed that IPL outperforms the other methods in bias correction with smaller variance, and it has the additional advantage of extrapolating error estimates for larger sample sizes, a practical feature for deciding whether more samples should be recruited to improve classifier accuracy. An R package 'MLbias' and all source files are publicly available. tsenglab.biostat.pitt.edu/software.htm. ctseng@pitt.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
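The selection bias described here is easy to reproduce in a short simulation (a generic numpy sketch with made-up sizes, not the authors' IPL correction): when every candidate classifier truly performs at chance, reporting the smallest of many cross-validation error rates is optimistically biased.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_methods, n_sims = 40, 10, 2000  # hypothetical sizes

# True error of every candidate method is 0.5 (labels are pure noise),
# so each method's cross-validation estimate is Binomial(n, 0.5) / n.
cv_errors = rng.binomial(n, 0.5, size=(n_sims, n_methods)) / n

single_estimate = cv_errors[:, 0].mean()          # one pre-chosen method: unbiased
selected_estimate = cv_errors.min(axis=1).mean()  # "best of ten": biased low

print(f"one method:  {single_estimate:.3f}")
print(f"best of 10:  {selected_estimate:.3f}")
```

With n = 40 the minimum over ten methods lands well below the true 0.5, which is exactly the falsely small error rate the abstract warns about for moderate sample sizes.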
English, Devin; Bowleg, Lisa; del Río-González, Ana Maria; Tschann, Jeanne M.; Agans, Robert; Malebranche, David J
2017-01-01
Objectives Although social science research has examined police and law enforcement-perpetrated discrimination against Black men using policing statistics and implicit bias studies, there is little quantitative evidence detailing this phenomenon from the perspective of Black men. Consequently, there is a dearth of research detailing how Black men’s perspectives on police and law enforcement-related stress predict negative physiological and psychological health outcomes. This study addresses these gaps with the qualitative development and quantitative test of the Police and Law Enforcement (PLE) scale. Methods In Study 1, we employed thematic analysis on transcripts of individual qualitative interviews with 90 Black men to assess key themes and concepts and develop quantitative items. In Study 2, we used 2 focus groups comprised of 5 Black men each (n=10), intensive cognitive interviewing with a separate sample of Black men (n=15), and piloting with another sample of Black men (n=13) to assess the ecological validity of the quantitative items. For Study 3, we analyzed data from a sample of 633 Black men between the ages of 18 and 65 to test the factor structure of the PLE, as well as its concurrent validity and convergent/discriminant validity. Results Qualitative analyses and confirmatory factor analyses suggested that a 5-item, 1-factor measure appropriately represented respondents’ experiences of police/law enforcement discrimination. As hypothesized, the PLE was positively associated with measures of racial discrimination and depressive symptoms. Conclusions Preliminary evidence suggests that the PLE is a reliable and valid measure of Black men’s experiences of discrimination with police/law enforcement. PMID:28080104
English, Devin; Bowleg, Lisa; Del Río-González, Ana Maria; Tschann, Jeanne M; Agans, Robert P; Malebranche, David J
2017-04-01
Although social science research has examined police and law enforcement-perpetrated discrimination against Black men using policing statistics and implicit bias studies, there is little quantitative evidence detailing this phenomenon from the perspective of Black men. Consequently, there is a dearth of research detailing how Black men's perspectives on police and law enforcement-related stress predict negative physiological and psychological health outcomes. This study addresses these gaps with the qualitative development and quantitative test of the Police and Law Enforcement (PLE) Scale. In Study 1, we used thematic analysis on transcripts of individual qualitative interviews with 90 Black men to assess key themes and concepts and develop quantitative items. In Study 2, we used 2 focus groups comprised of 5 Black men each (n = 10), intensive cognitive interviewing with a separate sample of Black men (n = 15), and piloting with another sample of Black men (n = 13) to assess the ecological validity of the quantitative items. For Study 3, we analyzed data from a sample of 633 Black men between the ages of 18 and 65 to test the factor structure of the PLE, as well as its concurrent validity and convergent/discriminant validity. Qualitative analyses and confirmatory factor analyses suggested that a 5-item, 1-factor measure appropriately represented respondents' experiences of police/law enforcement discrimination. As hypothesized, the PLE was positively associated with measures of racial discrimination and depressive symptoms. Preliminary evidence suggests that the PLE is a reliable and valid measure of Black men's experiences of discrimination with police/law enforcement. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Parastar, Hadi; Mostafapour, Sara; Azimi, Gholamhasan
2016-01-01
Comprehensive two-dimensional gas chromatography and flame ionization detection combined with unfolded-partial least squares is proposed as a simple, fast and reliable method to assess the quality of gasoline and to detect its potential adulterants. The data for the calibration set are first baseline corrected using a two-dimensional asymmetric least squares algorithm. The number of significant partial least squares components used to build the model is determined by the minimum of the root-mean-square error of leave-one-out cross-validation, which occurred at four components. In this regard, blends of gasoline with kerosene, white spirit and paint thinner as frequently used adulterants are used to make calibration samples. Appropriate statistical parameters of regression coefficient of 0.996-0.998, root-mean-square error of prediction of 0.005-0.010 and relative error of prediction of 1.54-3.82% for the calibration set show the reliability of the developed method. In addition, the developed method is externally validated with three samples in the validation set (with a relative error of prediction below 10.0%). Finally, to test the applicability of the proposed strategy for the analysis of real samples, five real gasoline samples collected from gas stations are used for this purpose and the gasoline proportions were in the range of 70-85%. Also, the relative standard deviations were below 8.5% for different samples in the prediction set. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
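The two prediction-error summaries quoted above, root-mean-square error of prediction (RMSEP) and relative error of prediction (REP), are simple to compute; the gasoline fractions below are made-up illustrative values, not the paper's data:

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def rep_percent(y_true, y_pred):
    """Relative error of prediction, as % of the mean reference value."""
    return 100.0 * rmsep(y_true, y_pred) / float(np.mean(y_true))

# hypothetical gasoline volume fractions: reference vs. model prediction
y_true = np.array([0.70, 0.80, 0.85])
y_pred = np.array([0.72, 0.79, 0.83])

print(rmsep(y_true, y_pred), rep_percent(y_true, y_pred))
```

In calibration work these are the standard figures of merit for comparing models, alongside the regression coefficient reported in the abstract.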
Lotan, Tamara L.; Wei, Wei; Morais, Carlos L.; Hawley, Sarah T.; Fazli, Ladan; Hurtado-Coll, Antonio; Troyer, Dean; McKenney, Jesse K.; Simko, Jeffrey; Carroll, Peter R.; Gleave, Martin; Lance, Raymond; Lin, Daniel W.; Nelson, Peter S.; Thompson, Ian M.; True, Lawrence D.; Feng, Ziding; Brooks, James D.
2015-01-01
Background PTEN is the most commonly deleted tumor suppressor gene in primary prostate cancer (PCa) and its loss is associated with poor clinical outcomes and ERG gene rearrangement. Objective We tested whether PTEN loss is associated with shorter recurrence-free survival (RFS) in surgically treated PCa patients with known ERG status. Design, setting, and participants A genetically validated, automated PTEN immunohistochemistry (IHC) protocol was used for 1275 primary prostate tumors from the Canary Foundation retrospective PCa tissue microarray cohort to assess homogeneous (in all tumor tissue sampled) or heterogeneous (in a subset of tumor tissue sampled) PTEN loss. ERG status as determined by a genetically validated IHC assay was available for a subset of 938 tumors. Outcome measurements and statistical analysis Associations between PTEN and ERG status were assessed using Fisher’s exact test. Kaplan-Meier and multivariate weighted Cox proportional models for RFS were constructed. Results and limitations When compared to intact PTEN, homogeneous (hazard ratio [HR] 1.66, p = 0.001) but not heterogeneous (HR 1.24, p = 0.14) PTEN loss was significantly associated with shorter RFS in multivariate models. Among ERG-positive tumors, homogeneous (HR 3.07, p < 0.0001) but not heterogeneous (HR 1.46, p = 0.10) PTEN loss was significantly associated with shorter RFS. Among ERG-negative tumors, PTEN did not reach significance for inclusion in the final multivariate models. The interaction term for PTEN and ERG status with respect to RFS did not reach statistical significance (p = 0.11) for the current sample size. Conclusions These data suggest that PTEN is a useful prognostic biomarker and that there is no statistically significant interaction between PTEN and ERG status for RFS. 
Patient summary We found that loss of the PTEN tumor suppressor gene in prostate tumors as assessed by tissue staining is correlated with shorter time to prostate cancer recurrence after radical prostatectomy. PMID:27617307
Gerber, Madelyn M.; Hampel, Heather; Schulz, Nathan P.; Fernandez, Soledad; Wei, Lai; Zhou, Xiao-Ping; de la Chapelle, Albert; Toland, Amanda Ewart
2012-01-01
Background Tumors frequently exhibit loss of tumor suppressor genes or allelic gains of activated oncogenes. A significant proportion of cancer susceptibility loci in the mouse show somatic losses or gains consistent with the presence of a tumor susceptibility or resistance allele. Thus, allele-specific somatic gains or losses at loci may demarcate the presence of resistance or susceptibility alleles. The goal of this study was to determine if previously mapped susceptibility loci for colorectal cancer show evidence of allele-specific somatic events in colon tumors. Methods We performed quantitative genotyping of 16 single nucleotide polymorphisms (SNPs) showing statistically significant association with colorectal cancer in published genome-wide association studies (GWAS). We genotyped 194 paired normal and colorectal tumor DNA samples and 296 paired validation samples to investigate these SNPs for allele-specific somatic gains and losses. We combined analysis of our data with published data for seven of these SNPs. Results No statistically significant evidence for allele-specific somatic selection was observed for the tested polymorphisms in the discovery set. The rs6983267 variant, which has shown preferential loss of the non-risk T allele and relative gain of the risk G allele in previous studies, favored relative gain of the G allele in the combined discovery and validation samples (corrected p-value = 0.03). When we combined our data with published allele-specific imbalance data for this SNP, the G allele of rs6983267 showed statistically significant evidence of relative retention (p-value = 2.06×10−4). Conclusions Our results suggest that the majority of variants identified as colon cancer susceptibility alleles through GWAS do not exhibit somatic allele-specific imbalance in colon tumors. Our data confirm previously published results showing allele-specific imbalance for rs6983267. 
These results indicate that allele-specific imbalance of cancer susceptibility alleles may not be a common phenomenon in colon cancer. PMID:22629442
Xu, Yifan; Sun, Jiayang; Carter, Rebecca R; Bogie, Kath M
2014-05-01
Stereophotogrammetric digital imaging enables rapid and accurate detailed 3D wound monitoring. This rich data source was used to develop a statistically validated model to provide personalized predictive healing information for chronic wounds. 147 valid wound images were obtained from a sample of 13 category III/IV pressure ulcers from 10 individuals with spinal cord injury. Statistical comparison of several models indicated the best fit for the clinical data was a personalized mixed-effects exponential model (pMEE), with initial wound size and time as predictors and observed wound size as the response variable. Random effects capture personalized differences. Other models are only valid when wound size constantly decreases. This is often not achieved for clinical wounds. Our model accommodates this reality. Two criteria to determine effective healing time outcomes are proposed: r-fold wound size reduction time, t(r-fold), is defined as the time when wound size reduces to 1/r of initial size. t(δ) is defined as the time when the rate of the wound healing/size change reduces to a predetermined threshold δ < 0. Healing rate differs from patient to patient. Model development and validation indicates that accurate monitoring of wound geometry can adaptively predict healing progression and that larger wounds heal more rapidly. Accuracy of the prediction curve in the current model improves with each additional evaluation. Routine assessment of wounds using detailed stereophotogrammetric imaging can provide personalized predictions of wound healing time. Application of a valid model will help the clinical team to determine wound management care pathways. Published by Elsevier Ltd.
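For a simple exponential decay w(t) = w0·exp(−kt), the fixed-effects core of the personalized mixed-effects model described above (the per-patient random effects are omitted here), the r-fold reduction time has the closed form t(r-fold) = ln(r)/k. A minimal numpy sketch with hypothetical wound measurements:

```python
import numpy as np

# Hypothetical wound-area series (cm^2) over four weeks of follow-up.
t = np.array([0.0, 7, 14, 21, 28])        # days since first assessment
w = np.array([10.0, 7.4, 5.6, 4.1, 3.0])  # observed wound size

# Log-linear least-squares fit of w(t) = w0 * exp(-k t)
slope, intercept = np.polyfit(t, np.log(w), 1)
k, w0 = -slope, np.exp(intercept)

# r-fold reduction time: w(t) = w0 / r  =>  t = ln(r) / k
t_2fold = np.log(2) / k
print(f"k = {k:.4f}/day, 2-fold reduction after {t_2fold:.1f} days")
```

Each additional wound evaluation tightens the fit of k, which mirrors the abstract's point that the prediction curve improves with every new assessment.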
Jean-Pierre, Pascal; Fundakowski, Christopher; Perez, Enrique; Jean-Pierre, Shadae E; Jean-Pierre, Ashley R; Melillo, Angelica B; Libby, Rachel; Sargi, Zoukaa
2013-02-01
Cancer and its treatments are associated with psychological distress that can negatively impact self-perception, psychosocial functioning, and quality of life. Patients with head and neck cancers (HNC) are particularly susceptible to psychological distress. This study involved a cross-validation of the Measure of Body Apperception (MBA) for HNC patients. One hundred and twenty-two English-fluent HNC patients between 20 and 88 years of age completed the MBA on a Likert scale ranging from "1 = disagree" to "4 = agree." We assessed the latent structure and internal consistency reliability of the MBA using Principal Components Analysis (PCA) and Cronbach's coefficient alpha (α), respectively. We determined convergent and divergent validities of the MBA using correlations with the Hospital Anxiety and Depression Scale (HADS), observer disfigurement rating, and patients' clinical and demographic variables. The PCA revealed a coherent set of items that explained 38% of the variance. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.73 and Bartlett's test of sphericity was statistically significant (χ²(28) = 253.64; p < 0.001), confirming the suitability of the data for dimension reduction analysis. The MBA had good internal consistency reliability (α = 0.77) and demonstrated adequate convergent and divergent validities based on statistically significant moderate correlations with the HADS (p < 0.01) and observer rating of disfigurement (p < 0.026) and nonstatistically significant correlations with patients' clinical and demographic variables: tumor location, age at diagnosis, and birth place (all ps > 0.05). The MBA is a valid and reliable screening measure of body apperception for HNC patients.
Sales, C; Cervera, M I; Gil, R; Portolés, T; Pitarch, E; Beltran, J
2017-02-01
The novel atmospheric pressure chemical ionization (APCI) source has been used in combination with gas chromatography (GC) coupled to hybrid quadrupole time-of-flight (QTOF) mass spectrometry (MS) for determination of volatile components of olive oil, enhancing its potential for classification of olive oil samples according to their quality using a metabolomics-based approach. The full-spectrum acquisition has allowed the detection of volatile organic compounds (VOCs) in olive oil samples, including Extra Virgin, Virgin and Lampante qualities. A dynamic headspace extraction with cartridge solvent elution was applied. The metabolomics strategy consisted of three different steps: a full mass spectral alignment of GC-MS data using MzMine 2.0, a multivariate analysis using Ez-Info and the creation of the statistical model with combinations of responses for molecular fragments. The model was finally validated using blind samples, obtaining an accuracy in oil classification of 70%, taking the official established method, "PANEL TEST", as reference. Copyright © 2016 Elsevier Ltd. All rights reserved.
Sequential Tests of Multiple Hypotheses Controlling Type I and II Familywise Error Rates
Bartroff, Jay; Song, Jinlin
2014-01-01
This paper addresses the following general scenario: A scientist wishes to perform a battery of experiments, each generating a sequential stream of data, to investigate some phenomenon. The scientist would like to control the overall error rate in order to draw statistically-valid conclusions from each experiment, while being as efficient as possible. The between-stream data may differ in distribution and dimension but also may be highly correlated, even duplicated exactly in some cases. Treating each experiment as a hypothesis test and adopting the familywise error rate (FWER) metric, we give a procedure that sequentially tests each hypothesis while controlling both the type I and II FWERs regardless of the between-stream correlation, and only requires arbitrary sequential test statistics that control the error rates for a given stream in isolation. The proposed procedure, which we call the sequential Holm procedure because of its inspiration from Holm’s (1979) seminal fixed-sample procedure, shows simultaneous savings in expected sample size and less conservative error control relative to fixed sample, sequential Bonferroni, and other recently proposed sequential procedures in a simulation study. PMID:25092948
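Holm's fixed-sample step-down procedure, which inspired the sequential version above, takes only a few lines (a generic sketch of the 1979 procedure, not the paper's sequential algorithm):

```python
def holm_reject(pvals, alpha=0.05):
    """Holm (1979) step-down procedure: controls the FWER at alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending p-values
    reject = [False] * m
    for step, i in enumerate(order):
        if pvals[i] <= alpha / (m - step):  # threshold relaxes at each step
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

print(holm_reject([0.001, 0.010, 0.040, 0.200]))
```

The step-down structure is what makes Holm uniformly less conservative than Bonferroni while still controlling the type I familywise error rate, which the paper extends to streams of sequential test statistics.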
Convergent validity of alternative MMPI-2 personality disorder scales.
Hicklin, J; Widiger, T A
2000-12-01
The Morey, Waugh, and Blashfield (1985) MMPI (Hathaway et al., 1989) personality disorder scales provided a significant contribution to personality disorder research and assessment. However, the subsequent revisions to the MMPI and the multiple revisions to the diagnostic criteria sets that have since occurred may have justified comparable revisions to these scales. Somwaru and Ben-Porath (1995) selected a substantially different set of items from the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) to assess Diagnostic and Statistical Manual of Mental Disorders (4th ed.; American Psychiatric Association, 1994) personality disorder diagnostic criteria. In our study, we compared the convergent validity of these alternative MMPI-2 personality disorder scales with respect to 3 self-report measures of personality disorder symptomatology in a sample of 82 psychiatric outpatients. The results suggested that Somwaru and Ben-Porath's scales are as valid as the original Morey et al. scales and might be even more valid for the assessment of borderline, antisocial, and schizoid personality disorder symptomatology.
Validating two questions in the Force Concept Inventory with subquestions
NASA Astrophysics Data System (ADS)
Yasuda, Jun-ichiro; Taniguchi, Masa-aki
2013-06-01
In this study, we evaluate the structural validity of Q.16 and Q.7 in the Force Concept Inventory (FCI). We address whether respondents who answer Q.16 and Q.7 correctly actually have an understanding of the concepts of physics tested in the questions. To examine respondents’ levels of understanding, we use subquestions that test them on concepts believed to be required to answer the actual FCI questions. Our sample size comprises 111 respondents; we derive false-positive ratios for prelearners and postlearners and then statistically test the difference between them. We find a difference at the 0.05 significance level for both Q.16 and Q.7, implying that it is possible for postlearners to answer both questions without an understanding of the concepts of physics tested in the questions; therefore, the structures of Q.16 and Q.7 are invalid. In this study, we only evaluate the validity of these two FCI questions; we do not assess the validity of previous studies that have compared total FCI scores.
Zietze, Stefan; Müller, Rainer H; Brecht, René
2008-03-01
In order to set up a batch-to-batch-consistency analytical scheme for N-glycosylation analysis, several sample preparation steps, including enzyme digestions and fluorophore labelling, and two HPLC methods were established. The whole method scheme was standardized, evaluated and validated according to the requirements for analytical testing in early clinical drug development, using a recombinantly produced reference glycoprotein (RGP). The standardization of the methods was performed through clearly defined standard operating procedures. During evaluation of the methods, the main focus was on determining the loss of oligosaccharides within the analytical scheme. Validation of the methods was performed with respect to specificity, linearity, repeatability, LOD and LOQ. Because reference N-glycan standards were not available, a statistical approach was chosen to derive accuracy from the linearity data. After finishing the validation procedure, defined limits for method variability could be calculated, and differences observed in consistency analysis could be separated into significant and incidental ones.
Suminski, Richard R; Robertson, Robert J; Goss, Fredric L; Olvera, Norma
2008-08-01
Whether the translation of verbal descriptors from English to Spanish affects the validity of the Children's OMNI Scale of Perceived Exertion is not known, so the validity of a Spanish version of the OMNI was examined with 32 boys and 36 girls (9 to 12 years old) for whom Spanish was the primary language. Oxygen consumption, ventilation, respiratory rate, respiratory exchange ratio, heart rate, and ratings of perceived exertion for the overall body (RPE-O) were measured during an incremental treadmill test. All response values displayed significant linear increases across test stages. The linear regression analyses indicated RPE-O values were distributed as positive linear functions of oxygen consumption, ventilation, respiratory rate, respiratory exchange ratio, heart rate, and percent of maximal oxygen consumption. All regression models were statistically significant. The Spanish OMNI Scale is valid for estimating exercise effort during walking and running amongst Hispanic youth whose primary language is Spanish.
Saunders, Ruth P.; McIver, Kerry L.; Dowda, Marsha; Pate, Russell R.
2013-01-01
Objective Scales used to measure selected social-cognitive beliefs and motives for physical activity were tested among boys and girls. Methods Covariance modeling was applied to responses obtained from large multi-ethnic samples of students in the fifth and sixth grades. Results Theoretically and statistically sound models were developed, supporting the factorial validity of the scales in all groups. Multi-group longitudinal invariance was confirmed between boys and girls, overweight and normal weight students, and non-Hispanic black and white children. The construct validity of the scales was supported by hypothesized convergent and discriminant relationships within a measurement model that included correlations with physical activity (MET • min/day) measured by an accelerometer. Conclusions Scores from the scales provide valid assessments of selected beliefs and motives that are putative mediators of change in physical activity among boys and girls, as they begin the understudied transition from the fifth grade into middle school, when physical activity naturally declines. PMID:23459310
Dishman, Rod K; Saunders, Ruth P; McIver, Kerry L; Dowda, Marsha; Pate, Russell R
2013-06-01
Scales used to measure selected social-cognitive beliefs and motives for physical activity were tested among boys and girls. Covariance modeling was applied to responses obtained from large multi-ethnic samples of students in the fifth and sixth grades. Theoretically and statistically sound models were developed, supporting the factorial validity of the scales in all groups. Multi-group longitudinal invariance was confirmed between boys and girls, overweight and normal weight students, and non-Hispanic black and white children. The construct validity of the scales was supported by hypothesized convergent and discriminant relationships within a measurement model that included correlations with physical activity (MET • min/day) measured by an accelerometer. Scores from the scales provide valid assessments of selected beliefs and motives that are putative mediators of change in physical activity among boys and girls, as they begin the understudied transition from the fifth grade into middle school, when physical activity naturally declines.
The Outcome and Assessment Information Set (OASIS): A Review of Validity and Reliability
O’CONNOR, MELISSA; DAVITT, JOAN K.
2015-01-01
The Outcome and Assessment Information Set (OASIS) is the patient-specific, standardized assessment used in Medicare home health care to plan care, determine reimbursement, and measure quality. Since its inception in 1999, there has been debate over the reliability and validity of the OASIS as a research tool and outcome measure. A systematic literature review of English-language articles identified 12 studies published in the last 10 years examining the validity and reliability of the OASIS. Empirical findings indicate the validity and reliability of the OASIS range from low to moderate but vary depending on the item studied. Limitations in the existing research include: nonrepresentative samples; inconsistencies in methods used, items tested, measurement, and statistical procedures; and the changes to the OASIS itself over time. The inconsistencies suggest that these results are tentative at best; additional research is needed to confirm the value of the OASIS for measuring patient outcomes, research, and quality improvement. PMID:23216513
NASA Astrophysics Data System (ADS)
Çalik, Muammer; Coll, Richard Kevin
2012-08-01
In this paper, we describe the Scientific Habits of Mind Survey (SHOMS) developed to explore public, science teachers', and scientists' understanding of habits of mind (HoM). The instrument contains 59 items and captures the seven scientific habits of mind identified by Gauld. The SHOMS was validated by administration to two cohorts of pre-service science teachers: primary science teachers with little science background or interest (n = 145), and secondary school science teachers (who also were science graduates) with stronger science knowledge (n = 145). Face validity was confirmed by the use of a panel of experts and a pilot study employing participants similar in demographics to the intended sample. To confirm convergent and discriminant validity, confirmatory factor analysis was conducted and reliability was evaluated. Statistical data and other data gathered from interviews suggest that the SHOMS will prove to be a useful tool for educators and researchers who wish to investigate HoM for a variety of participants.
DOT National Transportation Integrated Search
1979-03-01
There are several conditions that can influence the calculation of the statistical validity of a test battery such as that used to selected Air Traffic Control Specialists. Two conditions of prime importance to statistical validity are recruitment pr...
Data splitting for artificial neural networks using SOM-based stratified sampling.
May, R J; Maier, H R; Dandy, G C
2010-03-01
Data splitting is an important consideration during artificial neural network (ANN) development where hold-out cross-validation is commonly employed to ensure generalization. Even for a moderate sample size, the sampling methodology used for data splitting can have a significant effect on the quality of the subsets used for training, testing and validating an ANN. Poor data splitting can result in inaccurate and highly variable model performance; however, the choice of sampling methodology is rarely given due consideration by ANN modellers. Increased confidence in the sampling is of paramount importance, since the hold-out sampling is generally performed only once during ANN development. This paper considers the variability in the quality of subsets that are obtained using different data splitting approaches. A novel approach to stratified sampling, based on Neyman sampling of the self-organizing map (SOM), is developed, with several guidelines identified for setting the SOM size and sample allocation in order to minimize the bias and variance in the datasets. Using an example ANN function approximation task, the SOM-based approach is evaluated in comparison to random sampling, DUPLEX, systematic stratified sampling, and trial-and-error sampling to minimize the statistical differences between data sets. Of these approaches, DUPLEX is found to provide benchmark performance with good model performance, with no variability. The results show that the SOM-based approach also reliably generates high-quality samples and can therefore be used with greater confidence than other approaches, especially in the case of non-uniform datasets, with the benefit of scalability to perform data splitting on large datasets. Copyright 2009 Elsevier Ltd. All rights reserved.
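Neyman allocation, the sampling rule underlying the SOM-based splitter described above, assigns draws to a stratum in proportion to its size times its within-stratum standard deviation. A small numpy sketch with hypothetical strata (here standing in for SOM clusters):

```python
import numpy as np

def neyman_allocation(strata_sizes, strata_stds, n_total):
    """Neyman allocation: n_h proportional to N_h * sigma_h."""
    weights = np.asarray(strata_sizes, float) * np.asarray(strata_stds, float)
    alloc = np.floor(n_total * weights / weights.sum()).astype(int)
    # hand out any remaining samples by largest fractional remainder
    while alloc.sum() < n_total:
        frac = n_total * weights / weights.sum() - alloc
        alloc[np.argmax(frac)] += 1
    return alloc

# three hypothetical strata: sizes and within-stratum standard deviations
alloc = neyman_allocation([100, 50, 50], [1.0, 4.0, 1.0], n_total=30)
print(alloc)  # the small but highly variable stratum gets the most samples
```

Compared with proportional sampling, this concentrates the sampling budget where the data vary most, which is why it reduces the variance of the resulting train/test/validation subsets.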
Wang, Ling-jia; Kissler, Hermann J; Wang, Xiaojun; Cochet, Olivia; Krzystyniak, Adam; Misawa, Ryosuke; Golab, Karolina; Tibudan, Martin; Grzanka, Jakub; Savari, Omid; Grose, Randall; Kaufman, Dixon B; Millis, Michael; Witkowski, Piotr
2015-01-01
Pancreatic islet mass, represented by islet equivalent (IEQ), is the most important parameter in decision making for clinical islet transplantation. To obtain IEQ, the sample of islets is routinely counted manually under a microscope and discarded thereafter. Islet purity, another parameter in islet processing, is routinely acquired by estimation only. In this study, we validated our digital image analysis (DIA) system, developed using the Image Pro Plus software, for islet mass and purity assessment. Application of the DIA allows better compliance with current good manufacturing practice (cGMP) standards. Human islet samples were captured as calibrated digital images for the permanent record. Five trained technicians participated in determination of IEQ and purity by the manual counting method and by DIA. IEQ count showed statistically significant correlations between the manual method and DIA in all sample comparisons (r > 0.819 and p < 0.0001). A statistically significant difference in IEQ between the two methods was found only in the High purity 100 μL sample group (p = 0.029). As for purity determination, statistically significant differences between manual assessment and DIA measurement were found in the High and Low purity 100 μL samples (p < 0.005). In addition, islet particle number (IPN) and the IEQ/IPN ratio did not differ statistically between the manual counting method and DIA. In conclusion, the DIA used in this study is a reliable technique for determining IEQ and purity. Islet samples preserved as digital images and results produced by DIA can be permanently stored for verification, technical training and islet information exchange between different islet centers. Therefore, DIA complies better with cGMP requirements than the manual counting method. We propose DIA as a quality control tool to supplement the established standard manual method for islet counting and purity estimation. PMID:24806436
Papadakaki, Maria; Prokopiadou, Dimitra; Petridou, Eleni; Kogevinas, Manolis; Lionis, Christos
2012-06-01
The current article aims to translate the PREMIS (Physician Readiness to Manage Intimate Partner Violence) survey into the Greek language and test its validity and reliability in a sample of primary care physicians. The validation study was conducted in 2010 and involved all the general practitioners serving two adjacent prefectures of Greece (n = 80). Maximum-likelihood factor analysis (MLF) was used to extract key survey factors. The instrument was further assessed for the following psychometric properties: (a) scale reliability, (b) item-specific reliability, (c) test-retest reliability, (d) scale construct validity, and (e) internal predictive validity. The MLF analysis of 23 opinion items revealed a seven-factor solution (preparation, constraint, workplace issues, screening, self-efficacy, alcohol/drugs, victim understanding), which was statistically sound (p = .293). Most of the newly derived scales displayed satisfactory internal consistency (α ≥ .60), high item-specific reliability, strong construct validity and internal predictive validity (F = 2.82; p = .004), and high repeatability when retested with 20 individuals (intraclass correlation coefficient [ICC] > .70). The tool was found appropriate to facilitate the identification of competence deficits and the evaluation of training initiatives.
Liu, Xuan; Ramella-Roman, Jessica C.; Huang, Yong; Guo, Yuan; Kang, Jin U.
2013-01-01
In this study, we proposed a generic speckle simulation for the optical coherence tomography (OCT) signal, obtained by convolving the point spread function (PSF) of the OCT system with a numerically synthesized random sample field. We validated our model and used the simulation method to study the statistical properties of cross-correlation coefficients (XCC) between A-scans, which have recently been applied to transverse motion analysis by our group. The simulation results show that oversampling is essential for accurate motion tracking; that exponential decay of the OCT signal leads to an underestimate of motion, which can be corrected; and that lateral heterogeneity of the sample leads to an overestimate of motion for the few pixels corresponding to structural boundaries. PMID:23456001
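The simulation approach described above, convolving a system PSF with a random scatterer field and then correlating neighboring A-scans, can be sketched in a few lines. This is a minimal illustration under assumed parameters (grid size, Gaussian PSF width, delta-correlated complex field), not the study's implementation:

```python
# Minimal sketch of the speckle model described above: an OCT image is
# simulated as |PSF * random scatterer field|, and the cross-correlation
# coefficient (XCC) between A-scans (columns) is computed. Grid sizes and
# the Gaussian PSF are illustrative assumptions, not the study's values.
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)

# Delta-correlated complex scatterer field on a depth x lateral grid
field = rng.normal(size=(256, 64)) + 1j * rng.normal(size=(256, 64))

# Separable Gaussian PSF (axial and lateral widths are assumptions)
z = np.arange(-8, 9)[:, None]
x = np.arange(-8, 9)[None, :]
psf = np.exp(-(z / 3.0) ** 2 - (x / 3.0) ** 2)

# Speckle-carrying OCT magnitude signal
signal = np.abs(fftconvolve(field, psf, mode="same"))

def xcc(a, b):
    """Normalized cross-correlation coefficient between two A-scans."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a**2).sum() * (b**2).sum()))

# Adjacent (oversampled) A-scans stay correlated; distant ones do not
print(xcc(signal[:, 32], signal[:, 33]), xcc(signal[:, 32], signal[:, 60]))
```

Sweeping the column offset traces out a decorrelation curve; the decay of XCC with transverse displacement is what encodes motion in this kind of analysis, which is why oversampling (a PSF wider than the A-scan spacing) matters.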
Bryan, Craig J; David Rudd, M; Wertenberger, Evelyn; Etienne, Neysa; Ray-Sannerud, Bobbie N; Morrow, Chad E; Peterson, Alan L; Young-McCaughon, Stacey
2014-04-01
Newer approaches for understanding suicidal behavior suggest the assessment of suicide-specific beliefs and cognitions may improve the detection and prediction of suicidal thoughts and behaviors. The Suicide Cognitions Scale (SCS) was developed to measure suicide-specific beliefs, but it has not been tested in a military setting. Data were analyzed from two separate studies conducted at three military mental health clinics (one U.S. Army, two U.S. Air Force). Participants included 175 active duty Army personnel with acute suicidal ideation and/or a recent suicide attempt referred for a treatment study (Sample 1) and 151 active duty Air Force personnel receiving routine outpatient mental health care (Sample 2). In both samples, participants completed self-report measures and clinician-administered interviews. Follow-up suicide attempts were assessed via clinician-administered interview for Sample 1. Statistical analyses included confirmatory factor analysis, between-group comparisons by history of suicidality, and generalized regression modeling. Two latent factors were confirmed for the SCS: Unloveability and Unbearability. Each demonstrated good internal consistency, convergent validity, and divergent validity. Both scales significantly predicted current suicidal ideation (βs > 0.316, ps < 0.002) and significantly differentiated suicide attempts from nonsuicidal self-injury and control groups (F(6, 286) = 9.801, p < 0.001). Both scales significantly predicted future suicide attempts (AORs > 1.07, ps < 0.050) better than other risk factors. Limitations include self-report methodology, small sample sizes, and predominantly male samples. The SCS is a reliable and valid measure that predicts suicidal ideation and suicide attempts among military personnel better than other well-established risk factors. Copyright © 2014 Elsevier B.V. All rights reserved.
Limberg, Brian J; Johnstone, Kevin; Filloon, Thomas; Catrenich, Carl
2016-09-01
Using United States Pharmacopeia-National Formulary (USP-NF) general method <1223> guidance, the Soleris® automated system and reagents (Nonfermenting Total Viable Count for bacteria and Direct Yeast and Mold for yeast and mold) were validated, using a performance equivalence approach, as an alternative to plate counting for total microbial content analysis using five representative microbes: Staphylococcus aureus, Bacillus subtilis, Pseudomonas aeruginosa, Candida albicans, and Aspergillus brasiliensis. Detection times (DTs) in the alternative automated system were linearly correlated to CFU/sample (R² = 0.94-0.97) with ≥70% accuracy per USP General Chapter <1223> guidance. The LOD and LOQ of the automated system were statistically similar to those of the traditional plate count method. This system was significantly more precise than plate counting (RSD 1.2-2.9% for DT, 7.8-40.6% for plate counts), was statistically comparable to plate counting with respect to variations in analyst, vial lots, and instruments, and was robust when variations in the operating detection thresholds (dTs; ±2 units) were used. The automated system produced accurate results, was more precise and less labor-intensive, and met or exceeded criteria for a valid alternative quantitative method, consistent with USP-NF general method <1223> guidance.
Onwujekwe, Obinna; Fox-Rushby, Julia; Hanson, Kara
2008-01-01
This study examines whether making question formats better fit the cultural context of markets would improve the construct validity of estimates of willingness to pay (WTP). WTP for insecticide-treated mosquito nets was elicited using the bidding game, binary with follow-up (BWFU), and a novel structured haggling technique (SH) that mimicked price taking in market places in the study area. The results show that different question formats generated different distributions of WTP. Following a comparison of alternative models for each question format, construct validity was compared using the most consistently appropriate model across question formats for the positive WTP values, in this case, ordinary least squares. Three criteria (the number of statistically significant explanatory variables that had the anticipated sign, the value of the adjusted R², and the proportion that were statistically significant with the anticipated sign) used to assess the relative performance of each question format indicated that SH performed best and BWFU worst. However, differences in the levels of income, education, and percentage of household heads responding to the different question formats across the samples complicate this conclusion. Hence, the results suggest that the SH technique is worthy of further investigation and use.
Shakeri, Mohammad-Taghi; Taghipour, Ali; Sadeghi, Masoumeh; Nezami, Hossein; Amirabadizadeh, Ali-Reza; Bonakchi, Hossein
2017-01-01
Background: Writing, designing, and conducting a clinical trial research proposal plays an important role in achieving valid and reliable findings. Thus, this study aimed at critically appraising fundamental information in approved clinical trial research proposals at Mashhad University of Medical Sciences (MUMS) from 2008 to 2014. Methods: This cross-sectional study was conducted on all 935 approved clinical trial research proposals at MUMS from 2008 to 2014. A valid, reliable, comprehensive, simple, and usable checklist consisting of 11 main items, developed in sessions with biostatisticians and methodologists, was used as the research tool. The agreement rate between the reviewers of the proposals, who were responsible for data collection, was assessed during 3 sessions, and the kappa statistic calculated at the last session was 97%. Results: More than 60% of the research proposals had a methodologist consultant; moreover, the type of study or study design had been specified in almost all of them (98%). Appropriateness of study aims with hypotheses was not observed in a significant number of research proposals (585 proposals, 62.6%). The required sample size for 66.8% of the approved proposals was based on a sample size formula; however, in 25% of the proposals, the sample size formula was not in accordance with the study design. The data collection tool was not selected appropriately in 55.2% of the approved research proposals. The type and method of randomization were unknown in 21% of the proposals, and dealing with missing data had not been described in most of them (98%). Inclusion and exclusion criteria were fully and adequately explained in 92% of the proposals. Moreover, 44% and 31% of the research proposals were moderate and weak in rank, respectively, with respect to the correctness of the statistical analysis methods.
Conclusion: Findings of the present study revealed that a large portion of the approved proposals were highly biased or ambiguous with respect to randomization, blinding, dealing with missing data, data collection tool, sampling methods, and statistical analysis. Thus, it is essential to consult and collaborate with a methodologist in all parts of a proposal to control the possible and specific biases in clinical trials. PMID:29445703
Preparing for the first meeting with a statistician.
De Muth, James E
2008-12-15
Practical statistical issues that should be considered when performing data collection and analysis are reviewed. The meeting with a statistician should take place early in the research development before any study data are collected. The process of statistical analysis involves establishing the research question, formulating a hypothesis, selecting an appropriate test, sampling correctly, collecting data, performing tests, and making decisions. Once the objectives are established, the researcher can determine the characteristics or demographics of the individuals required for the study, how to recruit volunteers, what type of data are needed to answer the research question(s), and the best methods for collecting the required information. There are two general types of statistics: descriptive and inferential. Presenting data in a more palatable format for the reader is called descriptive statistics. Inferential statistics involve making an inference or decision about a population based on results obtained from a sample of that population. In order for the results of a statistical test to be valid, the sample should be representative of the population from which it is drawn. When collecting information about volunteers, researchers should only collect information that is directly related to the study objectives. Important information that a statistician will require first is an understanding of the type of variables involved in the study and which variables can be controlled by researchers and which are beyond their control. Data can be presented in one of four different measurement scales: nominal, ordinal, interval, or ratio. Hypothesis testing involves two mutually exclusive and exhaustive statements related to the research question. Statisticians should not be replaced by computer software, and they should be consulted before any research data are collected. 
When preparing to meet with a statistician, the pharmacist researcher should be familiar with the steps of statistical analysis and consider several questions related to the study to be conducted.
Sotardi, Valerie A
2018-05-01
Educational measures of anxiety focus heavily on students' experiences with tests yet overlook other assessment contexts. In this research, two brief multiscale questionnaires were developed and validated to measure trait evaluation anxiety (MTEA-12) and state evaluation anxiety (MSEA-12) for use in various assessment contexts in non-clinical, educational settings. The research included a cross-sectional analysis of self-report data using authentic assessment settings in which evaluation anxiety was measured. Instruments were tested using a validation sample of 241 first-year university students in New Zealand. Scale development included component structures for state and trait scales based on existing theoretical frameworks. Analyses using confirmatory factor analysis and descriptive statistics indicate that the scales are reliable and structurally valid. Multivariate general linear modeling using subscales from the MTEA-12, MSEA-12, and student grades suggests adequate criterion-related validity. Initial evidence of predictive validity was observed, with one relevant MTEA-12 factor explaining between 21% and 54% of the variance in three MSEA-12 factors. Results document the MTEA-12 and MSEA-12 as reliable measures of trait and state dimensions of evaluation anxiety for test and writing contexts. Initial estimates suggest that the scales have promising validity, and recommendations for further validation are outlined.
Tavoli, Azadeh; Melyani, Mahdiyeh; Bakhtiari, Maryam; Ghaedi, Gholam Hossein; Montazeri, Ali
2009-01-01
Background The Brief Fear of Negative Evaluation Scale (BFNE) is a commonly used instrument to measure social anxiety. This study aimed to translate the BFNE and to test its reliability and validity in Iran. Methods The English language version of the BFNE was translated into Persian (the Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 235 students with (n = 33, clinical group) and without social phobia (n = 202, non-clinical group). In addition to the BFNE, two standard instruments were used to measure social phobia severity: the Social Phobia Inventory (SPIN) and the Social Interaction Anxiety Scale (SIAS). All participants completed a brief background information questionnaire, the SPIN, the SIAS, and the BFNE. Statistical analysis was performed to test the reliability and validity of the BFNE. Results In all, 235 students were studied (111 male and 124 female). The mean age for the non-clinical group was 22.2 (SD = 2.1) years, and for the clinical sample it was 22.4 (SD = 1.8) years. Cronbach's alpha coefficient (to test reliability) was acceptable for both non-clinical and clinical samples (α = 0.90 and 0.82, respectively). In addition, 3-week test-retest reliability was assessed in the non-clinical sample, and the intraclass correlation coefficient (ICC) was quite high (ICC = 0.71). Validity, assessed through convergent and discriminant methods, showed satisfactory results. The questionnaire correlated well with established measures of social phobia such as the SPIN (r = 0.43, p < 0.001) and the SIAS (r = 0.54, p < 0.001). The BFNE also discriminated well between men and women with and without social phobia in the expected direction. Factor analysis supported a two-factor solution corresponding to positive and reverse-worded items. Conclusion This validation study of the Iranian version of the BFNE proved that it is an acceptable, reliable, and valid measure of social phobia.
However, since the scale showed a two-factor structure, which does not conform to the theoretical basis of the BFNE, we suggest the use of the BFNE-II when it becomes available in Iran. A validation study of the BFNE-II is in progress. PMID:19589161
Nateghi, Roshanak; Guikema, Seth D; Quiring, Steven M
2011-12-01
This article compares statistical methods for modeling power outage durations during hurricanes and examines the predictive accuracy of these methods. Being able to make accurate predictions of power outage durations is valuable because the information can be used by utility companies to plan their restoration efforts more efficiently. This information can also help inform customers and public agencies of the expected outage times, enabling better collective response planning, and coordination of restoration efforts for other critical infrastructures that depend on electricity. In the long run, outage duration estimates for future storm scenarios may help utilities and public agencies better allocate risk management resources to balance the disruption from hurricanes with the cost of hardening power systems. We compare the out-of-sample predictive accuracy of five distinct statistical models for estimating power outage duration times caused by Hurricane Ivan in 2004. The methods compared include both regression models (accelerated failure time (AFT) and Cox proportional hazard (Cox PH) models) and data mining techniques (regression trees, Bayesian additive regression trees (BART), and multivariate adaptive regression splines). We then validate our models against two other hurricanes. Our results indicate that BART yields the best prediction accuracy and that it is possible to predict outage durations with reasonable accuracy. © 2011 Society for Risk Analysis.
NASA Astrophysics Data System (ADS)
Lutz, Norbert W.; Bernard, Monique
2018-02-01
We recently suggested a new paradigm for statistical analysis of thermal heterogeneity in (semi-)aqueous materials by 1H NMR spectroscopy, using water as a temperature probe. Here, we present a comprehensive in silico and in vitro validation that demonstrates the ability of this new technique to provide accurate quantitative parameters characterizing the statistical distribution of temperature values in a volume of (semi-)aqueous matter. First, line shape parameters of numerically simulated water 1H NMR spectra are systematically varied to study a range of mathematically well-defined temperature distributions. Then, corresponding models based on measured 1H NMR spectra of agarose gel are analyzed. In addition, dedicated samples based on hydrogels or biological tissue are designed to produce temperature gradients changing over time, and dynamic NMR spectroscopy is employed to analyze the resulting temperature profiles at sub-second temporal resolution. Accuracy and consistency of the previously introduced statistical descriptors of temperature heterogeneity are determined: weighted median and mean temperature, standard deviation, temperature range, temperature mode(s), kurtosis, skewness, entropy, and relative areas under temperature curves. Potential and limitations of this method for quantitative analysis of thermal heterogeneity in (semi-)aqueous materials are discussed in view of prospective applications in materials science as well as biology and medicine.
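The statistical descriptors named above (weighted median and mean, standard deviation, skewness, kurtosis, entropy, mode) can all be computed directly from a discretized temperature distribution. A minimal sketch follows, using an assumed toy Gaussian temperature profile rather than the authors' NMR-derived data:

```python
# Weighted statistical descriptors of a temperature distribution.
# The grid and Gaussian weight profile below are illustrative assumptions,
# standing in for a distribution extracted from a water 1H NMR line shape.
import numpy as np

temps = np.linspace(30.0, 45.0, 301)                   # temperature grid (°C)
weights = np.exp(-0.5 * ((temps - 37.0) / 1.5) ** 2)   # toy spectral weights
p = weights / weights.sum()                            # normalized distribution

mean = float(np.sum(p * temps))                                  # weighted mean
std = float(np.sqrt(np.sum(p * (temps - mean) ** 2)))            # spread
skew = float(np.sum(p * ((temps - mean) / std) ** 3))            # asymmetry
kurt = float(np.sum(p * ((temps - mean) / std) ** 4) - 3.0)      # excess kurtosis
entropy = float(-np.sum(p[p > 0] * np.log(p[p > 0])))            # Shannon entropy
mode = float(temps[np.argmax(p)])                                # most likely T

print(mean, std, skew, kurt, entropy, mode)
```

For the symmetric toy profile the mean and mode coincide at 37 °C and skewness and excess kurtosis are near zero; a heated sample with an asymmetric gradient would shift these descriptors accordingly.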
Identification of altered pathways in breast cancer based on individualized pathway aberrance score.
Shi, Sheng-Hong; Zhang, Wei; Jiang, Jing; Sun, Long
2017-08-01
The objective of the present study was to identify altered pathways in breast cancer based on the individualized pathway aberrance score (iPAS) method combined with a normal reference (nRef). Four steps were used to identify altered pathways with the iPAS method: data preprocessing, conducted with the robust multi-array average (RMA) algorithm; gene-level statistics, based on the average Z-score; pathway-level statistics, computed according to iPAS; and a significance test based on a one-sample Wilcoxon test. The altered pathways were validated by calculating the changed percentage of each pathway in tumor samples and comparing them with pathways derived from differentially expressed genes (DEGs). A total of 688 altered pathways with P < 0.01 were identified, including kinesin (KIF)- and polo-like kinase (PLK)-mediated events. When the percentage of change reached 50%, 310 pathways were involved among the 688 altered pathways, which may validate the present results. In addition, there were 324 DEGs and 155 genes common to the DEGs and pathway genes. The DEGs and common genes were enriched in the same 9 significant terms, which were also members of altered pathways. The iPAS method was suitable for identifying altered pathways in breast cancer. Altered pathways (such as KIF- and PLK-mediated events) are important for understanding breast cancer mechanisms and for the future application of customized therapeutic decisions.
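The pathway-scoring steps described above can be sketched roughly as follows. The gene set, cohort sizes, and expression values are invented for illustration, and RMA preprocessing is omitted since it applies to raw array data:

```python
# Rough sketch of the iPAS idea: z-score each tumor sample's genes against
# a normal-reference cohort, average z over a pathway's genes to get a
# per-sample pathway score, then test the scores with a one-sample
# Wilcoxon test. All names and data below are invented for illustration.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)

genes = [f"g{i}" for i in range(50)]
pathway = genes[:10]                           # hypothetical gene set

normal = rng.normal(5.0, 1.0, size=(20, 50))   # 20 normal references x 50 genes
tumor = rng.normal(5.0, 1.0, size=(30, 50))    # 30 tumor samples
tumor[:, :10] += 1.0                           # shift pathway genes in tumors

# Gene-level statistic: z-score relative to the normal reference
mu, sd = normal.mean(axis=0), normal.std(axis=0, ddof=1)
z = (tumor - mu) / sd

# Pathway-level statistic: average z over the pathway's genes, per sample
idx = [genes.index(g) for g in pathway]
ipas = z[:, idx].mean(axis=1)

# Significance test: is the median pathway score different from 0?
stat, p = wilcoxon(ipas)
print(p)
```

In the simulated data the pathway genes are shifted upward in tumors, so the Wilcoxon test flags the pathway as altered; unshifted gene sets would not reach significance.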
Guttersrud, Øystein; Petterson, Kjell Sverre
2015-10-01
The present study validates a revised scale measuring individuals' level of the 'engagement in dietary behaviour' aspect of 'critical nutrition literacy' and describes how background factors affect this aspect of Norwegian tenth-grade students' nutrition literacy. Data were gathered electronically during a field trial of a standardised sample test in science. Test items and questionnaire constructs were distributed evenly across four electronic field-test booklets. Data management and analysis were performed using the RUMM2030 item analysis package and the IBM SPSS Statistics 20 statistical software package. Students responded on computers at school. Seven hundred and forty tenth-grade students at twenty-seven randomly sampled public schools were enrolled in the field-test study. The engagement in dietary behaviour scale and the self-efficacy in science scale were distributed to 178 of these students. Both scales proved to be valid, reliable and well-targeted instruments usable for the construction of measurements. Girls and students with high self-efficacy reported higher engagement in dietary behaviour than other students. Socio-economic status and scientific literacy - measured as ability in science by applying an achievement test - did not correlate significantly with students' engagement in dietary behaviour.
Poor Validity of the DSM-IV Schizoid Personality Disorder Construct as a Diagnostic Category.
Hummelen, Benjamin; Pedersen, Geir; Wilberg, Theresa; Karterud, Sigmund
2015-06-01
This study sought to evaluate the construct validity of schizoid personality disorder (SZPD) by investigating a sample of 2,619 patients from the Norwegian Network of Personality-Focused Treatment Programs by a variety of statistical techniques. Nineteen patients (0.7%) reached the diagnostic threshold of SZPD. Results from the factor analyses indicated that SZPD consists of three factors: social detachment, withdrawal, and restricted affectivity/anhedonia. Overall, internal consistency and diagnostic efficiency were poor and best for the criteria that belong to the social detachment factor. These findings pose serious questions about the clinical utility of SZPD as a diagnostic category. On the other hand, the three factors were in concordance with findings from previous studies and with the trait model for personality disorders in DSM-5, supporting the validity of SZPD as a dimensional construct. The authors recommend that SZPD should be deleted as a diagnostic category in future editions of DSM-5.
A Possible Tool for Checking Errors in the INAA Results, Based on Neutron Data and Method Validation
NASA Astrophysics Data System (ADS)
Cincu, Em.; Grigore, Ioana Manea; Barbos, D.; Cazan, I. L.; Manu, V.
2008-08-01
This work presents preliminary results of a possible new type of application in INAA elemental analysis experiments, useful for checking errors that occur during investigation of unknown samples; it relies on INAA method validation experiments and the accuracy of neutron data from the literature. The paper comprises two sections. The first presents, in short, the steps of the experimental tests carried out for INAA method validation and for establishing the performance of the 'ACTIVA-N' laboratory, which is at the same time an illustration of the laboratory's evolution toward high performance. Section 2 presents our recent INAA results on CRMs, whose interpretation opens a discussion about the usefulness of a tool for checking possible errors, distinct from the usual statistical procedures. The questionable aspects and the requirements for developing a practical checking tool are discussed.
Using entropy measures to characterize human locomotion.
Leverick, Graham; Szturm, Tony; Wu, Christine Q
2014-12-01
Entropy measures have been widely used to quantify the complexity of theoretical and experimental dynamical systems. In this paper, the value of using entropy measures to characterize human locomotion is demonstrated based on their construct validity, predictive validity in a simple model of human walking and convergent validity in an experimental study. Results show that four of the five considered entropy measures increase meaningfully with the increased probability of falling in a simple passive bipedal walker model. The same four entropy measures also experienced statistically significant increases in response to increasing age and gait impairment caused by cognitive interference in an experimental study. Of the considered entropy measures, the proposed quantized dynamical entropy (QDE) and quantization-based approximation of sample entropy (QASE) offered the best combination of sensitivity to changes in gait dynamics and computational efficiency. Based on these results, entropy appears to be a viable candidate for assessing the stability of human locomotion.
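Sample entropy, the measure the study's quantized variants approximate, can be sketched as follows. The template length m, tolerance r, and toy signals are conventional illustrative choices, not the study's settings:

```python
# Sample entropy (SampEn) of a 1-D time series: the negative log of the
# conditional probability that sequences matching for m points (within a
# tolerance r * std, Chebyshev distance) also match for m + 1 points.
# A lower value indicates a more regular (predictable) signal.
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    x = np.asarray(x, dtype=float)
    tol = r * x.std()

    def match_count(mm):
        # All overlapping templates of length mm
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        d = np.max(np.abs(templates[:, None] - templates[None, :]), axis=2)
        # Count matching ordered pairs, excluding self-matches
        return (d <= tol).sum() - len(templates)

    b = match_count(m)       # matches of length m
    a = match_count(m + 1)   # matches of length m + 1
    return -np.log(a / b)

rng = np.random.default_rng(2)
regular = np.sin(np.linspace(0, 20 * np.pi, 400))   # highly regular gait proxy
noisy = regular + 0.5 * rng.normal(size=400)        # degraded, irregular signal
print(sample_entropy(regular), sample_entropy(noisy))
```

The noisy signal yields a markedly higher entropy than the pure sinusoid, mirroring the study's finding that entropy rises with gait impairment; the quantized variants trade this O(N²) pairwise comparison for cheaper symbol statistics.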
Murphy, Brett; Lilienfeld, Scott; Skeem, Jennifer; Edens, John
2016-01-01
Researchers are vigorously debating whether psychopathic personality includes seemingly adaptive traits, especially social and physical boldness. In a large sample (N = 1565) of adult offenders, we examined the incremental validity of two operationalizations of boldness (Fearless Dominance traits in the Psychopathic Personality Inventory, Lilienfeld & Andrews, 1996; Boldness traits in the Triarchic Model of Psychopathy, Patrick et al., 2009), above and beyond other characteristics of psychopathy, in statistically predicting scores on four psychopathy-related measures, including the Psychopathy Checklist-Revised (PCL-R). The incremental validity added by boldness traits in predicting the PCL-R’s representation of psychopathy was especially pronounced for interpersonal traits (e.g., superficial charm, deceitfulness). Our analyses, however, revealed unexpected sex differences in the relevance of these traits to psychopathy, with boldness traits exhibiting reduced importance for psychopathy in women. We discuss the implications of these findings for measurement models of psychopathy. PMID:26866795
Validation of Physics Standardized Test Items
NASA Astrophysics Data System (ADS)
Marshall, Jill
2008-10-01
The Texas Physics Assessment Team (TPAT) examined the Texas Assessment of Knowledge and Skills (TAKS) to determine whether it is a valid indicator of physics preparation for future course work and employment, and of the knowledge and skills needed to act as an informed citizen in a technological society. We categorized science items from the 2003 and 2004 10th and 11th grade TAKS by content area(s) covered, knowledge and skills required to select the correct answer, and overall quality. We also analyzed a 5000-student sample of item-level results from the 2004 11th grade exam using standard statistical methods employed by test developers (factor analysis and Item Response Theory). Triangulation of our results revealed strengths and weaknesses of the different methods of analysis. The TAKS was found to be only weakly indicative of physics preparation, and we make recommendations for increasing the validity of standardized physics testing.
Moss, Travis J.; Clark, Matthew T.; Calland, James Forrest; Enfield, Kyle B.; Voss, John D.; Lake, Douglas E.; Moorman, J. Randall
2017-01-01
Background Charted vital signs and laboratory results represent intermittent samples of a patient’s dynamic physiologic state and have been used to calculate early warning scores to identify patients at risk of clinical deterioration. We hypothesized that the addition of cardiorespiratory dynamics measured from continuous electrocardiography (ECG) monitoring to intermittently sampled data improves the predictive validity of models trained to detect clinical deterioration prior to intensive care unit (ICU) transfer or unanticipated death. Methods and findings We analyzed 63 patient-years of ECG data from 8,105 acute care patient admissions at a tertiary care academic medical center. We developed models to predict deterioration resulting in ICU transfer or unanticipated death within the next 24 hours using either vital signs, laboratory results, or cardiorespiratory dynamics from continuous ECG monitoring and also evaluated models using all available data sources. We calculated the predictive validity (C-statistic), the net reclassification improvement, and the probability of achieving the difference in likelihood ratio χ² for the additional degrees of freedom. The primary outcome occurred 755 times in 586 admissions (7%). We analyzed 395 clinical deteriorations with continuous ECG data in the 24 hours prior to an event. Using only continuous ECG measures resulted in a C-statistic of 0.65, similar to models using only laboratory results and vital signs (0.63 and 0.69, respectively). Addition of continuous ECG measures to models using conventional measurements improved the C-statistic by 0.01 and 0.07; a model integrating all data sources had a C-statistic of 0.73 with a categorical net reclassification improvement of 0.09 for a change of 1 decile in risk. The difference in likelihood ratio χ² between integrated models with and without cardiorespiratory dynamics was 2158 (p < 0.001).
Conclusions Cardiorespiratory dynamics from continuous ECG monitoring detect clinical deterioration in acute care patients and improve performance of conventional models that use only laboratory results and vital signs. PMID:28771487
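The C-statistic used above is the area under the ROC curve and can be computed with the Mann-Whitney rank formulation. The sketch below, on synthetic data (the logistic data-generating model and effect sizes are assumptions, not the study's), illustrates the kind of gain obtained when a second information source is added to a risk score:

```python
# C-statistic (AUC) via the Mann-Whitney rank formulation, comparing a
# risk score built from one predictor against one that integrates two.
# Feature names and coefficients are invented for illustration only.
import numpy as np

def c_statistic(y, score):
    """AUC: probability a random positive outranks a random negative."""
    order = np.argsort(score)
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    pos = y.astype(bool)
    n1, n0 = pos.sum(), (~pos).sum()
    return float((ranks[pos].sum() - n1 * (n1 + 1) / 2) / (n1 * n0))

rng = np.random.default_rng(3)
n = 4000
vitals = rng.normal(size=n)        # stand-in for conventional measurements
ecg = rng.normal(size=n)           # stand-in for cardiorespiratory dynamics
logit = 0.8 * vitals + 0.8 * ecg - 2.0
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

c_base = c_statistic(y, vitals)                     # conventional data alone
c_full = c_statistic(y, 0.8 * vitals + 0.8 * ecg)   # integrated score
print(round(c_base, 3), round(c_full, 3))
```

Because the outcome depends on both signals, the integrated score dominates the single-predictor score, which is the pattern the study reports when adding continuous ECG dynamics to vital signs and laboratory results.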
Moss, Travis J; Clark, Matthew T; Calland, James Forrest; Enfield, Kyle B; Voss, John D; Lake, Douglas E; Moorman, J Randall
2017-01-01
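The C-statistic reported throughout this record is the probability that a randomly chosen deterioration case receives a higher model risk score than a randomly chosen non-case. A minimal sketch of that calculation, using hypothetical scores and labels rather than the study's data:

```python
def c_statistic(scores, labels):
    """C-statistic (AUC): fraction of event/non-event pairs in which the
    event has the higher risk score; ties count as half-concordant."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    nonevents = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

# Hypothetical risk scores: first four are events, last four are non-events
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc = c_statistic(scores, labels)  # 14 of 16 pairs concordant -> 0.875
```

A C-statistic of 0.5 corresponds to a model no better than chance; the 0.73 reported above for the integrated model means an event outranks a non-event about 73% of the time.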
2013-01-01
Background In recent years, response rates on telephone surveys have been declining. Rates for the Behavioral Risk Factor Surveillance System (BRFSS) have also declined, prompting the use of new methods of weighting and the inclusion of cell phone sampling frames. A number of scholars and researchers have conducted studies of the reliability and validity of the BRFSS estimates in the context of these changes. As the BRFSS makes changes in its methods of sampling and weighting, a review of reliability and validity studies of the BRFSS is needed. Methods In order to assess the reliability and validity of prevalence estimates taken from the BRFSS, scholarship published from 2004 to 2011 dealing with tests of reliability and validity of BRFSS measures was compiled and presented by topics of health risk behavior. Assessments of the quality of each publication were undertaken using a categorical rubric. Higher rankings were achieved by authors who conducted reliability tests using repeated test/retest measures, or who conducted tests using multiple samples. A similar rubric was used to rank validity assessments. Validity tests which compared the BRFSS to physical measures were ranked higher than those comparing the BRFSS to other self-reported data. Literature which undertook more sophisticated statistical comparisons was also ranked higher. Results Overall findings indicated that BRFSS prevalence rates were comparable to other national surveys which rely on self-reports, although specific differences are noted for some categories of response. BRFSS prevalence rates were less similar to surveys which utilize physical measures in addition to self-reported data. There is very little research on reliability and validity for some health topics, but a great deal of information supporting the validity of the BRFSS data for others. 
Conclusions Limitations of the examination of the BRFSS were due to question differences among surveys used as comparisons, as well as mode of data collection differences. As the BRFSS moves to incorporating cell phone data and changing weighting methods, a review of reliability and validity research indicated that past BRFSS landline only data were reliable and valid as measured against other surveys. New analyses and comparisons of BRFSS data which include the new methodologies and cell phone data will be needed to ascertain the impact of these changes on estimates in the future. PMID:23522349
Nelson, Melissa C; Lytle, Leslie A
2009-04-01
Sweetened beverage and fast-food intake have been identified as important targets for obesity prevention. However, there are few brief dietary assessment tools available to evaluate these behaviors among adolescents. The objective of this research was to examine reliability and validity of a 22-item dietary screener assessing adolescent consumption of specific energy-containing and non-energy-containing beverages (nine items) and fast food (13 items). The screener was administered to adolescents (ages 11 to 18 years) recruited from the Minneapolis/St Paul, MN, metro region. One sample of adolescents completed the screener twice to assess test-retest reliability (n=33, primarily white adolescents). Another adolescent sample completed the screener along with three 24-hour dietary recalls to assess criterion validity (n=59 white adolescents). Test-retest assessments were completed approximately 7 to 14 days apart, and agreement between the two administrations of the screener was substantial, with most items yielding Spearman correlations and kappa statistics that were >0.60. When compared with the gold-standard dietary recall data, findings indicated that the validity of the screener items assessing adolescents' intake of regular soda, sports drinks, milk, and water was fair. However, the differential assessment periods captured by the two methods (i.e., 1 month for the screener vs. 3 days for the recalls) posed challenges in analysis and made it impossible to assess the validity of some screener items. Overall, while these screener items largely represent reliable measures with fair validity, our findings highlight the challenges inherent in the validation of brief dietary assessment tools.
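The test-retest statistics used in this abstract can be computed with short, self-contained functions. The sketch below (hypothetical 5-point item responses, not the study's data) implements Spearman rank correlation and Cohen's kappa from first principles:

```python
def average_ranks(xs):
    """Ranks 1..n, with ties assigned their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def cohens_kappa(a, b):
    """Chance-corrected agreement between two categorical ratings."""
    n = len(a)
    po = sum(1 for x, y in zip(a, b) if x == y) / n  # observed agreement
    pe = sum((a.count(c) / n) * (b.count(c) / n) for c in set(a) | set(b))
    return (po - pe) / (1 - pe)

# Hypothetical screener responses at test and at retest
test = [1, 2, 3, 4, 5]
retest = [1, 2, 3, 5, 4]
```

Here `spearman(test, retest)` is 0.9 and `cohens_kappa(test, retest)` is 0.5, illustrating why both statistics are usually reported: kappa penalizes exact-category disagreement that a rank correlation largely forgives.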
Molaeinezhad, Mitra; Roudsari, Robab Latifnejad; Yousefy, Alireza; Salehi, Mehrdad; Khoei, Effat Merghati
2014-01-01
Background: Vaginismus is considered as one of the most common female psychosexual dysfunctions. Although the importance of using a multidisciplinary approach for assessment of vaginal penetration disorder is emphasized, the paucity of instruments for this purpose is clear. We designed a study to develop and investigate the psychometric properties of a multidimensional vaginal penetration disorder questionnaire (MVPDQ), thereby assisting specialists for clinical assessment of women with lifelong vaginismus (LLV). Materials and Methods: MVPDQ was developed using the findings from a thematic qualitative research conducted with 20 unconsummated couples from a former study, which was followed by an extensive literature review. Then, during a cross-sectional design, a consecutive sample of 214 women, who were diagnosed as LLV based on Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV-TR criteria completed MVPDQ and additional questions regarding their demographic and sexual history. Validation measures and reliability were tested by exploratory factor analysis and Cronbach's alpha coefficient via Statistical Package for the Social Sciences (SPSS) version 16. Results: After conducting exploratory factor analysis, MVPDQ emerged with 72 items and 9 dimensions: Catastrophic cognitions and tightening, helplessness, marital adjustment, hypervigilance, avoidance, penetration motivation, sexual information, genital incompatibility, and optimism. Subscales of MVPDQ showed a significant reliability that varied between 0.70 and 0.87 and results of test–retest were satisfactory. Conclusion: The present study shows that MVPDQ is a valid and reliable self-report questionnaire for clinical assessment of women complaining of LLV. This instrument may assist specialists to make a clinical judgment and plan appropriately for clinical management. PMID:25097607
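The Cronbach's alpha coefficients reported for the MVPDQ subscales (0.70 to 0.87) measure internal consistency. A minimal sketch of the computation, with a hypothetical two-item subscale rather than the MVPDQ data:

```python
def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.
    `items` is a list of item-score lists, one per item, aligned by respondent."""
    k = len(items)
    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    total_scores = [sum(scores) for scores in zip(*items)]
    sum_item_var = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_var / variance(total_scores))

# Hypothetical subscale: two items scored by four respondents
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])  # -> 0.75
```

Alpha rises toward 1 as item scores covary; perfectly redundant items give exactly 1.0, which is why values in the 0.70-0.90 range are usually read as adequate without item redundancy.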
Trakman, Gina Louise; Forsyth, Adrienne; Hoye, Russell; Belski, Regina
2018-01-01
The Nutrition for Sport Knowledge Questionnaire (NSKQ) is an 89-item, valid and reliable measure of sports nutrition knowledge (SNK). It takes 25 min to complete and has been subject to low completion and response rates. The aim of this study was to develop an abridged version of the NSKQ (A-NSKQ) and compare response rates, completion rates and NK scores of the NSKQ and A-NSKQ. Rasch analysis was used for the questionnaire validation. The sample (n = 181) was the same sample that was used in the validation of the full-length NSKQ. Construct validity was assessed using the known-group comparisons method. Temporal stability was assessed using the test-retest reliability method. NK assessment was cross-sectional; responses were collected electronically from members of one non-elite Australian football (AF) and netball club, using Qualtrics Software (Qualtrics, Provo, UT). Validation: The A-NSKQ has 37 items that assess general (n = 17) and sports (n = 20) nutrition knowledge (NK). Both sections are unidimensional (Perc5% = 2.84% [general] and 3.41% [sport]). Both sections fit the Rasch model (overall item-interaction statistic mean (SD) = -0.15 ± 0.96 [general] and 0.22 ± 1.11 [sport]; overall person-interaction statistic mean (SD) = -0.11 ± 0.61 [general] and 0.08 ± 0.73 [sport]; chi-square probability = 0.308 [general] and 0.283 [sport]). Test-retest reliability was confirmed (r = 0.8, P < 0.001 [general] and r = 0.7, P < 0.001 [sport]). Construct validity was demonstrated (nutrition students = 77% versus non-nutrition students = 60%, P < 0.001 [general]; nutrition students = 60% versus non-nutrition students = 40%, P < 0.001 [sport]). Assessment of NK: 177 usable survey responses were returned. Response rates were low (7%) but completion rates were high (85%). NK scores on the A-NSKQ (46%) are comparable to results obtained in similar cohorts on the NSKQ (49%). 
The A-NSKQ took on average 12 min to complete, which is around half the time taken to complete the NSKQ (25 min). The A-NSKQ is a valid and reliable, brief questionnaire designed to assess general NK (GNK) and SNK.
Dealing with dietary measurement error in nutritional cohort studies.
Freedman, Laurence S; Schatzkin, Arthur; Midthune, Douglas; Kipnis, Victor
2011-07-20
Dietary measurement error creates serious challenges to reliably discovering new diet-disease associations in nutritional cohort studies. Such error causes substantial underestimation of relative risks and reduction of statistical power for detecting associations. On the basis of data from the Observing Protein and Energy Nutrition Study, we recommend the following approaches to deal with these problems. Regarding data analysis of cohort studies using food-frequency questionnaires, we recommend 1) using energy adjustment for relative risk estimation; 2) reporting estimates adjusted for measurement error along with the usual relative risk estimates, whenever possible (this requires data from a relevant, preferably internal, validation study in which participants report intakes using both the main instrument and a more detailed reference instrument such as a 24-hour recall or multiple-day food record); 3) performing statistical adjustment of relative risks, based on such validation data, if they exist, using univariate (only for energy-adjusted intakes such as densities or residuals) or multivariate regression calibration. We note that whereas unadjusted relative risk estimates are biased toward the null value, statistical significance tests of unadjusted relative risk estimates are approximately valid. Regarding study design, we recommend increasing the sample size to remedy loss of power; however, it is important to understand that this will often be an incomplete solution because the attenuated signal may be too small to distinguish from unmeasured confounding in the model relating disease to reported intake. Future work should be devoted to alleviating the problem of signal attenuation, possibly through the use of improved self-report instruments or by combining dietary biomarkers with self-report instruments.
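The attenuation and regression-calibration correction described above can be illustrated with a small simulation under a classical measurement-error model (reported intake = true intake + independent error). This is a sketch with made-up parameters, not the Observing Protein and Energy Nutrition Study data:

```python
import random

rng = random.Random(1)
n = 2000
true_intake = [rng.gauss(0, 1) for _ in range(n)]
# Classical error model: self-reported intake adds independent noise
reported = [t + rng.gauss(0, 1) for t in true_intake]
beta_true = 0.5
outcome = [beta_true * t + rng.gauss(0, 0.5) for t in true_intake]

def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

beta_naive = slope(reported, outcome)   # attenuated toward zero (about half of beta_true here)
lam = slope(reported, true_intake)      # attenuation factor, estimable from validation data
beta_corrected = beta_naive / lam       # univariate regression-calibration estimate
```

With equal variances for true intake and error, the attenuation factor is about 0.5, so the naive slope recovers only about half of the true association; dividing by the factor estimated from validation data de-attenuates it, which is exactly the univariate correction the authors recommend for energy-adjusted intakes.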
Smith, William Pastor
2013-09-01
The primary purpose of this two-phased study was to examine the structural validity and statistical utility of a racism scale specific to Black men who have sex with men (MSM) who resided in the Washington, DC, metropolitan area and Baltimore, Maryland. Phase I involved pretesting a 10-item racism measure with 20 Black MSM. Based on pretest findings, the scale was adapted into a 21-item racism scale for use in collecting data on 166 respondents in Phase II. Exploratory factor analysis of the 21-item racism scale resulted in a 19-item, two-factor solution. The two factors or subscales were the following: General Racism and Relationships and Racism. Confirmatory factor analysis was used in testing construct validity of the factored racism scale. Specifically, the two racism factors were combined with three homophobia factors into a confirmatory factor analysis model. Both the comparative and incremental fit indices were equal to .90, suggesting an adequate convergence of the racism and homophobia dimensions into a single social oppression construct. Statistical utility of the two racism subscales was demonstrated when regression analysis revealed that gay-identified men, compared with bisexual-identified men, were more likely to experience increased racism within the context of intimate relationships and less likely to be exposed to repeated experiences of general racism. Overall, the findings in this study highlight the importance of continuing to explore the psychometric properties of a racism scale that accounts for the unique psychosocial concerns experienced by Black MSM.
Investigation of Super Learner Methodology on HIV-1 Small Sample: Application on Jaguar Trial Data.
Houssaïni, Allal; Assoumou, Lambert; Marcelin, Anne Geneviève; Molina, Jean Michel; Calvez, Vincent; Flandre, Philippe
2012-01-01
Background. Many statistical models have been tested to predict phenotypic or virological response from genotypic data. A statistical framework called Super Learner has been introduced either to compare different methods/learners (discrete Super Learner) or to combine them in a Super Learner prediction method. Methods. The Jaguar trial is used to apply the Super Learner framework. The Jaguar study is an "add-on" trial comparing the efficacy of adding didanosine to an on-going failing regimen. Our aim was also to investigate the impact of using different cross-validation strategies and different loss functions. Four different splits between training and validation sets were tested through two loss functions. Six statistical methods were compared. We assessed performance by evaluating R² values and accuracy by calculating the rates of patients being correctly classified. Results. Our results indicated that the more recent Super Learner methodology of building a new predictor based on a weighted combination of different methods/learners provided good performance. A simple linear model provided similar results to those of this new predictor. A slight discrepancy arose between the two loss functions investigated, and a slight difference also arose between results based on cross-validated risks and results from the full dataset. The Super Learner methodology and the linear model correctly classified around 80% of patients. The difference between the lowest and highest rates was around 10 percentage points. The number of mutations retained by the different learners also varied from 1 to 41. Conclusions. The more recent Super Learner methodology, combining the predictions of many learners, provided good performance on our small dataset.
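The discrete Super Learner mentioned above selects, by cross-validation, the candidate learner with the smallest estimated risk. A toy sketch of that selection step with two hypothetical candidates (a grand-mean predictor and simple linear regression) and made-up data, not the Jaguar trial's learners:

```python
import random

def cv_risk(x, y, fit, k=4):
    """Cross-validated mean squared error of a learner.
    `fit(train_x, train_y)` must return a one-argument predict function."""
    idx = list(range(len(x)))
    random.Random(0).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sse = 0.0
    for fold in folds:
        train = [i for i in idx if i not in fold]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        sse += sum((predict(x[i]) - y[i]) ** 2 for i in fold)
    return sse / len(x)

def fit_mean(xs, ys):                # candidate 1: grand-mean predictor
    m = sum(ys) / len(ys)
    return lambda v: m

def fit_linear(xs, ys):              # candidate 2: simple linear regression
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((a - mx) * (c - my) for a, c in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    a0 = my - b * mx
    return lambda v: a0 + b * v

# Hypothetical data with a clear linear signal plus noise
rng = random.Random(42)
x = [i / 10 for i in range(40)]
y = [2 * t + 1 + rng.gauss(0, 0.2) for t in x]

risks = {"mean": cv_risk(x, y, fit_mean), "linear": cv_risk(x, y, fit_linear)}
best = min(risks, key=risks.get)  # the discrete Super Learner's choice
```

The full Super Learner goes one step further: instead of picking a single winner, it fits a weighted combination of the candidates' cross-validated predictions, which is the "weighted combination" predictor the abstract describes.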
Update of Standard Practices for New Method Validation in Forensic Toxicology.
Wille, Sarah M R; Coucke, Wim; De Baere, Thierry; Peters, Frank T
2017-01-01
International agreement concerning validation guidelines is important to obtain quality forensic bioanalytical research and routine applications, as it all starts with the reporting of reliable analytical data. Standards for fundamental validation parameters are provided in guidelines such as those from the US Food and Drug Administration (FDA), the European Medicines Agency (EMA), the German-speaking Gesellschaft für Toxikologie und Forensische Chemie (GTFCh) and the Scientific Working Group for Forensic Toxicology (SWGTOX). These validation parameters include selectivity, matrix effects, method limits, calibration, accuracy and stability, as well as other parameters such as carryover, dilution integrity and incurred sample reanalysis. It is, however, not easy for laboratories to implement these guidelines in practice, as these international guidelines remain nonbinding protocols that depend on the applied analytical technique and that need to be updated according to the analyst's method requirements and the application type. In this manuscript, a review of the current guidelines and literature concerning bioanalytical validation parameters in a forensic context is given and discussed. In addition, suggestions for the experimental set-up, the pros and cons of statistical approaches and adequate acceptance criteria for the validation of bioanalytical applications are given.
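As an illustration of statistically explicit acceptance criteria of the kind these guidelines describe, the sketch below checks accuracy (percent bias) and precision (percent CV) of replicate measurements at one quality-control level against percentage limits. The ±15% limits and the replicate values are illustrative assumptions, not quotations from any specific guideline:

```python
def validate_level(measured, nominal, bias_limit=15.0, cv_limit=15.0):
    """Accuracy (percent bias) and precision (percent CV) at one
    concentration level, checked against percentage acceptance limits."""
    n = len(measured)
    mean = sum(measured) / n
    bias_pct = 100.0 * (mean - nominal) / nominal
    sd = (sum((m - mean) ** 2 for m in measured) / (n - 1)) ** 0.5  # sample SD
    cv_pct = 100.0 * sd / mean
    return {"bias_pct": bias_pct, "cv_pct": cv_pct,
            "accepted": abs(bias_pct) <= bias_limit and cv_pct <= cv_limit}

# Hypothetical QC replicates at a nominal concentration of 100 units
result = validate_level([98.0, 102.0, 101.0, 99.0, 100.5], 100.0)
```

In practice the limits and the required number of replicates and runs vary by guideline and by application type, which is precisely the implementation difficulty the abstract raises.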
2009-01-01
Background Symptom-based surveys suggest that the prevalence of gastrointestinal diseases is lower in China than in Western countries. The aim of this study was to validate a methodology for the epidemiological investigation of gastrointestinal symptoms and endoscopic findings in China. Methods A randomized, stratified, multi-stage sampling methodology was used to select 18 000 adults aged 18-80 years from Shanghai, Beijing, Xi'an, Wuhan and Guangzhou. Participants from Shanghai were invited to provide blood samples and undergo upper gastrointestinal endoscopy. All participants completed Chinese versions of the Reflux Disease Questionnaire (RDQ) and the modified Rome II questionnaire; 20% were also invited to complete the 36-item Short Form Health Survey (SF-36) and Epworth Sleepiness Scale (ESS). The psychometric properties of the questionnaires were evaluated statistically. Results The study was completed by 16 091 individuals (response rate: 89.4%), with 3219 (89.4% of those invited) completing the SF-36 and ESS. All 3153 participants in Shanghai provided blood samples and 1030 (32.7%) underwent endoscopy. Cronbach's alpha coefficients were 0.89, 0.89, 0.80 and 0.91, respectively, for the RDQ, modified Rome II questionnaire, ESS and SF-36, supporting internal consistency. Factor analysis supported construct validity of all questionnaire dimensions except SF-36 psychosocial dimensions. Conclusion This population-based study has great potential to characterize the relationship between gastrointestinal symptoms and endoscopic findings in China. PMID:19925662
Yan, Xiaoyan; Wang, Rui; Zhao, Yanfang; Ma, Xiuqiang; Fang, Jiqian; Yan, Hong; Kang, Xiaoping; Yin, Ping; Hao, Yuantao; Li, Qiang; Dent, John; Sung, Joseph; Zou, Duowu; Johansson, Saga; Halling, Katarina; Liu, Wenbin; He, Jia
2009-11-19
Tork, Hanan; Dassen, Theo; Lohrmann, Christa
2009-02-01
This paper is a report of a study to examine the psychometric properties of the Care Dependency Scale for Paediatrics in Germany and Egypt and to compare the care dependency of school-age children in both countries. Cross-cultural differences in care dependency of older adults have been documented in the literature, but little is known about the differences and similarities with regard to children's care dependency in different cultures. A convenience sample of 258 school-aged children from Germany and Egypt participated in the study in 2005. The reliability of the Care Dependency Scale for Paediatrics was assessed in terms of internal consistency and interrater reliability. Factor analysis (principal component analysis) was employed to verify the construct validity. A Visual Analogue Scale was used to investigate the criterion-related validity. Good internal consistency was detected both for the Arabic and German versions. Factor analysis revealed one factor for both versions. A Pearson's correlation between the Care Dependency Scale for Paediatrics and Visual Analogue Scale was statistically significant for both versions indicating criterion-related validity. Statistically significant differences between the participants were detected regarding the mean sum score on the Care Dependency Scale for Paediatrics. The Care Dependency Scale for Paediatrics is a reliable and valid tool for assessing the care dependency of children and is recommended for assessing the care dependency of children from different ethnic origins. Differences in care dependency between German and Egyptian children were detected, which might be due to cultural differences.
Evren, Cuneyt; Dalbudak, Ercan; Topcu, Merve; Kutlu, Nilay; Evren, Bilge; Pontes, Halley M
2018-07-01
The main aims of the current study were to test the factor structure, reliability and validity of the nine-item Internet Gaming Disorder Scale-Short Form (IGDS9-SF), a standardized measure to assess symptoms and prevalence of Internet Gaming Disorder (IGD). In the present study participants were assessed with the IGDS9-SF, the nine-item Internet Gaming Disorder Scale (IGDS) and Young's Internet Addiction Test-Short Form (YIAT-SF). Confirmatory factor analyses demonstrated that the factor structure (i.e., the dimensional structure) of the IGDS9-SF was satisfactory. The scale was also reliable (i.e., internally consistent, with a Cronbach's alpha of 0.89) and showed adequate convergent and criterion-related validity, as indicated by statistically significant positive correlations between average daily time spent playing games during the last year and IGDS and YIAT-SF scores. By applying the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) threshold for diagnosing IGD (i.e., endorsing at least five criteria), it was found that the prevalence of disordered gamers ranged from 0.96% (whole sample) to 2.57% (e-sports players). These findings support the Turkish version of the IGDS9-SF as a valid and reliable tool for determining the extent of IGD-related problems among young adults and for the purposes of early IGD diagnosis in clinical settings and similar research.
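The DSM-5 threshold applied in the study (at least five of the nine proposed criteria endorsed) can be expressed directly; the endorsement patterns below are hypothetical, not the study's responses:

```python
def meets_igd_threshold(criteria):
    """DSM-5 Section 3 rule: IGD is indicated when at least 5 of the
    9 proposed criteria are endorsed (1 = endorsed, 0 = not endorsed)."""
    return sum(criteria) >= 5

# Hypothetical endorsement patterns for three respondents
respondents = [
    [1, 1, 1, 1, 1, 0, 0, 0, 0],  # exactly 5 criteria -> meets threshold
    [1, 1, 0, 0, 0, 0, 0, 0, 0],  # 2 criteria -> does not
    [1, 1, 1, 1, 1, 1, 1, 1, 1],  # all 9 criteria -> meets threshold
]
flags = [meets_igd_threshold(r) for r in respondents]
prevalence_pct = 100.0 * sum(flags) / len(flags)
```

Prevalence figures such as the 0.96% and 2.57% above are simply this proportion computed within the relevant subsample.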
Validation of the Arabic Version of the Internet Gaming Disorder-20 Test.
Hawi, Nazir S; Samaha, Maya
2017-04-01
In recent years, researchers have been trying to shed light on gaming addiction and its association with different psychiatric disorders and psychological determinants. The latest edition version of the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) included in its Section 3 Internet Gaming Disorder (IGD) as a condition for further empirical study and proposed nine criteria for the diagnosis of IGD. The 20-item Internet Gaming Disorder (IGD-20) Test was developed as a valid and reliable tool to assess gaming addiction based on the nine criteria set by the DSM-5. The aim of this study is to validate an Arabic version of the IGD-20 Test. The Arabic version of IGD-20 will not only help in identifying Arabic-speaking pathological gamers but also stimulate cross-cultural studies that could contribute to an area in need of more research for insight and treatment. After a process of translation and back-translation and with the participation of a sizable sample of Arabic-speaking adolescents, the present study conducted a psychometric validation of the IGD-20 Test. Our confirmatory factor analysis showed the validity of the Arabic version of the IGD-20 Test. The one-factor model of the Arabic IGD-20 Test had very good psychometric properties, and it fitted the sample data extremely well. In addition, correlation analysis between the IGD-20 Test and the daily duration on weekdays and weekends gameplay revealed significant positive relationships that warranted a criterion-related validation. Thus, the Arabic version of the IGD-20 Test is a valid and reliable measure of IGD among Arabic-speaking populations.
López-Ortega, Mariana; Torres-Castro, Sara; Rosas-Carrasco, Oscar
2016-12-09
The Satisfaction with Life Scale (SWLS) has been widely used and has proven to be a valid and reliable instrument for assessing satisfaction with life in diverse population groups; however, research on satisfaction with life and validation of different measuring instruments in Mexican adults is still lacking. The objective was to evaluate the psychometric properties of the Satisfaction with Life Scale (SWLS) in a representative sample of Mexican adults. This is a methodological study to evaluate a satisfaction with life scale in a sample of 13,220 Mexican adults 50 years of age or older from the 2012 Mexican Health and Aging Study. The scale's reliability (internal consistency) was analysed using Cronbach's alpha and inter-item correlations. An exploratory factor analysis was also performed. Known-groups validity was evaluated by comparing good-health and bad-health participants. Comorbidity, perceived financial situation, self-reported general health, depression symptoms, and social support were included to evaluate the validity between these measures and the total score of the scale using Spearman's correlations. The analysis of the scale's reliability showed good internal consistency (α = 0.74). The exploratory factor analysis confirmed the existence of a unique factor structure that explained 54% of the variance. The SWLS was related to depression, perceived health, financial situation, and social support, and these relations were all statistically significant (P < .01). There was a significant difference in life satisfaction between the good- and bad-health groups. Results show good internal consistency and construct validity of the SWLS. These results are comparable with results from previous studies. Meeting the study's objective to validate the scale, the results show that the Spanish version of the SWLS is a reliable and valid measure of satisfaction with life in the Mexican context.
Brunault, Paul; Ballon, Nicolas; Gaillard, Philippe; Réveillère, Christian; Courtois, Robert
2014-01-01
Objective: The concept of food addiction has recently been proposed by applying the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision, criteria for substance dependence to eating behaviour. Food addiction has received increased attention given that it may play a role in binge eating, eating disorders, and the recent increase in obesity prevalence. Currently, there is no psychometrically sound tool for assessing food addiction in French. Our study aimed to test the psychometric properties of a French version of the Yale Food Addiction Scale (YFAS) by establishing its factor structure and construct validity in a nonclinical population. Method: A total of 553 participants were assessed for food addiction (French version of the YFAS) and binge eating behaviour (Bulimic Investigatory Test Edinburgh and Binge Eating Scale). We tested the scale’s factor structure (factor analysis for dichotomous data based on tetrachoric correlation coefficients), internal consistency, and construct validity with measures of binge eating. Results: Our results supported a 1-factor structure, which accounted for 54.1% of the variance. This tool had adequate reliability and high construct validity with measures of binge eating in this population, both in its diagnosis and symptom count version. A 2-factor structure explained an additional 9.1% of the variance, and could differentiate between patients with high, compared with low, levels of insight regarding addiction symptoms. Conclusions: In our study, we validated a psychometrically sound French version of the YFAS, both in its symptom count and diagnostic version. Future studies should validate this tool in clinical samples. PMID:25007281
Komro, Kelli A; Livingston, Melvin D; Kominsky, Terrence K; Livingston, Bethany J; Garrett, Brady A; Molina, Mildred Maldonado; Boyd, Misty L
2015-01-01
Objective: American Indians (AIs) suffer from significant alcohol-related health disparities, and increased risk begins early. This study examined the reliability and validity of measures to be used in a preventive intervention trial. Reliability and validity across racial/ethnic subgroups are crucial to evaluate intervention effectiveness and promote culturally appropriate evidence-based practice. Method: To assess reliability and validity, we used three baseline surveys of high school students participating in a preventive intervention trial within the jurisdictional service area of the Cherokee Nation in northeastern Oklahoma. The 15-minute alcohol risk survey included 16 multi-item scales and one composite score measuring key proximal, primary, and moderating variables. Forty-four percent of the students indicated that they were AI (of whom 82% were Cherokee), including 23% who reported being AI only (n = 435) and 18% both AI and White (n = 352). Forty-seven percent reported being White only (n = 901). Results: Scales were adequately reliable for the full sample and across race/ethnicity defined by AI, AI/White, and White subgroups. Among the full sample, all scales had acceptable internal consistency, with minor variation across race/ethnicity. All scales had extensive to exemplary test–retest reliability and showed minimal variation across race/ethnicity. The eight proximal and two primary outcome scales were each significantly associated with the frequency of alcohol use during the past month in both the cross-sectional and the longitudinal models, providing support for both criterion validity and predictive validity. For most scales, interpretation of the strength of association and statistical significance did not differ between the racial/ethnic subgroups. 
Conclusions: The results support the reliability and validity of scales of a brief questionnaire measuring risk and protective factors for alcohol use among AI adolescents, primarily members of the Cherokee Nation. PMID:25486402
LaBudde, Robert A; Harnly, James M
2012-01-01
A qualitative botanical identification method (BIM) is an analytical procedure that returns a binary result (1 = Identified, 0 = Not Identified). A BIM may be used by a buyer, manufacturer, or regulator to determine whether a botanical material being tested is the same as the target (desired) material, or whether it contains excessive nontarget (undesirable) material. This report describes the design of development and validation studies for a BIM based on the proportion of replicates identified, or probability of identification (POI), as the basic observed statistic. The statistical procedures proposed for data analysis closely follow those for the probability of detection, and harmonize the statistical concepts and parameters between quantitative and qualitative method validation. Use of POI statistics also harmonizes statistical concepts across botanical, microbiological, toxin, and other analyte identification methods that produce binary results. The POI statistical model provides a tool for graphical representation of response curves for qualitative methods, reporting of descriptive statistics, and application of performance requirements. Single-collaborator and multicollaborator study examples are given.
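As a minimal illustration of the POI statistic described above, the proportion of identified replicates is a binomial proportion and can be reported with a Wilson score confidence interval, a standard choice for such data; the replicate counts below are hypothetical:

```python
import math

def poi_wilson(identified, replicates, z=1.96):
    """Probability of identification (POI) with a Wilson score interval.

    identified: number of replicates returning 1 (Identified)
    replicates: total number of binary test replicates
    """
    p = identified / replicates
    z2 = z * z
    denom = 1 + z2 / replicates
    center = (p + z2 / (2 * replicates)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / replicates + z2 / (4 * replicates ** 2))
    return p, (max(0.0, center - half), min(1.0, center + half))

# e.g. 18 of 20 replicates identified the target material
poi, (lo, hi) = poi_wilson(identified=18, replicates=20)
```

The Wilson interval behaves better than the naive normal approximation near POI values of 0 or 1, which matter most for response-curve reporting.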
Validity of Models for Predicting BRCA1 and BRCA2 Mutations
Parmigiani, Giovanni; Chen, Sining; Iversen, Edwin S.; Friebel, Tara M.; Finkelstein, Dianne M.; Anton-Culver, Hoda; Ziogas, Argyrios; Weber, Barbara L.; Eisen, Andrea; Malone, Kathleen E.; Daling, Janet R.; Hsu, Li; Ostrander, Elaine A.; Peterson, Leif E.; Schildkraut, Joellen M.; Isaacs, Claudine; Corio, Camille; Leondaridis, Leoni; Tomlinson, Gail; Amos, Christopher I.; Strong, Louise C.; Berry, Donald A.; Weitzel, Jeffrey N.; Sand, Sharon; Dutson, Debra; Kerber, Rich; Peshkin, Beth N.; Euhus, David M.
2008-01-01
Background Deleterious mutations of the BRCA1 and BRCA2 genes confer susceptibility to breast and ovarian cancer. At least 7 models for estimating the probabilities of having a mutation are used widely in clinical and scientific activities; however, the merits and limitations of these models are not fully understood. Objective To systematically quantify the accuracy of the following publicly available models to predict mutation carrier status: BRCAPRO, family history assessment tool, Finnish, Myriad, National Cancer Institute, University of Pennsylvania, and Yale University. Design Cross-sectional validation study, using model predictions and BRCA1 or BRCA2 mutation status of patients different from those used to develop the models. Setting Multicenter study across Cancer Genetics Network participating centers. Patients 3 population-based samples of participants in research studies and 8 samples from genetic counseling clinics. Measurements Discrimination between individuals testing positive for a mutation in BRCA1 or BRCA2 from those testing negative, as measured by the c-statistic, and sensitivity and specificity of model predictions. Results The 7 models differ in their predictions. The better-performing models have a c-statistic around 80%. BRCAPRO has the largest c-statistic overall and in all but 2 patient subgroups, although the margin over other models is narrow in many strata. Outside of high-risk populations, all models have high false-negative and false-positive rates across a range of probability thresholds used to refer for mutation testing. Limitation Three recently published models were not included. Conclusions All models identify women who probably carry a deleterious mutation of BRCA1 or BRCA2 with adequate discrimination to support individualized genetic counseling, although discrimination varies across models and populations. PMID:17909205
Lundin, Andreas; Hallgren, Mats; Balliu, Natalja; Forsell, Yvonne
2015-01-01
The Alcohol Use Disorders Identification Test (AUDIT) and AUDIT-Consumption (AUDIT-C) are commonly used in population surveys, but there are few validation studies in the general population. Validity should be estimated in samples close to the targeted population and setting. This study aims to validate the AUDIT and AUDIT-C in a general population sample (PART) in Stockholm, Sweden. We used a general population subsample aged 20 to 64 that answered a postal questionnaire including the AUDIT and later participated in a psychiatric interview (n = 1,093). Interviews using the Schedules for Clinical Assessment in Neuropsychiatry served as the criterion standard. Diagnoses were set according to the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). Agreement between the diagnostic test and the criterion standard was measured with the area under the receiver operating characteristic curve (AUC). A total of 1,086 interview participants (450 men and 636 women) completed the AUDIT. There were 96 individuals with DSM-IV alcohol dependence, 36 with DSM-IV alcohol abuse, and 153 risk drinkers. AUCs were 0.90 for DSM-IV alcohol use disorder (AUDIT-C 0.85), 0.94 for DSM-IV dependence (AUDIT-C 0.89), 0.80 for risk drinking (AUDIT-C 0.80), and 0.87 for any criterion (AUDIT-C 0.84). In this general population sample, the AUDIT and AUDIT-C showed outstanding or excellent performance in identifying dependence, alcohol use disorder, risk drinking, and any criterion. Copyright © 2015 by the Research Society on Alcoholism.
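The AUC used above is equivalent to the Mann-Whitney probability that a randomly chosen case outscores a randomly chosen non-case; a minimal sketch with hypothetical AUDIT totals (not the study's data):

```python
def auc_mann_whitney(case_scores, control_scores):
    """AUC as the probability that a randomly chosen case outscores a
    randomly chosen control; ties count one half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            wins += 1.0 if c > k else 0.5 if c == k else 0.0
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical AUDIT totals for interview-diagnosed cases and non-cases
cases = [12, 15, 9, 20, 11]
controls = [3, 5, 2, 8, 4, 10]
auc = auc_mann_whitney(cases, controls)
```

An AUC of 1.0 means perfect separation of cases from non-cases; 0.5 means the questionnaire score is uninformative.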
Hegde, Sapna; Patodia, Akash; Dixit, Uma
2017-08-01
Demirjian's method has been the most popular and extensively tested radiographic method of age estimation. More recently, Willems' method has been reported to be a better predictor of age. Nolla's and Häävikko's methods have been used to a lesser extent. Very few studies have compared all four methods in non-Indian and Indian populations. Most Indian research is limited by inadequate sample sizes, age structures and grouping and different approaches to statistical analysis. The present study aimed to evaluate and compare the validity of the Demirjian, Willems, Nolla and Häävikko methods in determination of chronological age of 5 to 15 year-old Indian children. In this cross-sectional observational study, four methods were compared for validity in estimating the age of 1200 Indian children aged 5-15 years. Demirjian's method overestimated age by +0.24 ± 0.80, +0.11 ± 0.81 and +0.19 ± 0.80 years in boys, girls and the total sample, respectively. With Willems' method, overestimations of +0.09 ± 0.80, +0.08 ± 0.80 and +0.09 ± 0.80 years were obtained in boys, girls and the total sample, respectively. Nolla's method underestimated age by -0.13 ± 0.80, -0.30 ± 0.82 and -0.20 ± 0.81 years in boys, girls and the total sample, respectively. Häävikko's method underestimated age by -0.17 ± 0.80, -0.29 ± 0.83 and -0.22 ± 0.82 years in boys, girls and the total sample, respectively. Statistically significant differences were observed between dental and chronological ages with all methods (p < 0.001). Significant gender-based differences were observed with all methods except Willems' (p < 0.05). Gender-specific regression formulae were derived for all methods. Willems' method most accurately estimated age, followed by Demirjian's, Nolla's and Häävikko's methods. All four methods could be applicable for estimating age in the present population, mean prediction errors being lower than 0.30 years (3.6 months). 
Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Wang, Hui; Liu, Tao; Qiu, Quan; Ding, Peng; He, Yan-Hui; Chen, Wei-Qing
2015-01-23
This study aimed to develop and validate a simple risk score for detecting individuals with impaired fasting glucose (IFG) among the Southern Chinese population. A sample of participants aged ≥20 years and without known diabetes from the 2006-2007 Guangzhou diabetes cross-sectional survey was used to develop separate risk scores for men and women. The participants completed a self-administered structured questionnaire and underwent simple clinical measurements. The risk scores were developed by multiple logistic regression analysis. External validation was performed based on three other studies: the 2007 Zhuhai rural population-based study, the 2008-2010 Guangzhou diabetes cross-sectional study and the 2007 Tibet population-based study. Performance of the scores was measured with the Hosmer-Lemeshow goodness-of-fit test and ROC c-statistic. Age, waist circumference, body mass index and family history of diabetes were included in the risk score for both men and women, with the additional factor of hypertension for men. The ROC c-statistic was 0.70 for both men and women in the derivation samples. Risk scores of ≥28 for men and ≥18 for women showed respective sensitivity, specificity, positive predictive value and negative predictive value of 56.6%, 71.7%, 13.0% and 96.0% for men and 68.7%, 60.2%, 11% and 96.0% for women in the derivation population. The scores performed comparably with the Zhuhai rural sample and the 2008-2010 Guangzhou urban samples but poorly in the Tibet sample. The performance of pre-existing USA, Shanghai, and Chengdu risk scores was poorer in our population than in their original study populations. The results suggest that the developed simple IFG risk scores can be generalized in Guangzhou city and nearby rural regions and may help primary health care workers to identify individuals with IFG in their practice. PMID:25625405
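Risk scores of this kind are typically built by rescaling logistic-regression coefficients to integer points; a hedged sketch of one common heuristic, using invented betas for the predictors named above (not the published score):

```python
def risk_points(betas, base=5):
    """Rescale logistic-regression coefficients to integer score points,
    relative to the smallest nonzero beta (a common points-system heuristic)."""
    ref = min(abs(b) for b in betas.values() if b != 0)
    return {name: round(b / ref * base) for name, b in betas.items()}

# Invented betas for the predictors named in the abstract (men's model)
betas = {"age_per_10y": 0.35, "waist_per_5cm": 0.20,
         "bmi_per_unit": 0.10, "family_history": 0.60, "hypertension": 0.45}
points = risk_points(betas)
# A respondent's score is the sum of the points for their risk factors
score = points["family_history"] + points["waist_per_5cm"]
```

The score threshold (e.g. ≥28 for men in the study) is then chosen on a ROC curve to trade sensitivity against specificity.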
Leyrat, Clémence; Caille, Agnès; Foucher, Yohann; Giraudeau, Bruno
2016-01-22
Despite randomization, baseline imbalance and confounding bias may occur in cluster randomized trials (CRTs). Covariate imbalance may jeopardize the validity of statistical inferences if it occurs on prognostic factors. Thus, diagnosing such imbalance is essential for deciding whether the statistical analysis requires adjustment. We developed a tool based on the c-statistic of the propensity score (PS) model to detect global baseline covariate imbalance in CRTs and assess the risk of confounding bias. We performed a simulation study to assess the performance of the proposed tool and applied the method to the data from 2 published CRTs. The proposed method performed well for large sample sizes (n = 500 per arm) and when the number of unbalanced covariates was not too small relative to the total number of baseline covariates (≥40% of covariates unbalanced). We also provide a strategy for preselecting the covariates to include in the PS model to enhance imbalance detection. The proposed tool could be useful in deciding whether covariate adjustment is required before performing statistical analyses of CRTs.
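The diagnostic reduces to fitting a propensity score model for arm membership and computing its c-statistic; a self-contained sketch on simulated data (a plain gradient-ascent logistic fit stands in for a standard GLM routine) of how a c-statistic well above 0.5 flags imbalance:

```python
import numpy as np

def fit_propensity(X, arm, iters=300, lr=0.1):
    """Plain gradient-ascent logistic regression of arm on covariates
    (sketch only; a real analysis would use a standard GLM routine)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(X1.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X1 @ w))
        w += lr * X1.T @ (arm - p) / len(arm)
    return 1.0 / (1.0 + np.exp(-X1 @ w))  # fitted propensity scores

def c_statistic(arm, score):
    """Concordance of the PS model: P(score_treated > score_control)."""
    s1, s0 = score[arm == 1], score[arm == 0]
    gt = (s1[:, None] > s0[None, :]).sum()
    eq = (s1[:, None] == s0[None, :]).sum()
    return (gt + 0.5 * eq) / (len(s1) * len(s0))

rng = np.random.default_rng(0)
n = 500  # per arm, the sample size at which the tool performed well
covs = np.column_stack([
    np.concatenate([rng.normal(0.0, 1, n), rng.normal(0.5, 1, n)]),  # unbalanced
    rng.normal(0.0, 1, 2 * n),                                       # balanced
])
arm = np.concatenate([np.zeros(n), np.ones(n)])
c = c_statistic(arm, fit_propensity(covs, arm))
# c well above 0.5 flags global baseline covariate imbalance
```

Under perfect balance the fitted c-statistic hovers near 0.5, since the covariates carry no information about arm membership.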
NASA Astrophysics Data System (ADS)
Artrith, Nongnuch; Urban, Alexander; Ceder, Gerbrand
2018-06-01
The atomistic modeling of amorphous materials requires structure sizes and sampling statistics that are challenging to achieve with first-principles methods. Here, we propose a methodology to speed up the sampling of amorphous and disordered materials using a combination of a genetic algorithm and a specialized machine-learning potential based on artificial neural networks (ANNs). We show for the example of the amorphous LiSi alloy that around 1000 first-principles calculations are sufficient for the ANN-potential assisted sampling of low-energy atomic configurations in the entire amorphous LixSi phase space. The obtained phase diagram is validated by comparison with the results from an extensive sampling of LixSi configurations using molecular dynamics simulations and a general ANN potential trained to ~45,000 first-principles calculations. This demonstrates the utility of the approach for the first-principles modeling of amorphous materials.
A novel multi-target regression framework for time-series prediction of drug efficacy.
Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin
2017-01-18
Extracting knowledge from small samples is a challenging pharmacokinetic problem to which statistical methods can be applied. Pharmacokinetic data are distinctive in combining small samples with high dimensionality, which makes it difficult for conventional methods to predict the efficacy of a traditional Chinese medicine (TCM) prescription. The main purpose of our study is to characterize the correlations within a TCM prescription. Here we propose a novel method, a multi-target regression framework, for the efficacy-prediction problem. We exploit the correlation between the values of different time sequences and add the predictive targets of previous time points as features for predicting the value at the current time. Several experiments were conducted to test the validity of our method, and the results of leave-one-out cross-validation clearly demonstrate the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework shows the best performance and appears to be the most suitable for this task. PMID:28098186
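The framework's central device, reusing the targets at time t-1 as extra features at time t, can be sketched on toy data as follows (ordinary least squares stands in for the SVR the study found best; all arrays are invented):

```python
import numpy as np

def add_lagged_targets(X, Y):
    """Augment the features at time t with the target values at t-1
    (the framework's key device); the first time point is dropped."""
    return np.column_stack([X[1:], Y[:-1]]), Y[1:]

# Toy series: 8 time points, 2 raw features, 2 efficacy targets
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))
Y = np.cumsum(rng.normal(size=(8, 2)), axis=0)  # autocorrelated targets

Xa, Ya = add_lagged_targets(X, Y)
# One linear model per target, fitted jointly via least squares
D = np.column_stack([np.ones(len(Xa)), Xa])     # add intercept column
coef, *_ = np.linalg.lstsq(D, Ya, rcond=None)
pred = D @ coef
```

Because efficacy curves are autocorrelated, the lagged targets typically carry more signal than the raw features, which is what the framework exploits.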
Cabrera-Pivaral, Carlos E; Gutiérrez-Ruvalcaba, Clara Luz; Peralta-Heredia, Irma Concepción; Alonso-Reynoso, Carlos
2008-01-01
The purpose of this work was to measure family physicians' clinical aptitude for the diagnosis and treatment of metabolic syndrome in a representative sample from six Family Medicine Units (UMF) at the Mexican Institute for Social Security (IMSS), in Guadalajara, Jalisco, México. This is a cross-sectional study. A validated and structured instrument was used, with a confidence coefficient (Kuder-Richardson) of 0.95, that was applied to a representative sample of 90 family physicians throughout six UMFs in Guadalajara, between 2003 and 2004. Mann-Whitney's U and Kruskal-Wallis' tests were used to compare two or more groups, and the Perez-Viniegra Test was used to define aptitude development levels. No statistically significant differences were found in aptitude development between the six family medicine units groups and other comparative groups. The generally low level of clinical aptitude, and its indicators, reflects limitations on the part of family physicians at the IMSS in Jalisco to identify and manage metabolic syndrome.
Organizing the Confusion Surrounding Workaholism: New Structure, Measure, and Validation
Shkoler, Or; Rabenu, Edna; Vasiliu, Cristinel; Sharoni, Gil; Tziner, Aharon
2017-01-01
Since “workaholism” was coined, a considerable body of research has been conducted to shed light on its essence. After at least 40 years of studying this important phenomenon, a large variety of definitions, conceptualizations, and measures has emerged. To bring more integration and consensus to this construct, the current research was conducted in two phases. We aimed to formulate a theoretical definitional framework for workaholism, capitalizing upon the Facet Theory approach. Two basic facets were hypothesized: A. modalities of workaholism, with three elements: cognitive, emotional, and instrumental; and B. resources of workaholism, with two elements: time and effort. Based on this definitional framework, a structured questionnaire was devised. In the first phase, the new measure was validated with an Israeli sample, comparing two statistical procedures: factor analysis (FA) and smallest space analysis (SSA). In the second phase, we aimed to replicate the findings and to contrast the newly devised questionnaire with other extant workaholism measures, using a Romanian sample. Theoretical implications and future research suggestions are discussed. PMID:29097989
The development and testing of a qualitative instrument designed to assess critical thinking
NASA Astrophysics Data System (ADS)
Clauson, Cynthia Louisa
This study examined a qualitative approach to assess critical thinking. An instrument was developed that incorporates an assessment process based on Dewey's (1933) concepts of self-reflection and critical thinking as problem solving. The study was designed to pilot test the critical thinking assessment process with writing samples collected from a heterogeneous group of students. The pilot test included two phases. Phase 1 was designed to determine the validity and inter-rater reliability of the instrument using two experts in critical thinking, problem solving, and literacy development. Validity of the instrument was addressed by requesting both experts to respond to ten questions in an interview. The inter-rater reliability was assessed by analyzing the consistency of the two experts' scorings of the 20 writing samples to each other, as well as to my scoring of the same 20 writing samples. Statistical analyses included the Spearman Rho and the Kuder-Richardson (Formula 20). Phase 2 was designed to determine the validity and reliability of the critical thinking assessment process with seven science teachers. Validity was addressed by requesting the teachers to respond to ten questions in a survey and interview. Inter-rater reliability was addressed by comparing the seven teachers' scoring of five writing samples with my scoring of the same five writing samples. Again, the Spearman Rho and the Kuder-Richardson (Formula 20) were used to determine the inter-rater reliability. The validity results suggest that the instrument is helpful as a guide for instruction and provides a systematic method to teach and assess critical thinking while problem solving with students in the classroom. The reliability results show the critical thinking assessment instrument to possess fairly high reliability when used by the experts, but weak reliability when used by classroom teachers. 
A major conclusion was drawn that teachers, as well as students, would need to receive instruction in critical thinking and in how to use the assessment process in order to gain more consistent interpretations of the six problem-solving steps. Specific changes needing to be made in the instrument to improve the quality are included.
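The two reliability statistics used in both phases above can be computed directly; a sketch with hypothetical rater scores for Spearman's rho, and a toy dichotomous item matrix for KR-20 (neither is the study's data):

```python
import numpy as np

def kr20(items):
    """Kuder-Richardson Formula 20 for a (respondents x items) 0/1 matrix."""
    k = items.shape[1]
    p = items.mean(axis=0)  # per-item proportion answering 1
    return (k / (k - 1)) * (1 - (p * (1 - p)).sum() / items.sum(axis=1).var())

def spearman_rho(a, b):
    """Spearman rank correlation (this sketch assumes no tied scores)."""
    rank = lambda v: np.argsort(np.argsort(np.asarray(v)))
    d = rank(a) - rank(b)
    n = len(d)
    return 1 - 6 * (d ** 2).sum() / (n * (n ** 2 - 1))

# Hypothetical scorings of six writing samples by two raters
rho = spearman_rho([6, 4, 5, 1, 3, 2], [5, 4, 6, 2, 3, 1])
# Toy dichotomous item matrix (4 respondents x 3 items)
reliability = kr20(np.array([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]))
```

High rho between two raters' rankings of the same writing samples is what "fairly high inter-rater reliability" means operationally here.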
Bujold, M; El Sherif, R; Bush, P L; Johnson-Lafleur, J; Doray, G; Pluye, P
2018-02-01
This mixed methods study content validated the Information Assessment Method for parents (IAM-parent) that allows users to systematically rate and comment on online parenting information. Quantitative data and results: 22,407 IAM ratings were collected; of the initial 32 items, descriptive statistics showed that 10 had low relevance. Qualitative data and results: IAM-based comments were collected, and 20 IAM users were interviewed (maximum variation sample); the qualitative data analysis assessed the representativeness of IAM items, and identified items with problematic wording. Researchers, the program director, and Web editors integrated quantitative and qualitative results, which led to a shorter and clearer IAM-parent. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.
Data survey on the effect of product features on competitive advantage of selected firms in Nigeria.
Olokundun, Maxwell; Iyiola, Oladele; Ibidunni, Stephen; Falola, Hezekiah; Salau, Odunayo; Amaihian, Augusta; Peter, Fred; Borishade, Taiye
2018-06-01
The main objective of this study was to present a data article that investigates the effect of product features on firms' competitive advantage. Few studies have examined how the features of a product could help drive the competitive advantage of a firm. A descriptive research method was used. The Statistical Package for the Social Sciences (SPSS 22) was used to analyze one hundred and fifty (150) valid questionnaires completed by small business owners registered under the Small and Medium Enterprises Development Agency of Nigeria (SMEDAN). Stratified and simple random sampling techniques were employed; reliability and validity procedures were also confirmed. The field data set is made publicly available to enable critical or extended analysis.
Validation of the Clinical Learning Environment Inventory.
Chan, Dominic S
2003-08-01
One hundred eight preregistration nursing students took part in this survey study, which assessed their perceptions of the clinical learning environment. Statistical data based on the sample confirmed the reliability and validity of the Clinical Learning Environment Inventory (CLEI), which was developed using the concept of classroom learning environment studies. The study also found that there were significant differences between students' actual and preferred perceptions of the clinical learning environments. In terms of the CLEI scales, students preferred a more positive and favorable clinical environment than they perceived as being actually present. The achievement of certain outcomes of clinical field placements might be enhanced by attempting to change the actual clinical environment in ways that make it more congruent with that preferred by the students.
Collins, Anne; Ross, Janine
2017-01-01
We performed a systematic review to identify all original publications describing the asymmetric inheritance of cellular organelles in normal animal eukaryotic cells and to critique the validity and imprecision of the evidence. Searches were performed in Embase, MEDLINE and PubMed up to November 2015. Screening of titles, abstracts and full papers was performed by two independent reviewers. Data extraction and validity assessment were performed by one reviewer and checked by a second reviewer. Study quality was assessed using the SYRCLE risk of bias tool for animal studies, and by developing validity tools for the experimental model, organelle markers and imprecision. A narrative data synthesis was performed. We identified 31 studies (34 publications) of the asymmetric inheritance of organelles after mitotic or meiotic division. Studies of the asymmetric inheritance of centrosomes (n = 9), endosomes (n = 6), P granules (n = 4), the midbody (n = 3), mitochondria (n = 3), proteasomes (n = 2), spectrosomes (n = 2), cilia (n = 2) and endoplasmic reticulum (n = 2) were identified. Asymmetry was defined and quantified by variable methods. Assessment of the statistical reliability of the results indicated that only two studies (7%) were judged to be of low concern; the majority of studies (77%) were 'unclear' and five (16%) were judged to be of 'high concern', the main reason being low technical repeats (<10). Assessment of model validity indicated that the majority of studies (61%) were judged to be valid, ten studies (32%) were unclear and two studies (7%) were judged to be of 'high concern'; both described 'stem cells' without providing experimental evidence to confirm this (pluripotency and self-renewal). Assessment of marker validity indicated that no studies had low concern; most studies were unclear (96.5%), indicating there were insufficient details to judge whether the markers were appropriate. 
One study had high concern for marker validity due to the contradictory results of two markers for the same organelle. For most studies the validity and imprecision of results could not be confirmed. In particular, data were limited due to a lack of reporting of interassay variability, sample size calculations, controls and functional validation of organelle markers. An evaluation of 16 systematic reviews containing cell assays found that only 50% reported adherence to PRISMA or ARRIVE reporting guidelines and 38% reported a formal risk of bias assessment. 44% of the reviews did not consider how relevant or valid the models were to the research question. 75% reviews did not consider how valid the markers were. 69% of reviews did not consider the impact of the statistical reliability of the results. Future systematic reviews in basic or preclinical research should ensure the rigorous reporting of the statistical reliability of the results in addition to the validity of the methods. Increased awareness of the importance of reporting guidelines and validation tools is needed for the scientific community. PMID:28562636
Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B
2013-03-23
Mass spectrometry (MS) has evolved to become the primary high-throughput tool for proteomics-based biomarker discovery. Multiple challenges in protein MS data analysis remain: management of large-scale, complex data sets; MS peak identification and indexing; and high-dimensional differential peak analysis with false discovery rate (FDR) control for the concurrent statistical tests. "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets and identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which gives experimental biologists easy access to "cloud" computing capabilities for analyzing MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The presented web application supports large-scale online uploading and analysis of MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
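For the FDR step mentioned above, the Benjamini-Hochberg procedure is the standard choice for controlling the false discovery rate across many concurrent tests; a minimal sketch with hypothetical per-peak p-values:

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure: returns the indices of results
    declared significant while controlling the false discovery rate at q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:   # compare to the step-up threshold
            cutoff = rank
    return sorted(order[:cutoff])

# Hypothetical per-peak p-values from a differential analysis
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(p, q=0.05)
```

Note the step-up logic: a peak is kept if any p-value at its rank or higher clears its threshold, which is why the whole sorted list must be scanned before truncating.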
Tomlinson, Alan; Hair, Mario; McFadyen, Angus
2013-10-01
Dry eye is a multifactorial disease that requires a broad spectrum of test measures for its diagnosis and for monitoring its treatment. However, studies have typically reported improvements in individual measures with treatment. Alternative approaches involve multiple, combined outcomes assessed by different statistical analyses. To assess the effect of various statistical approaches to the use of single and combined test measures in dry eye, this review reanalyzed measures from two previous studies (osmolarity, evaporation, tear turnover rate, and lipid film quality). These analyses assessed the measures as single variables within groups, pre- and post-intervention with a lubricant supplement, by creating combinations of these variables, and by validating these combinations against the combined sample of data from all groups of dry eye subjects. The effectiveness of single measures and combinations in the diagnosis of dry eye was also considered. Copyright © 2013. Published by Elsevier Inc.
Laube, Norbert; Zimmermann, Diana J
2004-01-01
This study was performed to quantify the effect of a 1-week freezer storage of urine on its calcium oxalate crystallization risk. Calcium oxalate is the most common urinary stone material observed in urolithiasis patients in western and affluent countries. The BONN-Risk-Index of calcium oxalate crystallization risk in human urine is determined from a crystallization experiment performed on untreated native urine samples. We tested the influence of 1 week of freezing on the BONN-Risk-Index value, as well as the effect of sample freezing on urinary osmolality. In vitro crystallization experiments in 49 native urine samples from stone-forming and non-stone-forming individuals were performed to determine their calcium oxalate crystallization risk according to the BONN-Risk-Index approach. Statistical comparison of the results derived from the original sample investigations with those obtained from the thawed aliquots shows that (i) there is no significant deviation from linearity between the two sets of results and (ii) the two sets of results are statistically identical. This holds for both the BONN-Risk-Index and the osmolality data. The differences between the BONN-Risk-Index results of the two determination procedures, however, exceed the clinically acceptable difference. Thus, determination of the urinary calcium oxalate crystallization risk from thawed urine samples cannot be recommended.
Structural parameters of young star clusters: fractal analysis
NASA Astrophysics Data System (ADS)
Hetem, A.
2017-07-01
A unified view of star formation in the Universe demands detailed and in-depth studies of young star clusters. This work extends our previous study of fractal statistics estimated for a sample of young stellar clusters (Gregorio-Hetem et al. 2015, MNRAS 448, 2504). The structural properties can lead to significant conclusions about the early stages of cluster formation: 1) virial conditions can be used to distinguish warm collapse; 2) bound or unbound behaviour can lead to conclusions about expansion; and 3) fractal statistics are correlated with dynamical evolution and age. The error-bar estimation technique most used in the literature is to adopt inferential methods (such as bootstrap) to estimate deviation and variance, which are valid only for an artificially generated cluster. In this paper, we expanded the number of studied clusters in order to enhance the investigation of cluster properties and dynamical evolution. The structural parameters were compared with fractal statistics and reveal that the clusters' radial density profiles show a tendency for the mean separation of the stars to increase with the average surface density. The sample can be divided into two groups showing different dynamic behaviour, but they share the same dynamical evolution, since the entire sample was revealed to consist of expanding objects for which the substructures do not seem to have been completely erased. These results are in agreement with simulations adopting low surface densities and supervirial conditions.
Holden, Libby; Lee, Christina; Hockey, Richard; Ware, Robert S; Dobson, Annette J
2014-12-01
This study aimed to validate a 6-item 1-factor global measure of social support developed from the Medical Outcomes Study Social Support Survey (MOS-SSS) for use in large epidemiological studies. Data were obtained from two large population-based samples of participants in the Australian Longitudinal Study on Women's Health. The two cohorts were aged 53-58 and 28-33 years at data collection (N = 10,616 and 8,977, respectively). Items selected for the 6-item 1-factor measure were derived from the factor structure obtained from unpublished work using an earlier wave of data from one of these cohorts. Descriptive statistics, including polychoric correlations, were used to describe the abbreviated scale. Cronbach's alpha was used to assess internal consistency and confirmatory factor analysis to assess scale validity. Concurrent validity was assessed using correlations between the new 6-item version and established 19-item version, and other concurrent variables. In both cohorts, the new 6-item 1-factor measure showed strong internal consistency and scale reliability. It had excellent goodness-of-fit indices, similar to those of the established 19-item measure. Both versions correlated similarly with concurrent measures. The 6-item 1-factor MOS-SSS measures global functional social support with fewer items than the established 19-item measure.
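Cronbach's alpha, the internal-consistency statistic used above, has a compact closed form; a minimal sketch on simulated single-factor data (illustrative only, not the ALSWH cohorts):

```python
import numpy as np

def cronbach_alpha(items):
    """items: (n_respondents, n_items) array of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated 6-item scale driven by a single latent factor
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
items = latent + 0.5 * rng.normal(size=(500, 6))
```

Items that all track one latent factor yield an alpha near 1, while unrelated items yield an alpha near 0, which is the sense in which the 6-item measure "showed strong internal consistency."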
Landsgesell, Jonas; Holm, Christian; Smiatek, Jens
2017-02-14
We present a novel method for the study of weak polyelectrolytes and general acid-base reactions in molecular dynamics and Monte Carlo simulations. The approach combines the advantages of the reaction ensemble and the Wang-Landau sampling method. Deprotonation and protonation reactions are simulated explicitly with the help of the reaction ensemble method, while the accurate sampling of the corresponding phase space is achieved by the Wang-Landau approach. The combination of both techniques provides a sufficient statistical accuracy such that meaningful estimates for the density of states and the partition sum can be obtained. With regard to these estimates, several thermodynamic observables like the heat capacity or reaction free energies can be calculated. We demonstrate that the computation times for the calculation of titration curves with a high statistical accuracy can be significantly decreased when compared to the original reaction ensemble method. The applicability of our approach is validated by the study of weak polyelectrolytes and their thermodynamic properties.
Willems, Sander; Fraiture, Marie-Alice; Deforce, Dieter; De Keersmaecker, Sigrid C J; De Loose, Marc; Ruttink, Tom; Herman, Philippe; Van Nieuwerburgh, Filip; Roosens, Nancy
2016-02-01
Because the number and diversity of genetically modified (GM) crops have significantly increased, their analysis based on real-time PCR (qPCR) methods is becoming increasingly complex and laborious. While several pioneers have already investigated Next Generation Sequencing (NGS) as an alternative to qPCR, its practical use has not been assessed for routine analysis. In this study, a statistical framework was developed to predict the number of NGS reads needed to detect transgene sequences, to prove their integration into the host genome and to identify the specific transgene event in a sample with known composition. This framework was validated by applying it to experimental data from food matrices composed of pure GM rice, processed GM rice (noodles) or a 10% GM/non-GM rice mixture, revealing some influential factors. Finally, the feasibility of NGS for routine analysis of GM crops was investigated by applying the framework to samples commonly encountered in routine analysis of GM crops. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
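The core of any such read-count prediction is a simple coverage calculation: if a fraction p of all sequenced reads is expected to originate from the transgene, the number of reads N needed to observe at least one of them at a given confidence follows from the binomial. A minimal sketch (the example fraction is illustrative, not a figure from the study):

```python
import math

def detection_prob(n_reads, target_fraction):
    """P(at least one read hits the target), assuming reads land
    independently on the target with probability target_fraction."""
    return 1.0 - (1.0 - target_fraction) ** n_reads

def reads_needed(target_fraction, confidence=0.99):
    """Smallest N such that detection_prob(N, target_fraction) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - target_fraction))

# e.g. a transgene contributing 0.01% of the sequenced material (illustrative)
n = reads_needed(1e-4, confidence=0.99)
```

Dilution (a 10% GM mixture) and processing losses both shrink the target fraction p, which is why they appear as influential factors: the required read count scales roughly as 1/p.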
ERIC Educational Resources Information Center
Hassad, Rossi A.
2009-01-01
This study examined the teaching practices of 227 college instructors of introductory statistics (from the health and behavioral sciences). Using primarily multidimensional scaling (MDS) techniques, a two-dimensional, 10-item teaching-practice scale, the TISS (Teaching of Introductory Statistics Scale), was developed and validated. The two dimensions…
Ho, Lindsey A; Lange, Ethan M
2010-12-01
Genome-wide association (GWA) studies are a powerful approach for identifying novel genetic risk factors associated with human disease. A GWA study typically requires the inclusion of thousands of samples to have sufficient statistical power to detect single nucleotide polymorphisms that are associated with only modest increases in risk of disease, given the heavy burden of the multiple-test correction that is necessary to maintain valid statistical tests. Low statistical power and the high financial cost of performing a GWA study remain prohibitive for many scientific investigators eager to perform such a study using their own samples. A number of remedies have been suggested to increase statistical power and decrease cost, including the utilization of free publicly available genotype data and multi-stage genotyping designs. Herein, we compare the statistical power and relative costs of alternative association study designs that use cases and screened controls to study designs that are based only on, or additionally include, free public control genotype data. We describe a novel replication-based two-stage study design, which uses free public control genotype data in the first stage and follow-up genotype data on case-matched controls in the second stage, preserving many of the advantages inherent when using only an epidemiologically matched set of controls. Specifically, we show that our proposed two-stage design can substantially increase statistical power and decrease the cost of performing a GWA study while controlling the type-I error rate, which can be inflated when using public controls due to differences in ancestry and batch genotype effects.
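The power trade-offs above can be explored with a standard two-proportion z-test approximation for a case-control allele-frequency comparison; this is a generic textbook sketch, not the authors' method, and the frequencies and sample sizes below are illustrative:

```python
from math import sqrt
from statistics import NormalDist

def cc_power(p_case, p_ctrl, n_case, n_ctrl, alpha):
    """Approximate power of a two-sided two-proportion z-test comparing
    allele frequencies between n_case cases and n_ctrl controls."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)
    pbar = (n_case * p_case + n_ctrl * p_ctrl) / (n_case + n_ctrl)
    se0 = sqrt(pbar * (1 - pbar) * (1 / n_case + 1 / n_ctrl))  # null SE
    se1 = sqrt(p_case * (1 - p_case) / n_case
               + p_ctrl * (1 - p_ctrl) / n_ctrl)               # alternative SE
    delta = abs(p_case - p_ctrl)
    return (1 - nd.cdf((z_a * se0 - delta) / se1)
            + nd.cdf((-z_a * se0 - delta) / se1))

# Genome-wide threshold vs nominal threshold (illustrative values)
strict = cc_power(0.30, 0.25, 2000, 2000, 5e-8)
loose = cc_power(0.30, 0.25, 2000, 2000, 0.05)
```

Comparing `strict` and `loose` shows why a relaxed first-stage screen (e.g. against cheap public controls) followed by a stringent second stage can recover power at lower genotyping cost.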
Silber, J H; Fridman, M; DiPaola, R S; Erder, M H; Pauly, M V; Fox, K R
1998-07-01
If patients could be ranked according to their projected need for supportive care therapy, then more efficient and less costly treatment algorithms might be developed. This work reports on the construction of a model of neutropenia, dose reduction, or delay that rank-orders patients according to their need for costly supportive care such as granulocyte growth factors. A case series and consecutive sample of patients treated for breast cancer were studied. Patients had received standard-dose adjuvant chemotherapy for early-stage nonmetastatic breast cancer and were treated by four medical oncologists. Models were developed using 95 patients and validated with 80 additional patients to predict one or more of the following events: neutropenia (absolute neutrophil count [ANC] < or = 250/microL), dose reduction > or = 15% of that scheduled, or treatment delay > or = 7 days. Two approaches to modeling were attempted. The pretreatment approach used only pretreatment predictors such as chemotherapy regimen and radiation history; the conditional approach additionally included blood count information obtained in the first cycle of treatment. The pretreatment model was unsuccessful at predicting neutropenia, dose reduction, or delay (c-statistic = 0.63). Conditional models were good predictors of subsequent events after cycle 1 (c-statistic = 0.87 and 0.78 for development and validation samples, respectively). The depth of the first-cycle ANC was an excellent predictor of events in subsequent cycles (P = .0001 to .004). Chemotherapy plus radiation also increased the risk of subsequent events (P = .0011 to .0901). Decline in hemoglobin (HGB) level during the first cycle of therapy was a significant predictor of events in the development study (P = .0074 and .0015); although the trend was similar in the validation study, HGB decline failed to reach statistical significance.
It is possible to rank patients according to their need of supportive care based on blood counts observed in the first cycle of therapy. Such rankings may aid in the choice of appropriate supportive care for patients with early-stage breast cancer.
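The c-statistic reported above is the probability that a randomly chosen patient who had the event received a higher predicted risk than a randomly chosen patient who did not (equivalent to the ROC AUC); a minimal sketch with illustrative scores:

```python
def c_statistic(scores, events):
    """Concordance: P(score of an event case > score of a non-event case),
    with ties counted as 1/2."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 is chance-level discrimination, which is why the pretreatment model's 0.63 is called unsuccessful while the conditional models' 0.87 and 0.78 count as good.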
Accounting for selection bias in association studies with complex survey data.
Wirth, Kathleen E; Tchetgen Tchetgen, Eric J
2014-05-01
Obtaining representative information from hidden and hard-to-reach populations is fundamental to describe the epidemiology of many sexually transmitted diseases, including HIV. Unfortunately, simple random sampling is impractical in these settings, as no registry of names exists from which to sample the population at random. However, complex sampling designs can be used, as members of these populations tend to congregate at known locations, which can be enumerated and sampled at random. For example, female sex workers may be found at brothels and street corners, whereas injection drug users often come together at shooting galleries. Despite the logistical appeal, complex sampling schemes lead to unequal probabilities of selection, and failure to account for this differential selection can result in biased estimates of population averages and relative risks. However, standard techniques to account for selection can lead to substantial losses in efficiency. Consequently, researchers implement a variety of strategies in an effort to balance validity and efficiency. Some researchers fully or partially account for the survey design, whereas others do nothing and treat the sample as a realization of the population of interest. We use directed acyclic graphs to show how certain survey sampling designs, combined with subject-matter considerations unique to individual exposure-outcome associations, can induce selection bias. Finally, we present a novel yet simple maximum likelihood approach for analyzing complex survey data; this approach optimizes statistical efficiency at no cost to validity. We use simulated data to illustrate this method and compare it with other analytic techniques.
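Accounting for unequal probabilities of selection typically means weighting each respondent by the inverse of their selection probability; a minimal Horvitz-Thompson-style sketch (the two strata and their sampling rates are illustrative, not from the study):

```python
def ipw_mean(values, selection_probs):
    """Estimate a population mean from a sample drawn with known,
    unequal selection probabilities (inverse-probability weighting)."""
    weights = [1.0 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# 10 respondents from a venue sampled with prob 0.5, outcome present (1),
# 10 respondents from a venue sampled with prob 0.1, outcome absent (0).
values = [1] * 10 + [0] * 10
probs = [0.5] * 10 + [0.1] * 10
```

The unweighted sample mean (0.5) badly overstates the outcome prevalence because the high-prevalence venue was oversampled; the weighted estimate recovers the population value of 20/120, since the two strata represent 20 and 100 people respectively. The efficiency cost of large weights is the problem the abstract's maximum likelihood approach targets.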
Thrasher, James F.; Reid, Jessica L.; Hammond, David
2016-01-01
Introduction: Studies examining cigarette package pictorial health warning label (HWL) content have primarily used designs that do not allow determination of effectiveness after repeated, naturalistic exposure. This research aimed to determine the predictive and external validity of a pre-market evaluation study of pictorial HWLs. Methods: Data were analyzed from: (1) a pre-market convenience sample of 544 adult smokers who participated in field experiments in Mexico City before pictorial HWL implementation (September 2010); and (2) a post-market population-based representative sample of 1765 adult smokers in the Mexican administration of the International Tobacco Control Policy Evaluation Survey after pictorial HWL implementation. Participants in both samples rated six HWLs that appeared on cigarette packs, and also ranked HWLs with four different themes. Mixed effects models were estimated for each sample to assess ratings of relative effectiveness for the six HWLs, and to assess which HWL themes were ranked as the most effective. Results: Pre- and post-market data showed similar relative ratings across the six HWLs, with the least and most effective HWLs consistently differentiated from other HWLs. Models predicting rankings of HWL themes in the post-market sample indicated: (1) pictorial HWLs were ranked as more effective than text-only HWLs; (2) HWLs with both graphic and “lived experience” content outperformed symbolic content; and, (3) testimonial content significantly outperformed didactic content. Pre-market data showed a similar pattern of results, but with fewer statistically significant findings. Conclusions: The study suggests well-designed pre-market studies can have predictive and external validity, helping regulators select HWL content. PMID:26377516
Städler, Thomas; Haubold, Bernhard; Merino, Carlos; Stephan, Wolfgang; Pfaffelhuber, Peter
2009-01-01
Using coalescent simulations, we study the impact of three different sampling schemes on patterns of neutral diversity in structured populations. Specifically, we are interested in two summary statistics based on the site frequency spectrum as a function of migration rate, demographic history of the entire substructured population (including timing and magnitude of specieswide expansions), and the sampling scheme. Using simulations implementing both finite-island and two-dimensional stepping-stone spatial structure, we demonstrate strong effects of the sampling scheme on Tajima's D (DT) and Fu and Li's D (DFL) statistics, particularly under specieswide (range) expansions. Pooled samples yield average DT and DFL values that are generally intermediate between those of local and scattered samples. Local samples (and to a lesser extent, pooled samples) are influenced by local, rapid coalescence events in the underlying coalescent process. These processes result in lower proportions of external branch lengths and hence lower proportions of singletons, explaining our finding that the sampling scheme affects DFL more than it does DT. Under specieswide expansion scenarios, these effects of spatial sampling may persist up to very high levels of gene flow (Nm > 25), implying that local samples cannot be regarded as being drawn from a panmictic population. Importantly, many data sets on humans, Drosophila, and plants contain signatures of specieswide expansions and effects of sampling scheme that are predicted by our simulation results. This suggests that validating the assumption of panmixia is crucial if robust demographic inferences are to be made from local or pooled samples. However, future studies should consider adopting a framework that explicitly accounts for the genealogical effects of population subdivision and empirical sampling schemes. PMID:19237689
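Tajima's D, one of the SFS-based summary statistics studied above, contrasts the mean pairwise difference with Watterson's estimator; a self-contained sketch using the standard Tajima (1989) normalizing constants (the example SFS vectors are illustrative):

```python
import math

def tajimas_d(sfs, n):
    """sfs[i] = number of sites whose derived allele appears i+1 times
    (unfolded site frequency spectrum) in a sample of n sequences."""
    S = sum(sfs)                                  # segregating sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    theta_w = S / a1                              # Watterson's estimator
    # mean pairwise difference computed from the SFS
    pi = sum(c * (i + 1) * (n - (i + 1)) for i, c in enumerate(sfs)) * 2.0 / (n * (n - 1))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - theta_w) / math.sqrt(e1 * S + e2 * S * (S - 1))
```

The abstract's mechanism is visible here: sampling schemes that inflate the proportion of singletons lower pi relative to theta_w and push D negative, mimicking the signature of an expansion.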
Kupek, Emil; de Assis, Maria Alice A
2016-09-01
External validation of food recall over 24 h in schoolchildren is often restricted to eating events in schools and is based on direct observation as the reference method. The aim of this study was to estimate the dietary intake out of school, and consequently the bias in such a research design based on only part-time validated food recall, using multiple imputation (MI) conditioned on information on child age, sex, BMI, family income, parental education and the school attended. The previous-day, web-based questionnaire WebCAAFE, structured around six meals/snacks and thirty-two foods/beverages, was answered by a sample of 7-11-year-old Brazilian schoolchildren (n = 602) from five public schools. Food/beverage intake recalled by children was compared with the records provided by trained observers during school meals. Sensitivity analysis was performed with artificial data emulating those recalled by children on WebCAAFE in order to evaluate the impact of both differential and non-differential bias. Estimated bias was within a ±30% interval for 84.4% of the thirty-two foods/beverages evaluated in WebCAAFE, and half of the latter reached statistical significance (P<0.05). Rarely (<3%) consumed dietary items were often under-reported (fish/seafood, vegetable soup, cheese bread, French fries), whereas some of those most frequently reported (meat, bread/biscuits, fruits) showed large overestimation. Compared with the analysis restricted to fully validated data, MI reduced differential bias in sensitivity analysis, but the bias still remained large in most cases. MI provided a suitable statistical framework for the part-time validation design of dietary intake over six daily eating events.
Measurement issues in research on social support and health.
Dean, K; Holst, E; Kreiner, S; Schoenborn, C; Wilson, R
1994-01-01
STUDY OBJECTIVE--The aims were: (1) to identify methodological problems that may explain the inconsistencies and contradictions in the research evidence on social support and health, and (2) to validate a frequently used measure of social support in order to determine whether or not it could be used in multivariate analyses of population data in research on social support and health. DESIGN AND METHODS--Secondary analysis of data collected in a cross sectional survey of a multistage cluster sample of the population of the United States, designed to study relationships among behavioural, social support, and health variables. Statistical models based on item response theory and graph theory were used to validate the measure of social support to be used in subsequent analyses. PARTICIPANTS--Data on 1755 men and women aged 20 to 64 years were available for the scale validation. RESULTS--Massive evidence of item bias was found for all items of a group membership subscale. The most serious problems were found in relation to an item measuring membership in work-related groups. Using that item in the social network scale in multivariate analyses would distort findings on the statistical effects of education, employment status, and household income. Evidence of item bias was also found for a sociability subscale. When marital status was included to create what is called an intimate contacts subscale, the confounding grew worse. CONCLUSIONS--The composite measure of social network is not valid and would seriously distort the findings of analyses attempting to study relationships between the index and other variables. The findings show that valid measurement is a methodological issue that must be addressed in scientific research on population health. PMID:8189179
Bernard, Marie-Agnès; Bénichou, Jacques; Blin, Patrick; Weill, Alain; Bégaud, Bernard; Abouelfath, Abdelilah; Moore, Nicholas; Fourrier-Réglat, Annie
2012-06-01
To determine healthcare claim patterns associated with the use of nonsteroidal anti-inflammatory drugs (NSAIDs) for rheumatoid arthritis (RA). The CADEUS study randomly identified NSAID users within the French health insurance database. One-year claims data were extracted, and NSAID indication was obtained from prescribers. Logistic regression was used in a development sample to identify claim patterns predictive of RA, and the models were applied to a validation sample. Analyses were stratified on the dispensation of immunosuppressive agents or specific antirheumatism treatment, and the area under the receiver operating characteristic curve was used to estimate discriminant power. NSAID indication was provided for 26,259 of the 45,217 patients included in the CADEUS cohort; it was RA for 956 patients. Two models were constructed using the development sample (n = 13,143), stratifying on the dispensation of an immunosuppressive agent or specific antirheumatism treatment. Discriminant power was high for both models (AUC > 0.80) and was not statistically different from that found when applied to the validation sample (n = 13,116). The models derived from this study may help to identify patients prescribed NSAIDs who are likely to have RA in claims databases lacking medical data such as treatment indication. Copyright © 2012 John Wiley & Sons, Ltd.
ICS-II USA research design and methodology.
Rana, H; Andersen, R M; Nakazono, T T; Davidson, P L
1997-05-01
The purpose of the WHO-sponsored International Collaborative Study of Oral Health Outcomes (ICS-II) was to provide policy-makers and researchers with detailed, reliable, and valid data on the oral health situation in their countries or regions, together with comparative data from other dental care delivery systems. ICS-II used a cross-sectional design with no explicit control groups or experimental interventions. A standardized methodology was developed and tested for collecting and analyzing epidemiological, sociocultural, economic, and delivery system data. Respondent information was obtained by household interviews, and clinical examinations were conducted by calibrated oral epidemiologists. Discussed are the sampling design characteristics for the USA research locations, response rates, sample sizes for interview and oral examination data, weighting procedures, and statistical methods. SUDAAN was used to adjust variance calculations, since complex sampling designs were used.
Wei, Yi; Gadaria-Rathod, Neha; Epstein, Seth; Asbell, Penny
2013-12-23
To provide standard operating procedures (SOPs) for measuring tear inflammatory cytokine concentrations and to validate the resulting profile as a minimally invasive objective metric and biomarker of ocular surface inflammation for use in multicenter clinical trials on dry eye disease (DED). Standard operating procedures were established and then validated with cytokine standards, quality controls, and masked tear samples collected from local and distant clinical sites. The concentrations of the inflammatory cytokines in tears were quantified using a high-sensitivity human cytokine multiplex kit. A panel of inflammatory cytokines was initially investigated, from which four key inflammatory cytokines (IL-1β, IL-6, IFN-γ, and TNF-α) were chosen. Results with cytokine standards statistically satisfied the manufacturer's quality control criteria. Results with pooled tear samples were highly reproducible and reliable with tear volumes ranging from 4 to 10 μL. Incorporation of the SOPs into clinical trials was subsequently validated. Tear samples were collected at a distant clinical site, stored, and shipped to our Biomarker Laboratory, where a masked analysis of the four tear cytokines was successfully performed. Tear samples were also collected from a feasibility study on DED. Inflammatory cytokine concentrations were decreased in tears of subjects who received anti-inflammatory treatment. Standard operating procedures for human tear cytokine assessment suitable for multicenter clinical trials were established. Tear cytokine profiling using these SOPs may provide objective metrics useful for diagnosing, classifying, and analyzing treatment efficacy in inflammatory conditions of the ocular surface, which may further elucidate the mechanisms involved in the pathogenesis of ocular surface disease.
Pupek, Alex; Matthewson, Beverly; Whitman, Erin; Fullarton, Rachel; Chen, Yu
2017-08-28
The pneumatic tube system (PTS) is commonly used in modern clinical laboratories to provide quick specimen delivery. However, its impact on sample integrity and laboratory testing results is still debatable. In addition, each PTS installation and configuration is unique to its institution. We sought to validate our Swisslog PTS by comparing routine chemistry, hematology, coagulation and blood gas test results and sample integrity indices between duplicate samples transported either manually or by PTS. Duplicate samples were delivered to the core laboratory manually by human courier or via the Swisslog PTS. Head-to-head comparisons of 48 routine chemistry, hematology, coagulation and blood gas laboratory tests, and three sample integrity indices were conducted on 41 healthy volunteers and 61 adult patients. The PTS showed no impact on sample hemolysis, lipemia, or icterus indices (all p>0.05). Although alkaline phosphatase, total bilirubin and hemoglobin reached statistical significance (p=0.009, 0.027 and 0.012, respectively), all had very low average bias, ranging from 0.01% to 2%. Potassium, total hemoglobin and percent deoxyhemoglobin were statistically significant for the neonatal capillary tube study (p=0.011, 0.033 and 0.041, respectively), but no biases greater than ±4% were identified for these parameters. None of the observed differences across these 48 laboratory tests was clinically significant. The modern PTS investigated in this study is acceptable for reliable sample delivery for routine chemistry, hematology, coagulation and blood gas (in syringe and capillary tube) laboratory tests.
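The distinction drawn above between statistical and clinical significance comes down to comparing the average percent bias between duplicate results against a pre-set clinical limit; a minimal sketch (the data and the 4% limit are illustrative, not the study's values):

```python
from math import sqrt
from statistics import mean, stdev

def percent_bias_summary(manual, pts, clinical_limit_pct):
    """Mean signed % difference of PTS vs manually delivered duplicates,
    a normal-approximation 95% CI, and whether the mean bias sits within
    the clinically acceptable limit."""
    diffs = [100.0 * (p - m) / m for m, p in zip(manual, pts)]
    avg = mean(diffs)
    half = 1.96 * stdev(diffs) / sqrt(len(diffs))
    return avg, (avg - half, avg + half), abs(avg) <= clinical_limit_pct

manual = [80.0, 100.0, 120.0, 95.0]
pts = [m * 1.01 for m in manual]  # a uniform +1% shift for illustration
```

With enough paired samples even a tiny bias can reach statistical significance, as with the 0.01-2% biases reported above, while remaining clinically irrelevant.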
Psychometric properties of the Generalized Anxiety Disorder Inventory in a Canadian sample.
Henderson, Leigh C; Antony, Martin M; Koerner, Naomi
2014-05-01
The Generalized Anxiety Disorder Inventory is a recently developed self-report measure that assesses symptoms of generalized anxiety disorder. Its psychometric properties have not been investigated further since its original development. The current study investigated its psychometric properties in a Canadian student/community sample. Exploratory principal component analysis replicated the original three-component structure. The total scale and subscales demonstrated excellent internal consistency reliability (α = 0.84-0.94) and correlated strongly with the Penn State Worry Questionnaire (r = 0.41-0.74, all ps <0.001) and Generalized Anxiety Disorder-7 (r = 0.55-0.84, all ps <0.001). However, only the total scale and cognitive subscale (r = 0.48-0.49, all ps <0.05) significantly predicted generalized anxiety disorder diagnosis established by diagnostic interview. The somatic subscale in particular may require revision to improve predictive validity. Revision may also be necessary given changes in required somatic symptoms for generalized anxiety disorder diagnostic criteria in more recent versions of the Diagnostic and Statistical Manual of Mental Disorders (i.e. although major changes occurred from Diagnostic and Statistical Manual of Mental Disorders-III-R to Diagnostic and Statistical Manual of Mental Disorders-IV, changes in Diagnostic and Statistical Manual of Mental Disorders-5 were minimal) and the possibility of changes in the upcoming 11th revision of the International Classification of Diseases.
Estimating and comparing microbial diversity in the presence of sequencing errors
Chiu, Chun-Huo
2016-01-01
Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures’ emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. 
This approach aims to compare diversity estimates for equally-large or equally-complete samples; it is based on the seamless rarefaction and extrapolation sampling curves of Hill numbers, specifically for q = 0, 1 and 2. (2) An asymptotic approach refers to the comparison of the estimated asymptotic diversity profiles. That is, this approach compares the estimated profiles for complete samples or samples whose sizes are sufficiently large. It is based on statistical estimation of the true Hill number of any order q ≥ 0. In the two approaches, replacing the spurious singleton count by our estimated count, we can largely remove the positive biases associated with diversity estimates due to spurious singletons and also make fair comparisons across microbial communities, as illustrated in our simulation results and in applying our method to analyze sequencing data from viral metagenomes. PMID:26855872
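The Hill-number profile described above has a compact closed form; a minimal sketch for computing the profile from raw abundances (the abundance vectors are illustrative):

```python
import math

def hill_number(abundances, q):
    """Effective number of taxa of order q (Hill numbers)."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if q == 1:  # limit case: exponential of Shannon entropy
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

def diversity_profile(abundances, qs=(0, 1, 2)):
    """Hill numbers across orders q, i.e. the diversity profile."""
    return [hill_number(abundances, q) for q in qs]
```

q = 0 is taxa richness and is driven entirely by rare taxa, which is why spurious singletons inflate it most severely; q = 1 (exponential of Shannon entropy) and q = 2 (inverse Simpson) weight common taxa more and are correspondingly more robust. The profile is non-increasing in q.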
ASM Based Synthesis of Handwritten Arabic Text Pages
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed
2015-01-01
Document analysis tasks such as text recognition, word spotting, or segmentation are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. In fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents and detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document-analysis methods on synthetic samples whenever insufficient naturally ground-truthed data are available. PMID:26295059
Validation of the Schizotypal Personality Questionnaire-Brief Form in adolescents.
Fonseca-Pedrero, Eduardo; Paíno-Piñeiro, Mercedes; Lemos-Giráldez, Serafín; Villazón-García, Ursula; Muñiz, José
2009-06-01
The main objective of the study was to validate the Schizotypal Personality Questionnaire-Brief (SPQ-B) in a sample of non-clinical adolescents. In addition, the schizotypal personality structure and differences in the dimensions of schizotypy according to gender and age are analyzed. The sample comprises 1683 students, 818 males (48.6%), with a mean age of 15.9 years (SD=1.2). The results showed that the SPQ-B had adequate psychometric properties. Internal consistency of the subscales and total score ranged from 0.61 to 0.81. Confirmatory factor analyses indicated that the three-factor model (positive, negative, and disorganized) and the four-factor model (positive, paranoid, negative, and disorganized) fit reasonably well in comparison to the remaining models. With regard to gender and age, statistically significant differences were found for age but not for gender. In line with previous literature, the results confirmed the multi-factor structure of the schizotypal personality in non-clinical adolescent populations. Future studies could use the SPQ-B as a rapid and efficient screening self-report for the detection of adolescents vulnerable to the development of schizophrenia-spectrum disorders in the general population, in genetically high-risk samples, and in clinical studies.
NASA Astrophysics Data System (ADS)
Little, David L., II
Ongoing changes in values, pedagogy, and curriculum concerning sustainability education necessitate that strong curricular elements be identified in sustainability education. However, quantitative research in sustainability education is largely undeveloped or relies on outdated instruments. In part, this is because no widespread quantitative instrument for measuring related educational outcomes has been developed for the field, though its development is pivotal for future efforts in sustainability education related to STEM majors. This research study details the creation, evaluation, and validation of an instrument -- the STEM Sustainability Engagement Instrument (STEMSEI) -- designed to measure sustainability engagement in post-secondary STEM majors. The study was conducted in three phases, using qualitative methods in phase 1, a concurrent mixed methods design in phase 2, and a sequential mixed methods design in phase 3. The STEMSEI was able to successfully predict statistically significant differences in the sample (n = 1017) that were predicted by prior research in environmental education. The STEMSEI also revealed statistically significant differences between STEM majors' sustainability engagement with a large effect size (.203 ≤ eta² ≤ .211). As hypothesized, statistically significant differences were found on the environmental scales across gender and present religion. With respect to gender, self-perceived measures of emotional engagement with environmental sustainability were higher for females, while males had higher measures of cognitive engagement with respect to knowing information related to environmental sustainability. With respect to present religion, self-perceived measures of general engagement and emotional engagement in environmental sustainability were higher for non-Christians as compared to Christians. On the economic scales, statistically significant differences were found across gender.
Specifically, measures of males' self-perceived cognitive engagement in knowing information related to economic sustainability were greater than those of females. Future research should establish the generalizability of these results and further test the validity of the STEMSEI.
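The eta² effect size reported above is SS_between over SS_total for a one-way design and can be computed directly. A small Python sketch with made-up group scores (the study's actual data are not reproduced here):

```python
def eta_squared(*groups):
    """Effect size eta^2 = SS_between / SS_total for a one-way design."""
    allv = [x for g in groups for x in g]
    grand = sum(allv) / len(allv)
    ss_total = sum((x - grand) ** 2 for x in allv)
    ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2
                     for g in groups)
    return ss_between / ss_total

# Two hypothetical groups whose means differ by 2 points.
print(eta_squared([1, 2, 3], [3, 4, 5]))  # 0.6
```

By the usual conventions, values around 0.20 such as those reported for the STEMSEI indicate that group membership accounts for about a fifth of the total variance.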
Vigli, Georgia; Philippidis, Angelos; Spyros, Apostolos; Dais, Photis
2003-09-10
A combination of ¹H NMR and ³¹P NMR spectroscopy and multivariate statistical analysis was used to classify 192 samples from 13 types of vegetable oils, namely, hazelnut, sunflower, corn, soybean, sesame, walnut, rapeseed, almond, palm, groundnut, safflower, coconut, and virgin olive oils from various regions of Greece. 1,2-Diglycerides, 1,3-diglycerides, the ratio of 1,2-diglycerides to total diglycerides, acidity, iodine value, and fatty acid composition determined upon analysis of the respective ¹H NMR and ³¹P NMR spectra were selected as variables to establish a classification/prediction model by employing discriminant analysis. This model, obtained from a training set of 128 samples, resulted in significant discrimination among the different classes of oils, and 100% correct validated assignments were obtained for the 64 test samples. Different artificial mixtures of olive-hazelnut, olive-corn, olive-sunflower, and olive-soybean oils were prepared and analyzed by ¹H NMR and ³¹P NMR spectroscopy. Subsequent discriminant analysis of the data allowed detection of adulteration as low as 5% w/w, provided that fresh virgin olive oil samples were used, as reflected by their high ratio of 1,2-diglycerides to total diglycerides (D ≥ 0.90).
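Discriminant analysis as used in the paper models class covariance structure; as a deliberately simplified stand-in, a nearest-centroid classifier on two hypothetical features (the 1,2-DG/total-DG ratio and acidity, with invented values) illustrates the train-then-validate idea in a few lines of Python:

```python
def fit_centroids(X, y):
    """Mean feature vector (centroid) per class label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        s = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in s] for lab, s in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is nearest in Euclidean distance."""
    def d2(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: d2(centroids[lab]))

# Hypothetical (1,2-DG ratio, acidity) training samples for two oils.
X = [[0.9, 0.3], [0.95, 0.25], [0.4, 1.1], [0.35, 1.2]]
y = ["olive", "olive", "hazelnut", "hazelnut"]
model = fit_centroids(X, y)
print(predict(model, [0.88, 0.35]))  # olive
```

Real discriminant analysis additionally weights distances by within-class covariance, which is what allows it to separate 13 oil classes from many correlated NMR-derived variables.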
NASA Astrophysics Data System (ADS)
Schratz, Patrick; Herrmann, Tobias; Brenning, Alexander
2017-04-01
Computational and statistical prediction methods such as the support vector machine have gained popularity in remote-sensing applications in recent years and are often compared to more traditional approaches like maximum-likelihood classification. However, the accuracy assessment of such predictive models in a spatial context needs to account for the presence of spatial autocorrelation in geospatial data by using spatial cross-validation and bootstrap strategies instead of their now more widely used non-spatial equivalents. The R package sperrorest by A. Brenning [IEEE International Geoscience and Remote Sensing Symposium, 1, 374 (2012)] provides a generic interface for performing (spatial) cross-validation of any statistical or machine-learning technique available in R. Since spatial statistical models as well as flexible machine-learning algorithms can be computationally expensive, parallel computing strategies are required to perform cross-validation efficiently. The most recent major release of sperrorest therefore comes with two new features (aside from improved documentation): The first one is parsperrorest(), a parallelized version of sperrorest(). This function features two parallel modes to greatly speed up cross-validation runs. Both parallel modes are platform independent and provide progress information. par.mode = 1 relies on the pbapply package and internally calls, depending on the platform, parallel::mclapply() or parallel::parApply(). While forking is used on Unix systems, Windows systems use a cluster approach for parallel execution. par.mode = 2 uses the foreach package to perform parallelization. This method uses a different way of cluster parallelization than the parallel package does. In summary, the robustness of parsperrorest() is increased with the implementation of two independent parallel modes. A new way of partitioning the data in sperrorest is provided by partition.factor.cv().
This function gives the user the possibility to perform cross-validation at the level of some grouping structure. For example, in remote sensing of agricultural land uses, pixels from the same field contain nearly identical information and should thus be placed jointly in either the test set or the training set. Other spatial resampling strategies are already available and can be extended by the user.
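The grouping idea behind partition.factor.cv() (all pixels of one field stay on the same side of the train/test split) can be sketched in a few lines. This is an illustrative Python analogue, not the package's R implementation:

```python
def grouped_folds(groups):
    """Leave-one-group-out partitions: each distinct group label
    (e.g. a field or spatial block) forms one test fold, so spatially
    correlated observations never straddle the train/test boundary."""
    folds = []
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        folds.append((train, test))
    return folds

# Six pixels from three fields: three folds, one field held out each time.
fields = ["A", "A", "B", "B", "C", "C"]
for train, test in grouped_folds(fields):
    print(train, test)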
Ashrafi-Rizi, Hasan; Ramezani, Amir; Koupaei, Hamed Aghajani; Kazempour, Zahra
2014-12-01
Media and Information Literacy (MIL) enables people to interpret and make informed judgments as users of information and media, as well as to become skillful creators and producers of information and media messages in their own right. The purpose of this research was to determine the level of Media and Information Literacy among Isfahan University of Medical Sciences students using the Iranian Media and Information Literacy Questionnaire (IMILQ). This is an applied analytical survey in which the data were collected by a researcher-made questionnaire developed from specialists' viewpoints and valid scientific works. Its validity and reliability were confirmed by Library and Information Sciences specialists and by Cronbach's alpha (r = 0.89), respectively. The statistical population consisted of all students at Isfahan University of Medical Sciences (6000 cases), from which 361 were sampled by stratified random sampling. Data were analyzed by descriptive and inferential statistics. The findings showed that the mean level of Media and Information Literacy among the students was 3.34±0.444 (above average). The highest mean was for promotion of scientific degree (3.84±0.975) and the lowest for difficulties in starting research (2.50±1.08). There were significant differences in Media and Information Literacy by educational degree, college type, and family income. The results showed that the students did not have sufficient skills in starting research and in defining and narrowing a research topic. In general, students and education practitioners should pay special attention to the factors that improve Media and Information Literacy as a core capability in using printed and electronic media.
Lotan, Tamara L; Wei, Wei; Morais, Carlos L; Hawley, Sarah T; Fazli, Ladan; Hurtado-Coll, Antonio; Troyer, Dean; McKenney, Jesse K; Simko, Jeffrey; Carroll, Peter R; Gleave, Martin; Lance, Raymond; Lin, Daniel W; Nelson, Peter S; Thompson, Ian M; True, Lawrence D; Feng, Ziding; Brooks, James D
2016-06-01
PTEN is the most commonly deleted tumor suppressor gene in primary prostate cancer (PCa) and its loss is associated with poor clinical outcomes and ERG gene rearrangement. We tested whether PTEN loss is associated with shorter recurrence-free survival (RFS) in surgically treated PCa patients with known ERG status. A genetically validated, automated PTEN immunohistochemistry (IHC) protocol was used for 1275 primary prostate tumors from the Canary Foundation retrospective PCa tissue microarray cohort to assess homogeneous (in all tumor tissue sampled) or heterogeneous (in a subset of tumor tissue sampled) PTEN loss. ERG status as determined by a genetically validated IHC assay was available for a subset of 938 tumors. Associations between PTEN and ERG status were assessed using Fisher's exact test. Kaplan-Meier and multivariate weighted Cox proportional hazards models for RFS were constructed. When compared to intact PTEN, homogeneous (hazard ratio [HR] 1.66, p = 0.001) but not heterogeneous (HR 1.24, p = 0.14) PTEN loss was significantly associated with shorter RFS in multivariate models. Among ERG-positive tumors, homogeneous (HR 3.07, p < 0.0001) but not heterogeneous (HR 1.46, p = 0.10) PTEN loss was significantly associated with shorter RFS. Among ERG-negative tumors, PTEN did not reach significance for inclusion in the final multivariate models. The interaction term for PTEN and ERG status with respect to RFS did not reach statistical significance (p = 0.11) for the current sample size. These data suggest that PTEN is a useful prognostic biomarker and that there is no statistically significant interaction between PTEN and ERG status for RFS. We found that loss of the PTEN tumor suppressor gene in prostate tumors as assessed by tissue staining is correlated with shorter time to prostate cancer recurrence after radical prostatectomy.
Chan, Yvonne L; Schanzenbach, David; Hickerson, Michael J
2014-09-01
Methods that integrate population-level sampling from multiple taxa into a single community-level analysis are an essential addition to the comparative phylogeographic toolkit. Detecting how species within communities have demographically tracked each other in space and time is important for understanding the effects of future climate and landscape changes and the resulting acceleration of extinctions, biological invasions, and potential surges in adaptive evolution. Here, we present a statistical framework for such an analysis based on hierarchical approximate Bayesian computation (hABC) with the goal of detecting concerted demographic histories across an ecological assemblage. Our method combines population genetic data sets from multiple taxa into a single analysis to estimate: 1) the proportion of a community sample that demographically expanded in a temporally clustered pulse and 2) when the pulse occurred. To validate the accuracy and utility of this new approach, we use simulation cross-validation experiments and subsequently analyze an empirical data set of 32 avian populations from Australia that are hypothesized to have expanded from smaller refugia populations in the late Pleistocene. The method can accommodate data set heterogeneity such as variability in effective population size, mutation rates, and sample sizes across species and exploits the statistical strength from the simultaneous analysis of multiple species. This hABC framework used in a multitaxa demographic context can increase our understanding of the impact of historical climate change by determining what proportion of the community responded in concert or independently and can be used with a wide variety of comparative phylogeographic data sets as biota-wide DNA barcoding data sets accumulate. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
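The hABC machinery is elaborate, but its core accept/reject step is ordinary approximate Bayesian computation. The following is a toy single-parameter Python sketch; the uniform prior, Gaussian simulator, and all numeric settings are assumptions for illustration only:

```python
import random

def abc_rejection(observed_mean, prior_low, prior_high, n_sims=2000,
                  sample_size=50, eps=0.05, seed=1):
    """Minimal ABC rejection sampler: draw a parameter from a uniform
    prior, simulate data under it, and keep the draw only if the
    simulated summary statistic lands within eps of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(prior_low, prior_high)
        sim = [rng.gauss(theta, 1.0) for _ in range(sample_size)]
        if abs(sum(sim) / sample_size - observed_mean) < eps:
            accepted.append(theta)
    return accepted

# Accepted draws approximate the posterior for the simulator's mean.
post = abc_rejection(observed_mean=3.0, prior_low=0.0, prior_high=6.0)
```

The hierarchical version in the paper layers this idea: community-level hyperparameters (proportion of taxa co-expanding, timing of the pulse) generate per-taxon demographic parameters, and acceptance is based on summary statistics pooled across all taxa.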
Effects of wages on smoking decisions of current and past smokers.
Du, Juan; Leigh, J Paul
2015-08-01
We used longitudinal data and instrumental variables (IVs) in a prospective design to test for the causal effects of wages on smoking prevalence among current and past smokers. Nationally representative U.S. data were drawn from the 1999-2009 waves of the Panel Study of Income Dynamics. Our overall sample was restricted to full time employed persons, aged 21-65 years. We excluded part time workers and youths because smoking and wage correlations would be complicated by labor supply decisions. We excluded adult never smokers because people rarely begin smoking after the age of 20 years. IVs were created with state-level minimum wages and unionization rates. We analyzed subsamples of men, women, the less educated, the more educated, quitters, and backsliders. Validity and strength of instruments within the IV analysis were conducted with the Sargan-Hansen J statistic and F tests. We found some evidence that low wages lead to more smoking in the overall sample and substantial evidence for men, persons with high school educations or less (<13 years of schooling), and quitters. Results indicated that 10% increases in wages lead to 5.5 and 4.6 percentage point decreases in smoking for men and the less educated; they also increased the average chance of quitting among base-year smokers from 17.0% to 20.4%. Statistical tests suggested that IVs were strong and valid in most samples. Subjects' other family income, including spouses' wages, was entered as a control variable. Increases in an individual's wages, independent of other income, decreased the prevalence of smoking among current and past smokers. Copyright © 2015 Elsevier Inc. All rights reserved.
The validation of a home food inventory.
Fulkerson, Jayne A; Nelson, Melissa C; Lytle, Leslie; Moe, Stacey; Heitzler, Carrie; Pasch, Keryn E
2008-11-04
Home food inventories provide an efficient method for assessing home food availability; however, few are validated. The present study's aim was to develop and validate a home food inventory that is easily completed by research participants in their homes and includes a comprehensive range of both healthful and less healthful foods that are associated with obesity. A home food inventory (HFI) was developed and tested with two samples. Sample 1 included 51 adult participants and six trained research staff who independently completed the HFI in participants' homes. Sample 2 included 342 families in which parents completed the HFI and the Diet History Questionnaire (DHQ) and students completed three 24-hour dietary recall interviews. HFI items assessed 13 major food categories as well as two categories assessing ready-access to foods in the kitchen and the refrigerator. An obesogenic household food availability score was also created. To assess criterion validity, participants' and research staffs' assessment of home food availability were compared (staff = gold standard). Criterion validity was evaluated with kappa, sensitivity, and specificity. Construct validity was assessed with correlations of five HFI major food category scores with servings of the same foods and associated nutrients from the DHQ and dietary recalls. Kappa statistics for all 13 major food categories and the two ready-access categories ranged from 0.61 to 0.83, indicating substantial agreement. Sensitivity ranged from 0.69 to 0.89, and specificity ranged from 0.86 to 0.95. Spearman correlations between staff and participant major food category scores ranged from 0.71 to 0.97. Correlations between the HFI scores and food group servings and nutrients on the DHQ (parents) were all significant (p < .05) while about half of associations between the HFI and dietary recall interviews (adolescents) were significant (p < .05). 
The obesogenic home food availability score was significantly associated (p < .05) with energy intake of both parents and adolescents. This new home food inventory is valid, participant-friendly, and may be useful for community-based behavioral nutrition and obesity prevention research. The inventory builds on previous measures by including a wide range of healthful and less healthful foods rather than foods targeted for a specific intervention.
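The criterion-validity statistics used above (kappa, sensitivity, and specificity against the staff gold standard) all follow from a 2x2 agreement table. A small Python sketch with hypothetical counts:

```python
def agreement_stats(a, b, c, d):
    """Cohen's kappa, sensitivity, and specificity from a 2x2 table
    comparing a rater against a gold standard:
      a = both positive, b = rater+/gold-, c = rater-/gold+, d = both negative
    """
    n = a + b + c + d
    po = (a + d) / n                                     # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2  # chance agreement
    kappa = (po - pe) / (1 - pe)
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return kappa, sensitivity, specificity

# Hypothetical table: 40 joint positives, 45 joint negatives out of 100.
k, se, sp = agreement_stats(40, 5, 10, 45)
# kappa ≈ 0.7, sensitivity = 0.8, specificity = 0.9 for this table
```

Kappa in the 0.61-0.83 range reported for the HFI is conventionally read as "substantial" agreement because, unlike raw percent agreement, it discounts matches expected by chance.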
Duarte, Elisabeth Carmen; Garcia, Leila Posenato; de Araújo, Wildo Navegantes; Velez, Maria P
2017-12-02
Zika virus infection during pregnancy (ZIKVP) is known to be associated with adverse outcomes. Studies on this matter involve both rare outcomes and rare exposures, and methodological choices are not straightforward. Cohort studies will surely offer more robust evidence, but their efficiency must be enhanced. We aim to contribute to the debate on sample selection strategies in cohort studies to assess outcomes associated with ZIKVP. A study can be statistically more efficient than another if its estimates are more accurate (precise and valid), even if the studies involve the same number of subjects. Sample size and specific design strategies can enhance or impair the statistical efficiency of a study, depending on how the subjects are distributed in subgroups pertinent to the analysis. In most ZIKVP cohort studies to date there is an a priori identification of the source population (pregnant women, regardless of their exposure status), which is then sampled or included in its entirety (census). Subsequently, the group of pregnant women is classified according to exposure (presence or absence of ZIKVP), respecting the exposed:unexposed ratio in the source population. We propose that the sample selection be done from the a priori identification of groups of pregnant women exposed and unexposed to ZIKVP. This method will allow for oversampling (even 100%) of the pregnant women with ZIKVP and an optimized sampling from the general population of pregnant women unexposed to ZIKVP, saving resources in the unexposed group and improving the expected number of incident cases (outcomes) overall. We hope that this proposal will broaden the methodological debate on the improvement of statistical power and protocol harmonization of cohort studies that aim to evaluate the association between Zika virus infection during pregnancy and outcomes for the offspring, as well as those with similar objectives.
Chang, Hsiu-Ju; Wu, Chiung-Jane; Chen, Tzen Wen; Cheng, Andrew Tai Ann; Lin, Kuan-Chia; Rong, Jiin-Ru; Lee, Hsin-Chien
2011-05-01
Although prior research has proposed that several risk factors are conceptually and positively related to suicidal behavior, researchers have also suggested that suicide may be multifaceted. The Life Attitude Schedule (LAS) measures a broad range of suicide-related behaviors, including life-enhancing and life-threatening behaviors. This study aimed to translate the LAS into Chinese and evaluate the psychometric properties of the new version (LAS-C). A cross-sectional and descriptive design was used. Data were collected from high schools in the city of Taipei in northern Taiwan. A convenience sample of 1492 high school students was recruited from five high schools in Taipei. We used the Multi-Health Systems (MHS) translation policy to guide the translation process. Reliability was evaluated by internal consistency (represented by Cronbach's α coefficients) and test-retest stability (represented by intraclass correlation). Validity was demonstrated by content, convergent, divergent, and concurrent validity and by contrast-group comparison. Confirmatory factor analysis was further used to examine the theoretical model and to support construct validity. The Cronbach's α coefficient for the whole scale of the LAS-C and its subscales ranged from 0.70 to 0.91. The Intraclass Correlation Coefficient (ICC) ranged from 0.76 to 0.89 for the whole scale and its subscales, and all coefficients were statistically significant, at least at the p<0.05 level, indicating good stability over a three-week period. Validity was supported by a Content Validity Index (CVI) of 0.99 and by convergent, divergent, concurrent, and contrast-group comparison validity. Confirmatory factor analysis supported the theoretical model, further providing solid evidence of construct validity. The LAS-C has proper psychometric properties. Future studies should be conducted to shorten the scale and form a briefer version. Copyright © 2010 Elsevier Ltd. All rights reserved.
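Cronbach's α of the kind reported for the LAS-C can be reproduced directly from raw item scores. A minimal Python sketch with invented item data (population variances assumed; the study's exact software conventions may differ):

```python
def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per
    item, aligned across respondents), using population variances:
    alpha = k/(k-1) * (1 - sum(item variances) / variance of totals)."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Perfectly parallel items give the maximum alpha of 1.0.
items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(cronbach_alpha(items))  # ≈ 1.0
```

Real scales land below 1: the 0.70-0.91 range reported above reflects item sets that covary strongly but not perfectly.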
Rekleiti, Maria; Souliotis, Kyriakos; Sarafis, Pavlos; Kyriazis, Ioannis; Tsironi, Maria
2018-06-01
The present study focuses on the validity and reliability of the Greek edition of the DQOL-BCI. The DQOL-BCI includes 15 items rated on a 5-point Likert-type scale and two general forms. The translation process was conducted in conformity with the guidelines of the EuroQol group. A non-random sample of 65 patients diagnosed with type I and type II diabetes was selected. The questionnaire used to collect the data was the translated version of the DQOL-BCI, together with the demographic characteristics of the interviewees. The content validity of the DQOL-BCI was re-examined by a panel of five experts for qualitative and quantitative performance. The questionnaire was completed via personal interview. The final sample consisted of 58 people (35 men and 23 women, 59.9 ± 10.9 years). The translation of the questionnaire was found appropriate with respect to the peculiarities of the Greek language and culture. The largest deviation of values is observed in QOL1 (1.71) in comparison to QOL6 (2.98); the difference between the standard deviations is close to 0.6. The test results showed satisfactory content validity and high construct validity, while the high Cronbach's alpha (0.95) indicates high reliability and internal consistency. The Greek version of the DQOL-BCI has acceptable psychometric properties and demonstrates high internal reliability and satisfactory construct validity, which allows its use as an important tool for evaluating the quality of life of diabetic patients in relation to their health. Copyright © 2018. Published by Elsevier B.V.
A systematic review of the quality of homeopathic clinical trials
Jonas, Wayne B; Anderson, Rachel L; Crawford, Cindy C; Lyons, John S
2001-01-01
Background While a number of reviews of homeopathic clinical trials have been done, all have used methods dependent on allopathic diagnostic classifications foreign to homeopathic practice. In addition, no review has used established and validated quality criteria allowing direct comparison of the allopathic and homeopathic literature. Methods In a systematic review, we compared the quality of clinical-trial research in homeopathy to a sample of research on conventional therapies using a validated and system-neutral approach. All clinical trials on homeopathic treatments with parallel treatment groups published between 1945 and 1995 in English were selected. All were evaluated with an established set of 33 validity criteria previously validated on a broad range of health interventions across differing medical systems. Criteria covered statistical conclusion, internal, construct and external validity. Reliability of criteria application is greater than 0.95. Results 59 studies met the inclusion criteria. Of these, 79% were from peer-reviewed journals, 29% used a placebo control, 51% used random assignment, and 86% failed to consider potentially confounding variables. The main validity problems were in measurement, where 96% did not report the proportion of subjects screened and 64% did not report attrition rate. 17% of subjects dropped out in studies where this was reported. There was practically no replication of or overlap in the conditions studied, and most studies were relatively small and done at a single site. Compared to research on conventional therapies, the overall quality of studies in homeopathy was worse and only slightly improved in more recent years. Conclusions Clinical homeopathic research is clearly in its infancy, with most studies using poor sampling and measurement techniques, few subjects, single sites and no replication. Many of these problems are correctable even within a "holistic" paradigm given sufficient research expertise, support and methods.
PMID:11801202
Bowden, Jack; Del Greco M, Fabiola; Minelli, Cosetta; Davey Smith, George; Sheehan, Nuala A; Thompson, John R
2016-12-01
MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach that assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the 'NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied. An adaptation of the I² statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it I²GX. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example. In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of I²GX), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects.
We demonstrate our proposed approach for a two-sample summary data MR analysis to estimate the causal effect of low-density lipoprotein on heart disease risk. A high value of I²GX close to 1 indicates that dilution does not materially affect the standard MR-Egger analyses for these data. Care must be taken to assess the NOME assumption via the I²GX statistic before implementing standard MR-Egger regression in the two-sample summary data context. If I²GX is sufficiently low (less than 90%), inferences from the method should be interpreted with caution and adjustment methods considered. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.
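The statistic described above adapts the meta-analytic I² to SNP-exposure association estimates. A generic Python sketch of the underlying Q-based quantity follows; note the paper's exact weighting for its MR-specific variant may differ from this plain Higgins-style I²:

```python
def i_squared(estimates, standard_errors):
    """Generic I^2 heterogeneity statistic: Cochran's Q over k
    inverse-variance-weighted estimates, then
    I^2 = max(0, (Q - (k - 1)) / Q), which lies between 0 and 1."""
    w = [1.0 / se ** 2 for se in standard_errors]
    xbar = sum(wi * xi for wi, xi in zip(w, estimates)) / sum(w)
    q = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, estimates))
    k = len(estimates)
    return max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

# Identical estimates imply Q = 0 and hence no heterogeneity.
low = i_squared([1.0, 1.0, 1.0], [0.1, 0.1, 0.1])  # 0.0
```

Read against the abstract's guidance: values near 1 suggest dilution of the MR-Egger estimate is negligible, while values below roughly 0.9 call for caution or adjustment such as simulation extrapolation.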
PCA as a practical indicator of OPLS-DA model reliability.
Worley, Bradley; Powers, Robert
Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OPLS, which aggressively force separations between experimental groups. As a result, OPLS-DA is often used as an alternative method when PCA fails to expose group separation, but this practice is highly dangerous. Without rigorous validation, OPLS-DA can easily yield statistically unreliable group separation. A Monte Carlo analysis of PCA group separations and OPLS-DA cross-validation metrics was performed on NMR datasets with statistically significant separations in scores-space. A linearly increasing amount of Gaussian noise was added to each data matrix followed by the construction and validation of PCA and OPLS-DA models. With increasing added noise, the PCA scores-space distance between groups rapidly decreased and the OPLS-DA cross-validation statistics simultaneously deteriorated. A decrease in correlation between the estimated loadings (added noise) and the true (original) loadings was also observed. While the validity of the OPLS-DA model diminished with increasing added noise, the group separation in scores-space remained basically unaffected. Supported by the results of Monte Carlo analyses of PCA group separations and OPLS-DA cross-validation metrics, we provide practical guidelines and cross-validatory recommendations for reliable inference from PCA and OPLS-DA models.
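The Monte Carlo design described in this abstract can be sketched in miniature: two separated groups, increasing added Gaussian noise, and the scores-space distance between group centroids along the first principal component. This is a toy reconstruction for intuition only (two features, pure-Python PCA), not the authors' NMR pipeline:

```python
import math
import random

def pc1_2d(xs):
    """First principal component of 2-D data via the 2x2 covariance matrix."""
    n = len(xs)
    mx = sum(p[0] for p in xs) / n
    my = sum(p[1] for p in xs) / n
    sxx = sum((p[0] - mx) ** 2 for p in xs) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in xs) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in xs) / (n - 1)
    if sxy == 0:                                   # axis-aligned case
        return (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    lam = 0.5 * (sxx + syy) + math.sqrt(0.25 * (sxx - syy) ** 2 + sxy ** 2)
    v = (lam - syy, sxy)                           # leading eigenvector
    norm = math.hypot(*v)
    return (v[0] / norm, v[1] / norm)

def group_separation(a, b):
    """Distance between group centroids projected on PC1 of the pooled data."""
    v = pc1_2d(a + b)
    sa = [p[0] * v[0] + p[1] * v[1] for p in a]
    sb = [p[0] * v[0] + p[1] * v[1] for p in b]
    return abs(sum(sa) / len(sa) - sum(sb) / len(sb))

# Two well-separated groups; add increasing Gaussian noise and watch the
# PC1 scores-space separation degrade, mirroring the Monte Carlo analysis.
random.seed(0)
g1 = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(50)]
g2 = [(random.gauss(3, 0.3), random.gauss(0, 0.3)) for _ in range(50)]
for sigma in (0.0, 1.0, 3.0):
    n1 = [(x + random.gauss(0, sigma), y + random.gauss(0, sigma)) for x, y in g1]
    n2 = [(x + random.gauss(0, sigma), y + random.gauss(0, sigma)) for x, y in g2]
    print(sigma, round(group_separation(n1, n2), 2))
```

The same noisy matrices could then be fed to an OPLS-DA routine to watch its cross-validation metrics deteriorate in parallel.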
Jeddi, Fatemeh Rangraz; Farzandipoor, Mehrdad; Arabfard, Masoud; Hosseini, Azam Haj Mohammad
2014-04-01
The purpose of this study was to investigate the current situation and to present a conceptual model for a clinical governance information system, using UML, in two sample hospitals. Although the use of information is one of the fundamental components of clinical governance, information management receives little attention. A cross-sectional study was conducted from October 2012 to May 2013. Data were gathered through questionnaires and interviews in the two sample hospitals. The face and content validity of the questionnaire were confirmed by experts. Data were collected from a pilot hospital, revisions were made, and the final questionnaire was prepared. Data were analyzed with descriptive statistics in SPSS 16. Based on the scenarios derived from the questionnaires, UML diagrams were drawn using Rational Rose 7. The results showed that only 32.14 percent of the hospitals' indicators were being calculated. No database had been designed, and 100 percent of the hospitals' clinical governance units required one. The hospitals' clinical governance units do not have access to all the indicators needed to perform their mission. Defining processes, drawing models, and creating a database are essential for designing such information systems.
K(3)EDTA Vacuum Tubes Validation for Routine Hematological Testing.
Lima-Oliveira, Gabriel; Lippi, Giuseppe; Salvagno, Gian Luca; Montagnana, Martina; Poli, Giovanni; Solero, Giovanni Pietro; Picheth, Geraldo; Guidi, Gian Cesare
2012-01-01
Background and Objective. Some in vitro diagnostic devices (e.g., blood collection vacuum tubes and syringes for blood analyses) are not validated before quality laboratory managers decide to start using them or to change brand. Frequently, laboratory or hospital managers select the vacuum tubes for blood collection based on cost considerations or on the prominence of a brand. The aim of this study was to validate two dry K(3)EDTA vacuum tubes of different brands for routine hematological testing. Methods. Blood specimens from 100 volunteers were collected into the two different K(3)EDTA vacuum tubes by a single, expert phlebotomist. Routine hematological testing was done on the Advia 2120i hematology system. The significance of the differences between samples was assessed by paired Student's t-test after checking for normality. The level of statistical significance was set at P < 0.05. Results and Conclusions. The tubes of the different brands evaluated can represent a clinically relevant source of variation only for mean platelet volume (MPV) and platelet distribution width (PDW). Our validation will thus permit laboratory or hospital managers to select, among the validated brands of vacuum tubes, the one that suits their technical or economic requirements for routine hematological tests.
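The paired comparison at the core of this design is easy to reproduce. The sketch below computes the paired Student's t statistic for two series of MPV readings measured on the same subjects; all values are invented for illustration (the study used 100 volunteers and a full hematological panel):

```python
import math
import statistics

def paired_t(x, y):
    """Paired Student's t statistic and degrees of freedom for two
    measurement series on the same subjects (e.g. two tube brands)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical MPV readings (fL) for the same 8 volunteers in two brands
brand_a = [10.1, 9.8, 10.4, 10.0, 9.9, 10.2, 10.3, 9.7]
brand_b = [9.8, 9.5, 10.0, 9.7, 9.6, 9.9, 10.0, 9.4]
t, df = paired_t(brand_a, brand_b)
print(round(t, 2), df)  # → 25.0 7
```

The resulting t would then be compared against the Student's t distribution with df degrees of freedom at the study's P < 0.05 threshold.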
Corr, Philip J; Cooper, Andrew J
2016-11-01
We report the development and validation of a questionnaire measure of the revised reinforcement sensitivity theory (rRST) of personality. Starting with qualitative responses to defensive and approach scenarios modeled on typical rodent ethoexperimental situations, exploratory and confirmatory factor analyses (CFAs) revealed a robust 6-factor structure: 2 unitary defensive factors, fight-flight-freeze system (FFFS; related to fear) and the behavioral inhibition system (BIS; related to anxiety); and 4 behavioral approach system (BAS) factors (Reward Interest, Goal-Drive Persistence, Reward Reactivity, and Impulsivity). Theoretically motivated thematic facets were employed to sample the breadth of defensive space, comprising FFFS (Flight, Freeze, and Active Avoidance) and BIS (Motor Planning Interruption, Worry, Obsessive Thoughts, and Behavioral Disengagement). Based on theoretical considerations, and statistically confirmed, a separate scale for Defensive Fight was developed. Validation evidence for the 6-factor structure came from convergent and discriminant validity shown by correlations with existing personality scales. We offer the Reinforcement Sensitivity Theory of Personality Questionnaire to facilitate future research specifically on rRST and, more broadly, on approach-avoidance theories of personality. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
García-Carpintero, María Ángeles; Rodríguez-Santero, Javier; Porcel-Gálvez, Ana María
To design and validate a specific instrument to detect violence exercised and suffered in the dating relationships of young couples. Descriptive clinimetric validation study. The sample, stratified by sex and area of knowledge, adopted as inclusion criterion having or having had a dating relationship; it consisted of 447 subjects. We obtained the Multidimensional Scale of Dating Violence (EMVN), 32 items with three dimensions: physical and sexual assault, behavior control (cyberbullying, surveillance, and harassment), and psycho-emotional abuse (disparagement and domination), each rated as victim or as aggressor. No statistically significant differences were found between violence exercised and violence suffered, but differences were found by sex. The EMVN is a valid and reliable scale that measures the different elements of violence in young couples and may serve as a resource for the comprehensive detection of violent behaviors in the dating relationships established among young people. Copyright © 2017 SESPAS. Published by Elsevier España, S.L.U. All rights reserved.
NASA Astrophysics Data System (ADS)
Dutton, Gregory
Forensic science is a collection of applied disciplines that draws from all branches of science. A key question in forensic analysis is: to what degree do a piece of evidence and a known reference sample share characteristics? Quantification of similarity, estimation of uncertainty, and determination of relevant population statistics are of current concern. A 2016 PCAST report questioned the foundational validity and the validity in practice of several forensic disciplines, including latent fingerprints, firearms comparisons and DNA mixture interpretation. One recommendation was the advancement of objective, automated comparison methods based on image analysis and machine learning. These concerns parallel the National Institute of Justice's ongoing R&D investments in applied chemistry, biology and physics. NIJ maintains a funding program spanning fundamental research with potential for forensic application to the validation of novel instruments and methods. Since 2009, NIJ has funded over $179 million in external research to support the advancement of accuracy, validity and efficiency in the forensic sciences. An overview of NIJ's programs will be presented, with examples of relevant projects from fluid dynamics, 3D imaging, acoustics, and materials science.
ERIC Educational Resources Information Center
Osler, James Edward, II
2015-01-01
This monograph provides an epistemological rationale for the Accumulative Manifold Validation Analysis [also referred to by the acronym "AMOVA"] statistical methodology designed to test psychometric instruments. This form of inquiry is a mathematical optimization approach within the discipline of linear stochastic modelling. AMOVA is an in-depth…
Probability of Detection (POD) as a statistical model for the validation of qualitative methods.
Wehling, Paul; LaBudde, Robert A; Brunelle, Sharon L; Nelson, Maria T
2011-01-01
A statistical model is presented for use in validation of qualitative methods. This model, termed Probability of Detection (POD), harmonizes the statistical concepts and parameters between quantitative and qualitative method validation. POD characterizes method response with respect to concentration as a continuous variable. The POD model provides a tool for graphical representation of response curves for qualitative methods. In addition, the model allows comparisons between candidate and reference methods, and provides calculations of repeatability, reproducibility, and laboratory effects from collaborative study data. Single laboratory study and collaborative study examples are given.
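The POD idea, detection probability as a continuous function of concentration, can be sketched with a logistic curve in log concentration. The functional form and the parameters `a` and `b` here are illustrative assumptions, not fitted values from any collaborative study:

```python
import math

def pod(conc, a=-2.0, b=4.0):
    """Probability of detection as a smooth function of concentration,
    using an illustrative logistic form in log10(conc); the parameters
    a and b are made up for demonstration."""
    return 1.0 / (1.0 + math.exp(-(a + b * math.log10(conc))))

# POD rises from near 0 to near 1 as analyte concentration increases; the
# concentration where POD = 0.5 plays the role of a detection limit.
for c in (0.1, 1.0, 10.0, 100.0):
    print(c, round(pod(c), 3))
```

With these assumed parameters, POD crosses 0.5 at c = 10^0.5 ≈ 3.16; fitting such curves to candidate and reference methods is what enables the graphical comparisons the abstract describes.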
[Methodological design of the National Health and Nutrition Survey 2016].
Romero-Martínez, Martín; Shamah-Levy, Teresa; Cuevas-Nasu, Lucía; Gómez-Humarán, Ignacio Méndez; Gaona-Pineda, Elsa Berenice; Gómez-Acosta, Luz María; Rivera-Dommarco, Juan Ángel; Hernández-Ávila, Mauricio
2017-01-01
To describe the design methodology of the halfway national health and nutrition survey (Ensanut-MC) 2016. The Ensanut-MC is a national probabilistic survey whose target population is the inhabitants of private households in Mexico. The sample size was determined to allow inferences on urban and rural areas in four regions. We describe the main design elements: target population, topics of study, sampling procedure, measurement procedure and logistics organization. The final sample comprised 9 479 completed household interviews and 16 591 individual interviews. The response rate for households was 77.9%, and the response rate for individuals was 91.9%. The Ensanut-MC probabilistic design allows valid statistical inferences about parameters of interest for Mexico's public health and nutrition, specifically on overweight, obesity and diabetes mellitus. The updated information also supports the monitoring, updating and formulation of new policies and priority programs.
MO-G-12A-01: Quantitative Imaging Metrology: What Should Be Assessed and How?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giger, M; Petrick, N; Obuchowski, N
The first two symposia in the Quantitative Imaging Track focused on 1) the introduction of quantitative imaging (QI) challenges and opportunities, and QI efforts of agencies and organizations such as the RSNA, NCI, FDA, and NIST, and 2) the techniques, applications, and challenges of QI, with specific examples from CT, PET/CT, and MR. This third symposium in the QI Track will focus on metrology and its importance in successfully advancing the QI field. While the specific focus will be on QI, many of the concepts presented are more broadly applicable to many areas of medical physics research and applications. As such, the topics discussed should be of interest to medical physicists involved in imaging as well as therapy. The first talk of the session will focus on the introduction to metrology and why it is critically important in QI. The second talk will focus on appropriate methods for technical performance assessment. The third talk will address statistically valid methods for algorithm comparison, a common problem not only in QI but also in other areas of medical physics. The final talk in the session will address strategies for publication of results that will allow statistically valid meta-analyses, which is critical for combining results of individual studies with typically small sample sizes in a manner that can best inform decisions and advance the field. Learning Objectives: Understand the importance of metrology in the QI efforts. Understand appropriate methods for technical performance assessment. Understand methods for comparing algorithms with or without reference data (i.e., "ground truth"). Understand the challenges and importance of reporting results in a manner that allows for statistically valid meta-analyses.
Manterola, Carlos; Torres, Rodrigo; Burgos, Luis; Vial, Manuel; Pineda, Viviana
2006-07-01
Surgery is a curative treatment for gastric cancer (GC). As relapse is frequent, adjuvant therapies such as postoperative chemoradiotherapy have been tried. In Chile, some hospitals adopted Macdonald's study as a protocol for the treatment of GC. To determine the methodological quality and the internal and external validity of the Macdonald study, three instruments that assess methodological quality were applied. A critical appraisal was done and the internal and external validity of the methodological quality were analyzed with two scales: MINCIR (Methodology and Research in Surgery), valid for therapy studies, and CONSORT (Consolidated Standards of Reporting Trials), valid for randomized controlled trials (RCT). Guides and scales were applied by 5 researchers with training in clinical epidemiology. The reader's guide verified that the Macdonald study was not designed to answer a clearly defined question. There was random assignment, but the method used is not described and the patients were not followed until the end of the study (36% of the group receiving surgery plus chemoradiotherapy did not complete treatment). The MINCIR scale identified a multicentric RCT, not blinded, with an unclear randomization sequence, erroneous sample size estimation, vague objectives and no exclusion criteria. The CONSORT system demonstrated the lack of a working hypothesis and specific objectives, the absence of exclusion criteria and of identification of the primary variable, an imprecise estimation of sample size, ambiguities in the randomization process, no blinding, an absence of statistical adjustment and the omission of a subgroup analysis. The instruments applied demonstrated methodological shortcomings that compromise the internal and external validity of the study.
Khalili, Robabe; Sirati Nir, Masoud; Ebadi, Abbas; Tavallai, Abbas; Habibi, Mehdi
2017-04-01
The Cohen Perceived Stress Scale is widely used in various countries. The present study evaluated the validity and reliability of the Cohen 10-item Perceived Stress Scale (PSS-10) in assessing tension headache, migraine, and stress-related diseases in Iran. This study is a methodological, cross-sectional descriptive investigation of 100 patients with chronic headache admitted to the pain clinic of Baqiyatallah Educational and Therapeutic Center. Convenience sampling was used for subject selection. PSS psychometric properties were evaluated in two stages. First, the standard scale was translated. Then, the face, content, and construct validity of the translated version were determined. The average age of participants was 38 years with a standard deviation (SD) of 13.2. As for stress levels, 12% were within the normal range, 36% had an intermediate level, and 52% had a high level of stress. The face and content validity of the scale were satisfactory, and the KMO coefficient was 0.82. Bartlett's test yielded 0.327, which was statistically significant (p < 0.0001), supporting the adequacy of the sample. In factor analysis of the scale, the two factors of "coping" and "distress" were identified. A Cronbach's alpha coefficient of 0.72 was obtained, confirming the internal consistency of the scale, and its stability was confirmed through repeated-measure tests (0.93). The Persian PSS-10 has good internal consistency and reliability. The availability of a validated Persian PSS-10 would help establish the link between stress and chronic headache. Copyright © 2017 Elsevier B.V. All rights reserved.
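For reference, a Cronbach's alpha like the 0.72 reported above is computed from the item variances and the variance of respondents' total scores. A minimal sketch with invented toy data (not the study's responses):

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha from a list of item-score columns
    (each column holds one item's scores across all respondents)."""
    k = len(items)
    item_vars = sum(statistics.variance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]   # per-respondent totals
    return k / (k - 1) * (1 - item_vars / statistics.variance(totals))

# Toy data: 4 items, 6 respondents, fairly consistent responses
items = [
    [3, 4, 2, 5, 4, 3],
    [3, 5, 2, 4, 4, 2],
    [2, 4, 3, 5, 3, 3],
    [3, 4, 2, 5, 5, 3],
]
print(round(cronbach_alpha(items), 2))  # → 0.91
```

Values above roughly 0.7 are conventionally read as acceptable internal consistency, which is the threshold the PSS-10 result meets.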
Wen, Kuang-Yi; Gustafson, David H; Hawkins, Robert P; Brennan, Patricia F; Dinauer, Susan; Johnson, Pauley R; Siegler, Tracy
2010-01-01
To develop and validate the Readiness for Implementation Model (RIM). This model predicts a healthcare organization's potential for success in implementing an interactive health communication system (IHCS). The model consists of seven weighted factors, with each factor containing five to seven elements. Two decision-analytic approaches, self-explicated and conjoint analysis, were used to measure the weights of the RIM with a sample of 410 experts. The RIM model with weights was then validated in a prospective study of 25 IHCS implementation cases. Orthogonal main effects design was used to develop 700 conjoint-analysis profiles, which varied on seven factors. Each of the 410 experts rated the importance and desirability of the factors and their levels, as well as a set of 10 different profiles. For the prospective 25-case validation, three time-repeated measures of the RIM scores were collected for comparison with the implementation outcomes. Two of the seven factors, 'organizational motivation' and 'meeting user needs,' were found to be most important in predicting implementation readiness. No statistically significant difference was found in the predictive validity of the two approaches (self-explicated and conjoint analysis). The RIM was a better predictor for the 1-year implementation outcome than the half-year outcome. The expert sample, the order of the survey tasks, the additive model, and basing the RIM cut-off score on experience are possible limitations of the study. The RIM needs to be empirically evaluated in institutions adopting IHCS and sustaining the system in the long term.
Nedjat, Saharnaz; Montazeri, Ali; Holakouie, Kourosh; Mohammad, Kazem; Majdzadeh, Reza
2008-03-21
The objective of the current study was to translate and validate the Iranian version of the WHOQOL-BREF. A forward-backward translation procedure was followed to develop the Iranian version of the questionnaire. A stratified random sample of individuals aged 18 and over completed the questionnaire in Tehran, Iran. Psychometric properties of the instrument including reliability (internal consistency and test-retest analysis), validity (known-groups comparison and convergent validity), and items' correlation with their hypothesized domains were assessed. In all, 1164 individuals entered into the study. The mean age of the participants was 36.6 (SD = 13.2) years, and the mean duration of their formal education was 10.7 (SD = 4.4) years. In general the questionnaire was well received, and all domains met the minimum reliability standards (Cronbach's alpha and intra-class correlation > 0.7), except for social relationships (alpha = 0.55). In the known-groups comparison analysis, the results indicated that the questionnaire discriminated well between subgroups of the study sample differing in their health status. Since the WHOQOL-BREF demonstrated statistically significant correlation with the Iranian version of the SF-36, as expected, the convergent validity of the questionnaire was found to be desirable. The correlation matrix also showed satisfactory results in all domains except for social relationships. This study has provided some preliminary evidence of the reliability and validity of the WHOQOL-BREF to be used in Iran, though further research is required to address the problems of reliability in one of the dimensions and to examine the instrument's factor structure.
Marucci, Gianluca; Pezzotti, Patrizio; Pozio, Edoardo
2009-02-23
To control Trichinella spp. infection in the European Union, all slaughtered pigs should be tested by one of the approved digestion methods described in EU directive 2075/2005. The aim of the present work was to evaluate, by a ring trial, the sensitivity of the digestion method used at the National Reference Laboratories for Parasites (NRLP), which are responsible for the quality of the detection method in their own country. Of the 27 EU countries, only three (Hungary, Luxembourg and Malta) did not participate in the ring trial. Each participating laboratory received 10 samples of 100 g of minced pork: 3 samples containing 3-5 larvae, 3 samples with 10-20 larvae, 3 samples with 30-50 larvae, and one negative control. Each positive sample contained living Trichinella spiralis larvae without the collagen capsule, obtained by partial artificial digestion of muscle tissue from infected mice. No false positive sample was found in any laboratory, whereas nine laboratories (37.5%) failed to detect some positive samples, with the percentage of false negatives ranging from 11 to 100%. The variation between expected and reported larval counts observed among the participating laboratories was statistically significant. There was a direct correlation between the consistency of the results and the use of a validated/accredited digestion method. Conversely, there was no correlation between the consistency of the results and the number of digestions performed yearly by the NRLP. These results support the importance of validating the test.
NASA Astrophysics Data System (ADS)
Singh, Gurjeet; Panda, Rabindra K.; Mohanty, Binayak P.; Jana, Raghavendra B.
2016-05-01
Strategic ground-based sampling of soil moisture across multiple scales is necessary to validate remotely sensed quantities such as NASA's Soil Moisture Active Passive (SMAP) product. In the present study, in-situ soil moisture data were collected at two nested scale extents (0.5 km and 3 km) to understand the trend of soil moisture variability across these scales. This ground-based soil moisture sampling was conducted in the 500 km2 Rana watershed situated in eastern India. The study area is characterized by a sub-humid, sub-tropical climate with average annual rainfall of about 1456 mm. Three 3x3 km square grids were sampled intensively once a day at 49 locations each, at a spacing of 0.5 km. These intensive sampling locations were selected on the basis of different topography, soil properties and vegetation characteristics. In addition, measurements were also made at 9 locations around each intensive sampling grid at 3 km spacing to cover a 9x9 km square grid. Intensive fine-scale soil moisture sampling as well as coarser-scale sampling was carried out using both impedance probes and gravimetric analyses in the study watershed. The ground-based soil moisture samplings were conducted during the day, concurrent with the SMAP descending overpass. An analysis of soil moisture spatial variability in terms of the areal mean soil moisture and the statistics of higher-order moments, i.e., the standard deviation and the coefficient of variation, is presented. Results showed that the standard deviation and coefficient of variation of measured soil moisture decreased with extent scale by increasing mean soil moisture.
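The variability statistics reported here are straightforward to compute from a set of point samples. A minimal sketch, with invented volumetric moisture values standing in for the field measurements, illustrating the reported tendency of the coefficient of variation to fall as the areal mean rises:

```python
import statistics

def moisture_stats(theta):
    """Areal mean, standard deviation, and coefficient of variation (CV)
    for a list of volumetric soil moisture samples (m^3/m^3)."""
    mean = statistics.mean(theta)
    sd = statistics.stdev(theta)
    return mean, sd, sd / mean

# Hypothetical dry-day and wet-day grids: the wetter field shows a higher
# mean and a lower CV, consistent with the study's finding.
dry = [0.08, 0.12, 0.05, 0.15, 0.10, 0.07]
wet = [0.33, 0.35, 0.31, 0.36, 0.34, 0.32]
for label, day in (("dry", dry), ("wet", wet)):
    m, sd, cv = moisture_stats(day)
    print(label, round(m, 3), round(sd, 3), round(cv, 2))
```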
Virtual Model Validation of Complex Multiscale Systems: Applications to Nonlinear Elastostatics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oden, John Tinsley; Prudencio, Ernest E.; Bauman, Paul T.
We propose a virtual statistical validation process as an aid to the design of experiments for the validation of phenomenological models of the behavior of material bodies, with focus on those cases in which knowledge of the fabrication process used to manufacture the body can provide information on the micro-molecular-scale properties underlying macroscale behavior. One example is given by models of elastomeric solids fabricated using polymerization processes. We describe a framework for model validation that involves Bayesian updates of parameters in statistical calibration and validation phases. The process enables the quantification of uncertainty in quantities of interest (QoIs) and the determination of model consistency using tools of statistical information theory. We assert that microscale information drawn from molecular models of the fabrication of the body provides a valuable source of prior information on parameters as well as a means for estimating model bias and designing virtual validation experiments to provide information gain over calibration posteriors.
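The calibration-then-validation Bayesian updating described here can be illustrated, in its simplest conjugate form, by sequentially updating a Normal prior on a scalar model parameter. This is a didactic stand-in (one parameter, known noise variance), not the authors' framework:

```python
def bayes_update_normal_mean(prior_mu, prior_var, obs, obs_var):
    """One conjugate update of a Normal prior on a model parameter,
    given a single observation with known noise variance."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mu = post_var * (prior_mu / prior_var + obs / obs_var)
    return post_mu, post_var

# Calibration phase: update a vague prior with a calibration observation,
# then use that posterior as the prior for the validation-phase datum.
mu, var = 0.0, 4.0                                       # vague prior
mu, var = bayes_update_normal_mean(mu, var, 1.2, 1.0)    # calibration datum
mu, var = bayes_update_normal_mean(mu, var, 1.0, 1.0)    # validation datum
print(round(mu, 3), round(var, 3))  # → 0.978 0.444
```

The shrinking posterior variance is the "information gain over calibration posteriors" in miniature; the paper's framework does this for full model parameter distributions informed by molecular-scale priors.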
Hagen, Inger Hilde; Svindseth, Marit Følsvik; Nesset, Erik; Orner, Roderick; Iversen, Valentina Cabral
2018-03-27
For parents, the experience of having their new-born admitted to a neonatal intensive care unit (NICU) can be extremely distressing, and the subsequent risk of post-incident adjustment difficulties is increased for parents, siblings, and affected families. Patient and next-of-kin satisfaction surveys provide key indicators of quality in health care, yet methodically constructed and validated survey tools are in short supply and parents' experiences of care in neonatal intensive care units are under-researched. This paper reports a validation of the Neonatal Satisfaction Survey (NSS-8) in six Norwegian NICUs. Parents' survey returns were collected using the Neonatal Satisfaction Survey (NSS-13). Data quality and psychometric properties were systematically assessed using exploratory factor analysis and tests of internal consistency, reliability, construct, convergent and discriminant validity. Each set of hospital returns was subjected to an attrition analysis before an overall satisfaction rate was calculated. The survey sample of 568 parents represents 45% of the total eligible population for the period of the study. Missing data accounted for 1.1% of all returns. Attrition analysis shows congruence between the sample and the total population. Exploratory factor analysis identified eight factors of concern to parents: "Care and Treatment", "Doctors", "Visits", "Information", "Facilities", "Parents' Anxiety", "Discharge" and "Sibling Visits". All factors showed satisfactory internal consistency and good reliability (Cronbach's alpha ranged from 0.70 to 0.94; for the whole 51-item scale, α = 0.95). Convergent validity, assessed using Spearman's rank correlations between the eight factors and a question measuring overall satisfaction, was significant for all factors. Discriminant validity was established for all factors. Overall satisfaction rates ranged from 86 to 90%, while satisfaction on each of the eight factors varied between 64 and 86%.
The NSS-8 questionnaire is a valid and reliable scale for measuring parents' assessment of quality of care in NICU. Statistical analysis confirms the instrument's capacity to gauge parents' experiences of NICU. Further research is indicated to validate the survey questionnaire in other Nordic countries and beyond.
Khalid, Tanzeela; White, Paul; De Lacy Costello, Ben; Persad, Raj; Ewen, Richard; Johnson, Emmanuel; Probert, Chris S.; Ratcliffe, Norman
2013-01-01
There is a need to reduce the number of cystoscopies on patients with haematuria. Presently there are no reliable biomarkers to screen for bladder cancer. In this paper, we evaluate a new, simple, in-house fabricated GC-sensor device in the diagnosis of bladder cancer based on volatiles. Sensor outputs from 98 urine samples were used to build and test diagnostic models. Samples were taken from 24 patients with transitional (urothelial) cell carcinoma (age 27-91 years, median 71 years) and 74 controls presenting with urological symptoms, but without a urological malignancy (age 29-86 years, median 64 years); results were analysed using two statistical approaches to assess the robustness of the methodology. A two-group linear discriminant analysis method using a total of 9 time points (which equates to 9 biomarkers) correctly assigned 24/24 (100%) of cancer cases and 70/74 (94.6%) of controls. Under leave-one-out cross-validation, 23/24 (95.8%) of cancer cases were correctly predicted, along with 69/74 (93.2%) of controls. For partial least squares discriminant analysis, the correct leave-one-out cross-validation prediction values were 95.8% (cancer cases) and 94.6% (controls). These data are an improvement on those reported by other groups studying headspace gases and also superior to current clinical techniques. This new device shows potential for the diagnosis of bladder cancer, but the data must be reproduced in a larger study. PMID:23861976
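Leave-one-out cross-validation, as used in this study, refits the classifier with each sample held out in turn and scores the held-out prediction. The sketch below uses a nearest-centroid rule as a simplified stand-in for the paper's linear discriminant analysis; the two-feature "sensor readouts" are invented:

```python
import math

def nearest_centroid_loocv(samples, labels):
    """Leave-one-out cross-validation accuracy of a nearest-centroid
    classifier (a simplified stand-in for two-group LDA)."""
    correct = 0
    for i in range(len(samples)):
        # Rebuild class centroids from everything except sample i
        groups = {}
        for j, (x, y) in enumerate(zip(samples, labels)):
            if j != i:
                groups.setdefault(y, []).append(x)
        pred = min(
            groups,
            key=lambda c: math.dist(
                samples[i],
                [sum(col) / len(col) for col in zip(*groups[c])],
            ),
        )
        correct += pred == labels[i]
    return correct / len(samples)

# Toy 2-feature readouts forming two clean clusters
data = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (3.0, 3.1), (3.1, 2.9), (2.9, 3.0)]
labels = ["control"] * 3 + ["cancer"] * 3
print(nearest_centroid_loocv(data, labels))  # → 1.0
```

With 98 samples and 9 biomarkers the mechanics are the same: 98 refits, each scored on the one held-out sample, yielding the 95.8%/93.2% figures quoted above.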