Sample records for reliability statistics

  1. Making statistical inferences about software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1988-01-01

    Failure times of software undergoing random debugging can be modelled as order statistics of independent but nonidentically distributed exponential random variables. Using this model inferences can be made about current reliability and, if debugging continues, future reliability. This model also shows the difficulty inherent in statistical verification of very highly reliable software such as that used by digital avionics in commercial aircraft.
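
    For illustration only (not part of the record): a minimal Python sketch of the model described above, in which observed failure times are the order statistics of independent exponential lifetimes with unequal rates. The per-fault rates and mission length are hypothetical.

      import numpy as np

      rng = np.random.default_rng(0)
      rates = np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])   # hypothetical per-fault rates

      def prob_survive_next(rates, k, t_m, n_sim=100_000):
          """P(no failure within t_m after the k-th fault is fixed), k >= 1."""
          # One exponential lifetime per fault; the observed failure log is
          # the sorted set of these lifetimes (their order statistics).
          x = rng.exponential(1.0 / rates, size=(n_sim, len(rates)))
          t = np.sort(x, axis=1)
          gaps = t[:, k] - t[:, k - 1]   # interval from k-th to (k+1)-th failure
          return (gaps > t_m).mean()

      for k in (1, 3, 5):
          print(f"after {k} fixes: P(survive 0.1 time units) = "
                f"{prob_survive_next(rates, k, 0.1):.3f}")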

  2. Statistical modelling of software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1991-01-01

    During the six-month period from 1 April 1991 to 30 September 1991 the following research papers in statistical modeling of software reliability appeared: (1) A Nonparametric Software Reliability Growth Model; (2) On the Use and the Performance of Software Reliability Growth Models; (3) Research and Development Issues in Software Reliability Engineering; (4) Special Issues on Software; and (5) Software Reliability and Safety.

  3. Statistics-related and reliability-physics-related failure processes in electronics devices and products

    NASA Astrophysics Data System (ADS)

    Suhir, E.

    2014-05-01

The well-known and widely used experimental reliability "passport" of a mass-manufactured electronic or photonic product, the bathtub curve, reflects the combined contribution of the statistics-related and reliability-physics (physics-of-failure) related processes. As time progresses, the first process results in a decreasing failure rate, while the second process, associated with material aging and degradation, leads to an increasing failure rate. An attempt has been made in this analysis to assess the level of the reliability-physics-related aging process from the available bathtub curve (diagram). It is assumed that the products of interest underwent burn-in testing and that the obtained bathtub curve therefore does not contain the infant-mortality portion. It has also been assumed that the two random processes in question are statistically independent, and that the failure rate of the physical process can be obtained by deducting the theoretically assessed statistical failure rate from the bathtub curve ordinates. In the numerical example carried out, the Rayleigh distribution was used for the statistical failure rate, for the sake of a relatively simple illustration. The developed methodology can be used in reliability-physics evaluations when there is a need to better understand the roles of the statistics-related and reliability-physics-related irreversible random processes. Future work should include investigations of how the powerful and flexible methods of statistical mechanics can be effectively employed, in addition to reliability-physics techniques, to model the operational reliability of electronic and photonic products.
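
    A rough numerical sketch of the subtraction idea described above (all curves are hypothetical; the paper's exact parameterisation is not reproduced here). Under the independence assumption the two failure rates add, so the physics-related rate is the bathtub ordinate minus the assumed Rayleigh-shaped statistical rate.

      import numpy as np

      t = np.linspace(0.0, 10.0, 201)            # operating time, arbitrary units

      # Hypothetical post-burn-in bathtub ordinates: flat floor plus wear-out rise.
      bathtub = 0.02 + 0.004 * t**2

      # Statistics-related rate modelled with a Rayleigh-shaped curve (assumption).
      sigma = 4.0
      statistical = 0.3 * (t / sigma**2) * np.exp(-(t**2) / (2 * sigma**2))

      # Independent processes => rates add; the aging/degradation part is the rest.
      # np.clip guards against small negative values from the crude model choices.
      physics = np.clip(bathtub - statistical, 0.0, None)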

  4. Statistical modeling of software reliability

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.

    1992-01-01

    This working paper discusses the statistical simulation part of a controlled software development experiment being conducted under the direction of the System Validation Methods Branch, Information Systems Division, NASA Langley Research Center. The experiment uses guidance and control software (GCS) aboard a fictitious planetary landing spacecraft: real-time control software operating on a transient mission. Software execution is simulated to study the statistical aspects of reliability and other failure characteristics of the software during development, testing, and random usage. Quantification of software reliability is a major goal. Various reliability concepts are discussed. Experiments are described for performing simulations and collecting appropriate simulated software performance and failure data. This data is then used to make statistical inferences about the quality of the software development and verification processes as well as inferences about the reliability of software versions and reliability growth under random testing and debugging.

  5. The reliability of cause-of-death coding in The Netherlands.

    PubMed

    Harteloh, Peter; de Bruin, Kim; Kardaun, Jan

    2010-08-01

Cause-of-death statistics are a major source of information for epidemiological research or policy decisions. Information on the reliability of these statistics is important for interpreting trends in time or differences between populations. Variations in coding the underlying cause of death could hinder the attribution of observed differences to determinants of health. Therefore we studied the reliability of cause-of-death statistics in The Netherlands. We performed a double coding study. Death certificates from the month of May 2005 were coded again in 2007. Each death certificate was coded manually by four coders. Reliability was measured by calculating agreement between coders (intercoder agreement) and by calculating the consistency of each individual coder in time (intracoder agreement). Our analysis covered 10,833 death certificates. The intercoder agreement of four coders on the underlying cause of death was 78%. In 2.2% of the cases coders agreed on a change of the code assigned in 2005. The (mean) intracoder agreement of four coders was 89%. Agreement was associated with the specificity of the ICD-10 code (chapter, three digits, four digits), the age of the deceased, the number of coders and the number of diseases reported on the death certificate. The reliability of cause-of-death statistics turned out to be high (>90%) for major causes of death such as cancers and acute myocardial infarction. For chronic diseases, such as diabetes and renal insufficiency, reliability was low (<70%). The reliability of cause-of-death statistics varies by ICD-10 code/chapter. A statistical office should provide coders with (additional) rules for coding diseases with a low reliability and evaluate these rules regularly. Users of cause-of-death statistics should exercise caution when interpreting causes of death with a low reliability. Studies of reliability should take into account the number of coders involved and the number of codes on a death certificate.

  6. Surveys Assessing Students' Attitudes toward Statistics: A Systematic Review of Validity and Reliability

    ERIC Educational Resources Information Center

    Nolan, Meaghan M.; Beran, Tanya; Hecker, Kent G.

    2012-01-01

    Students with positive attitudes toward statistics are likely to show strong academic performance in statistics courses. Multiple surveys measuring students' attitudes toward statistics exist; however, a comparison of the validity and reliability of interpretations based on their scores is needed. A systematic review of relevant electronic…

  7. Application of a truncated normal failure distribution in reliability testing

    NASA Technical Reports Server (NTRS)

    Groves, C., Jr.

    1968-01-01

    Statistical truncated normal distribution function is applied as a time-to-failure distribution function in equipment reliability estimations. Age-dependent characteristics of the truncated function provide a basis for formulating a system of high-reliability testing that effectively merges statistical, engineering, and cost considerations.
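
    A small sketch of the idea (all values hypothetical), using SciPy's truncated normal as the time-to-failure law; the survival function gives reliability and the pdf/sf ratio gives the age-dependent failure rate.

      from scipy import stats

      # Hypothetical time to failure: normal with mean 1000 h, sd 200 h,
      # truncated at zero (negative lifetimes are impossible).
      mu, sd = 1000.0, 200.0
      a = (0.0 - mu) / sd          # lower truncation bound, standardized
      b = float("inf")             # no upper truncation
      ttf = stats.truncnorm(a, b, loc=mu, scale=sd)

      t = 600.0
      reliability = ttf.sf(t)                 # R(t) = P(T > t)
      hazard = ttf.pdf(t) / ttf.sf(t)         # age-dependent failure rate
      print(f"R({t:.0f} h) = {reliability:.4f}, hazard = {hazard:.2e} per h")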

  8. Texture and haptic cues in slant discrimination: reliability-based cue weighting without statistically optimal cue combination

    NASA Astrophysics Data System (ADS)

    Rosas, Pedro; Wagemans, Johan; Ernst, Marc O.; Wichmann, Felix A.

    2005-05-01

    A number of models of depth-cue combination suggest that the final depth percept results from a weighted average of independent depth estimates based on the different cues available. The weight of each cue in such an average is thought to depend on the reliability of each cue. In principle, such a depth estimation could be statistically optimal in the sense of producing the minimum-variance unbiased estimator that can be constructed from the available information. Here we test such models by using visual and haptic depth information. Different texture types produce differences in slant-discrimination performance, thus providing a means for testing a reliability-sensitive cue-combination model with texture as one of the cues to slant. Our results show that the weights for the cues were generally sensitive to their reliability but fell short of statistically optimal combination - we find reliability-based reweighting but not statistically optimal cue combination.
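
    The reliability-weighted average these models assume can be written in a few lines (numbers hypothetical): the minimum-variance combination weights each cue by its inverse variance.

      import numpy as np

      # Hypothetical slant estimates (degrees) and their variances for two cues.
      slant_texture, var_texture = 32.0, 9.0
      slant_haptic,  var_haptic  = 26.0, 4.0

      # Statistically optimal (minimum-variance) linear combination weights
      # are proportional to the reciprocal variances of the single-cue estimates.
      w = np.array([1 / var_texture, 1 / var_haptic])
      w /= w.sum()

      combined = w @ np.array([slant_texture, slant_haptic])
      combined_var = 1 / (1 / var_texture + 1 / var_haptic)  # <= min single-cue var
      print(f"weights = {w.round(3)}, combined = {combined:.2f} deg, "
            f"var = {combined_var:.2f}")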

  9. The Estimation of the IRT Reliability Coefficient and Its Lower and Upper Bounds, with Comparisons to CTT Reliability Statistics

    ERIC Educational Resources Information Center

    Kim, Seonghoon; Feldt, Leonard S.

    2010-01-01

    The primary purpose of this study is to investigate the mathematical characteristics of the test reliability coefficient rho[subscript XX'] as a function of item response theory (IRT) parameters and present the lower and upper bounds of the coefficient. Another purpose is to examine relative performances of the IRT reliability statistics and two…

  10. An overview of the mathematical and statistical analysis component of RICIS

    NASA Technical Reports Server (NTRS)

    Hallum, Cecil R.

    1987-01-01

    Mathematical and statistical analysis components of RICIS (Research Institute for Computing and Information Systems) can be used in the following problem areas: (1) quantification and measurement of software reliability; (2) assessment of changes in software reliability over time (reliability growth); (3) analysis of software-failure data; and (4) decision logic for whether to continue or stop testing software. Other areas of interest to NASA/JSC where mathematical and statistical analysis can be successfully employed include: math modeling of physical systems, simulation, statistical data reduction, evaluation methods, optimization, algorithm development, and mathematical methods in signal processing.

  11. Ambiguities and conflicting results: the limitations of the kappa statistic in establishing the interrater reliability of the Irish nursing minimum data set for mental health: a discussion paper.

    PubMed

    Morris, Roisin; MacNeela, Padraig; Scott, Anne; Treacy, Pearl; Hyde, Abbey; O'Brien, Julian; Lehwaldt, Daniella; Byrne, Anne; Drennan, Jonathan

    2008-04-01

In a study to establish the interrater reliability of the Irish Nursing Minimum Data Set (I-NMDS) for mental health, difficulties relating to the choice of reliability test statistic were encountered. The objective of this paper is to highlight the difficulties associated with testing interrater reliability for an ordinal scale using a relatively homogeneous sample and the recommended kw statistic. One pair of mental health nurses completed the I-NMDS for mental health for a total of 30 clients attending a mental health day centre over a two-week period. Data were analysed using the kw and percentage agreement statistics. A total of 34 of the 38 I-NMDS for mental health variables with lower than acceptable kw reliability scores achieved acceptable levels of reliability according to their percentage agreement scores. The study findings implied that, owing to the homogeneity of the sample, low variability within the data resulted in the 'base rate problem' associated with the use of the kw statistic. Conclusions point to the interpretation of kw in tandem with percentage agreement scores. Suggestions that kw scores were low due to chance agreement and that one should strive to use a study sample with known variability are queried.
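
    The 'base rate problem' is easy to reproduce with a sketch (data hypothetical; uses scikit-learn's kappa): in a homogeneous sample, percent agreement stays high while kappa collapses.

      import numpy as np
      from sklearn.metrics import cohen_kappa_score

      # Hypothetical homogeneous sample: 28 of 30 clients rated 0 by both
      # raters, with two disagreements; almost no variability in the trait.
      rater_a = np.array([0] * 28 + [1, 0])
      rater_b = np.array([0] * 28 + [0, 1])

      percent_agreement = (rater_a == rater_b).mean()   # 28/30 ~ 0.93
      kappa = cohen_kappa_score(rater_a, rater_b)       # near zero: base rate problem
      print(f"agreement = {percent_agreement:.2f}, kappa = {kappa:.2f}")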

  12. Selection and Reporting of Statistical Methods to Assess Reliability of a Diagnostic Test: Conformity to Recommended Methods in a Peer-Reviewed Journal

    PubMed Central

    Park, Ji Eun; Han, Kyunghwa; Sung, Yu Sub; Chung, Mi Sun; Koo, Hyun Jung; Yoon, Hee Mang; Choi, Young Jun; Lee, Seung Soo; Kim, Kyung Won; Shin, Youngbin; An, Suah; Cho, Hyo-Min

    2017-01-01

    Objective To evaluate the frequency and adequacy of statistical analyses in a general radiology journal when reporting a reliability analysis for a diagnostic test. Materials and Methods Sixty-three studies of diagnostic test accuracy (DTA) and 36 studies reporting reliability analyses published in the Korean Journal of Radiology between 2012 and 2016 were analyzed. Studies were judged using the methodological guidelines of the Radiological Society of North America-Quantitative Imaging Biomarkers Alliance (RSNA-QIBA), and COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative. DTA studies were evaluated by nine editorial board members of the journal. Reliability studies were evaluated by study reviewers experienced with reliability analysis. Results Thirty-one (49.2%) of the 63 DTA studies did not include a reliability analysis when deemed necessary. Among the 36 reliability studies, proper statistical methods were used in all (5/5) studies dealing with dichotomous/nominal data, 46.7% (7/15) of studies dealing with ordinal data, and 95.2% (20/21) of studies dealing with continuous data. Statistical methods were described in sufficient detail regarding weighted kappa in 28.6% (2/7) of studies and regarding the model and assumptions of intraclass correlation coefficient in 35.3% (6/17) and 29.4% (5/17) of studies, respectively. Reliability parameters were used as if they were agreement parameters in 23.1% (3/13) of studies. Reproducibility and repeatability were used incorrectly in 20% (3/15) of studies. Conclusion Greater attention to the importance of reporting reliability, thorough description of the related statistical methods, efforts not to neglect agreement parameters, and better use of relevant terminology is necessary. PMID:29089821

  13. Reliability approach to rotating-component design. [fatigue life and stress concentration

    NASA Technical Reports Server (NTRS)

    Kececioglu, D. B.; Lalli, V. R.

    1975-01-01

    A probabilistic methodology for designing rotating mechanical components using reliability to relate stress to strength is explained. The experimental test machines and data obtained for steel to verify this methodology are described. A sample mechanical rotating component design problem is solved by comparing a deterministic design method with the new design-by reliability approach. The new method shows that a smaller size and weight can be obtained for specified rotating shaft life and reliability, and uses the statistical distortion-energy theory with statistical fatigue diagrams for optimum shaft design. Statistical methods are presented for (1) determining strength distributions for steel experimentally, (2) determining a failure theory for stress variations in a rotating shaft subjected to reversed bending and steady torque, and (3) relating strength to stress by reliability.

  14. Uncertainties in obtaining high reliability from stress-strength models

    NASA Technical Reports Server (NTRS)

    Neal, Donald M.; Matthews, William T.; Vangel, Mark G.

    1992-01-01

There has been a recent interest in determining high statistical reliability in risk assessment of aircraft components. The potential consequences are identified of incorrectly assuming a particular statistical distribution for stress or strength data used in obtaining the high reliability values. The computation of the reliability is defined as the probability of the strength being greater than the stress over the range of stress values. This method is often referred to as the stress-strength model. A sensitivity analysis was performed involving a comparison of reliability results in order to evaluate the effects of assuming specific statistical distributions. Both known population distributions, and those that differed slightly from the known, were considered. Results showed substantial differences in reliability estimates even for almost nondetectable differences in the assumed distributions. These differences represent a potential problem in using the stress-strength model for high reliability computations, since in practice it is impossible to ever know the exact (population) distribution. An alternative reliability computation procedure is examined involving determination of a lower bound on the reliability values using extreme value distributions. This procedure reduces the possibility of obtaining nonconservative reliability estimates. Results indicated the method can provide conservative bounds when computing high reliability.
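
    A Monte Carlo sketch of the stress-strength model and its sensitivity (all distributions hypothetical): the estimate of P(strength > stress) shifts even when the alternative strength distribution has the same mean and variance but a slightly heavier lower tail.

      import numpy as np

      rng = np.random.default_rng(1)
      n = 1_000_000

      # Hypothetical stress and strength populations (units arbitrary).
      stress     = rng.normal(50.0, 5.0, n)          # assumed normal stress
      strength_n = rng.normal(80.0, 6.0, n)          # assumed normal strength
      r_normal = (strength_n > stress).mean()

      # Same mean and variance, but Student-t tails (variance rescaled to 36):
      # a small, hard-to-detect distributional change moves the estimate.
      strength_t = 80.0 + 6.0 * rng.standard_t(df=5, size=n) / np.sqrt(5 / 3)
      r_heavy = (strength_t > stress).mean()

      print(f"R (normal strength)       = {r_normal:.6f}")
      print(f"R (heavy-tailed strength) = {r_heavy:.6f}")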

  15. A systematic review of statistical methods used to test for reliability of medical instruments measuring continuous variables.

    PubMed

    Zaki, Rafdzah; Bulgiba, Awang; Nordin, Noorhaire; Azina Ismail, Noor

    2013-06-01

Reliability measures precision, or the extent to which test results can be replicated. This is the first ever systematic review to identify statistical methods used to measure the reliability of equipment measuring continuous variables. This study also aims to highlight inappropriate statistical methods used in reliability analyses and their implications for medical practice. In 2010, five electronic databases were searched for reliability studies published between 2007 and 2009. A total of 5,795 titles were initially identified. Only 282 titles were potentially related, and 42 finally fitted the inclusion criteria. The intra-class correlation coefficient (ICC) was the most popular method, used in 25 (60%) studies, followed by comparison of means (8 studies, or 19%). Of the 25 studies using the ICC, only 7 (28%) reported the confidence intervals and the type of ICC used. Most studies (71%) also tested the agreement of instruments. This review finds that the intra-class correlation coefficient is the most popular method used to assess the reliability of medical instruments measuring continuous outcomes. There are also inappropriate applications and interpretations of statistical methods in some studies. It is important for medical researchers to be aware of this issue and to be able to correctly perform analyses in reliability studies.
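
    As a reference point for the ICC discussed above, a from-scratch sketch of Shrout and Fleiss's ICC(2,1) (two-way random effects, absolute agreement, single measurement); the measurements are hypothetical.

      import numpy as np

      def icc_2_1(x):
          """Shrout & Fleiss ICC(2,1) from ANOVA mean squares; x: subjects x raters."""
          x = np.asarray(x, dtype=float)
          n, k = x.shape
          grand = x.mean()
          ms_rows = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # subjects
          ms_cols = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # raters
          sse = ((x - x.mean(axis=1, keepdims=True)
                    - x.mean(axis=0, keepdims=True) + grand) ** 2).sum()
          ms_err = sse / ((n - 1) * (k - 1))
          return (ms_rows - ms_err) / (
              ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

      # Hypothetical repeated measurements: 5 patients x 2 trials of one instrument.
      ratings = [[10.1, 10.3], [12.0, 11.8], [9.5, 9.9], [14.2, 14.0], [11.1, 11.4]]
      print(f"ICC(2,1) = {icc_2_1(ratings):.3f}")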

  16. Reliability and One-Year Stability of the PIN3 Neighborhood Environmental Audit in Urban and Rural Neighborhoods.

    PubMed

    Porter, Anna K; Wen, Fang; Herring, Amy H; Rodríguez, Daniel A; Messer, Lynne C; Laraia, Barbara A; Evenson, Kelly R

    2018-06-01

    Reliable and stable environmental audit instruments are needed to successfully identify the physical and social attributes that may influence physical activity. This study described the reliability and stability of the PIN3 environmental audit instrument in both urban and rural neighborhoods. Four randomly sampled road segments in and around a one-quarter mile buffer of participants' residences from the Pregnancy, Infection, and Nutrition (PIN3) study were rated twice, approximately 2 weeks apart. One year later, 253 of the year 1 sampled roads were re-audited. The instrument included 43 measures that resulted in 73 item scores for calculation of percent overall agreement, kappa statistics, and log-linear models. For same-day reliability, 81% of items had moderate to outstanding kappa statistics (kappas ≥ 0.4). Two-week reliability was slightly lower, with 77% of items having moderate to outstanding agreement using kappa statistics. One-year stability had 68% of items showing moderate to outstanding agreement using kappa statistics. The reliability of the audit measures was largely consistent when comparing urban to rural locations, with only 8% of items exhibiting significant differences (α < 0.05) by urbanicity. The PIN3 instrument is a reliable and stable audit tool for studies assessing neighborhood attributes in urban and rural environments.

  17. A standard for test reliability in group research.

    PubMed

    Ellis, Jules L

    2013-03-01

Many authors adhere to the rule that test reliabilities should be at least .70 or .80 in group research. This article introduces a new standard according to which reliabilities can be evaluated. This standard is based on the costs or time of the experiment and of administering the test. For example, if test administration costs are 7% of the total experimental costs, the efficient value of the reliability is .93. If the actual reliability of a test is equal to this efficient reliability, the test size maximizes the statistical power of the experiment, given the costs. As a standard in experimental research, it is proposed that the reliability of the dependent variable be close to the efficient reliability. Adhering to this standard will enhance the statistical power and reduce the costs of experiments.

  18. Reliability and validity of a nutrition and physical activity environmental self-assessment for child care

    PubMed Central

    Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S

    2007-01-01

    Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC) instrument to researchers and practitioners interested in conducting healthy weight intervention in child care. However, a more robust, less subjective measure would be more appropriate for researchers seeking an outcome measure to assess intervention impact. PMID:17615078

  19. Notes on numerical reliability of several statistical analysis programs

    USGS Publications Warehouse

    Landwehr, J.M.; Tasker, Gary D.

    1999-01-01

    This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
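
    The kind of defect such a benchmark exposes can be shown in a few lines of NumPy (data here are NASTY-style but hypothetical, not the actual ANASTY values): a textbook one-pass variance formula loses all precision when a tiny variance rides on a huge mean.

      import numpy as np

      # Stress case: tiny variance riding on a huge mean (true variance 0.025).
      x = 1e8 + np.array([0.1, 0.2, 0.3, 0.4, 0.5])
      n = x.size

      # Textbook one-pass ("calculator") formula: catastrophic cancellation.
      one_pass = (np.dot(x, x) - x.sum() ** 2 / n) / (n - 1)

      # Numerically stable two-pass formula.
      two_pass = np.sum((x - x.mean()) ** 2) / (n - 1)

      print(f"one-pass : {one_pass}")    # wrong, possibly even negative
      print(f"two-pass : {two_pass}")    # ~0.025, the correct value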

  20. Evaluation of General Classes of Reliability Estimators Often Used in Statistical Analyses of Quasi-Experimental Designs

    NASA Astrophysics Data System (ADS)

    Saini, K. K.; Sehgal, R. K.; Sethi, B. L.

    2008-10-01

In this paper, major reliability estimators are analyzed and their comparative results are discussed. Their strengths and weaknesses are evaluated in this case study. Each of the reliability estimators has certain advantages and disadvantages. Inter-rater reliability is one of the best ways to estimate reliability when your measure is an observation. However, it requires multiple raters or observers. As an alternative, you could look at the correlation of ratings of the same single observer repeated on two different occasions. Each of the reliability estimators will give a different value for reliability. In general, the test-retest and inter-rater reliability estimates will be lower in value than the parallel-forms and internal-consistency ones because they involve measuring at different times or with different raters. This matters because reliability estimates are often used in statistical analyses of quasi-experimental designs.

  21. Reliability and statistical power analysis of cortical and subcortical FreeSurfer metrics in a large sample of healthy elderly.

    PubMed

    Liem, Franziskus; Mérillat, Susan; Bezzola, Ladina; Hirsiger, Sarah; Philipp, Michel; Madhyastha, Tara; Jäncke, Lutz

    2015-03-01

    FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or only selected one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface area and volume. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.
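
    A sketch of the kind of a priori power analysis the authors describe, using statsmodels (all numbers hypothetical, not the paper's); reliability also enters, since measurement error attenuates the detectable effect by roughly sqrt(ICC).

      from statsmodels.stats.power import TTestIndPower

      # Hypothetical target: a 10% group difference in a measure with
      # mean 2.5 mm and between-subject SD 0.35 mm.
      effect_size = (0.10 * 2.5) / 0.35              # Cohen's d ~ 0.71

      solver = TTestIndPower()
      n = solver.solve_power(effect_size=effect_size, alpha=0.05, power=0.8)
      print(f"perfect measurement: N per group ~ {n:.0f}")

      # Imperfect reliability attenuates the observable effect: d_obs = d*sqrt(ICC).
      icc = 0.90
      n_att = solver.solve_power(effect_size=effect_size * icc**0.5,
                                 alpha=0.05, power=0.8)
      print(f"ICC = {icc}: N per group ~ {n_att:.0f}")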

  22. Interrater reliability: the kappa statistic.

    PubMed

    McHugh, Mary L

    2012-01-01

The kappa statistic is frequently used to test interrater reliability. The importance of rater reliability lies in the fact that it represents the extent to which the data collected in the study are correct representations of the variables measured. Measurement of the extent to which data collectors (raters) assign the same score to the same variable is called interrater reliability. While there have been a variety of methods to measure interrater reliability, traditionally it was measured as percent agreement, calculated as the number of agreement scores divided by the total number of scores. In 1960, Jacob Cohen critiqued the use of percent agreement due to its inability to account for chance agreement. He introduced Cohen's kappa, developed to account for the possibility that raters actually guess on at least some variables due to uncertainty. Like most correlation statistics, the kappa can range from -1 to +1. While the kappa is one of the most commonly used statistics to test interrater reliability, it has limitations. Judgments about what level of kappa should be acceptable for health research are questioned. Cohen's suggested interpretation may be too lenient for health-related studies because it implies that a score as low as 0.41 might be acceptable. Kappa and percent agreement are compared, and levels for both kappa and percent agreement that should be demanded in healthcare studies are suggested.
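
    The two statistics compared in this record differ only in the chance-correction term; a from-scratch sketch (scores hypothetical):

      import numpy as np

      def cohens_kappa(r1, r2):
          """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
          r1, r2 = np.asarray(r1), np.asarray(r2)
          cats = np.union1d(r1, r2)
          p_o = (r1 == r2).mean()                                       # observed
          p_e = sum((r1 == c).mean() * (r2 == c).mean() for c in cats)  # chance
          return (p_o - p_e) / (1 - p_e)

      # Hypothetical data-collector scores on 10 records.
      a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
      b = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]
      print(f"percent agreement = {np.mean(np.array(a) == np.array(b)):.2f}")
      print(f"kappa = {cohens_kappa(a, b):.2f}")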

  23. Bayesian methods in reliability

    NASA Astrophysics Data System (ADS)

    Sander, P.; Badoux, R.

    1991-11-01

The present proceedings from a course on Bayesian methods in reliability encompass Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
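
    One example from this family of methods, sketched with hypothetical numbers: a conjugate Gamma-Poisson update for an event rate such as the pipeline leak rate mentioned above.

      from scipy import stats

      # Gamma prior (conjugate to the Poisson): weak expert belief of ~0.5 events/yr.
      a0, b0 = 1.0, 2.0                 # prior shape and rate -> mean a0/b0 = 0.5

      events, years = 3, 10.0           # hypothetical data: 3 events in 10 years
      a1, b1 = a0 + events, b0 + years  # posterior is Gamma(a1, b1)

      posterior = stats.gamma(a1, scale=1.0 / b1)
      print(f"posterior mean rate = {posterior.mean():.3f}/yr")
      print(f"95% credible interval = {posterior.interval(0.95)}")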

  24. A Statistical Method for Syntactic Dialectometry

    ERIC Educational Resources Information Center

    Sanders, Nathan C.

    2010-01-01

    This dissertation establishes the utility and reliability of a statistical distance measure for syntactic dialectometry, expanding dialectometry's methods to include syntax as well as phonology and the lexicon. It establishes the measure's reliability by comparing its results to those of dialectology and phonological dialectometry on Swedish…

  25. Statistical Analysis on the Mechanical Properties of Magnesium Alloys

    PubMed Central

    Liu, Ruoyu; Jiang, Xianquan; Zhang, Hongju; Zhang, Dingfei; Wang, Jingfeng; Pan, Fusheng

    2017-01-01

Knowledge of the statistical characteristics of mechanical properties is very important for the practical application of structural materials. Unfortunately, the scatter characteristics of the mechanical performance of magnesium alloys have remained poorly understood until now. In this study, the mechanical reliability of magnesium alloys is systematically estimated using Weibull statistical analysis. Interestingly, the Weibull modulus, m, of strength for magnesium alloys is as high as that for aluminum and steels, confirming the very high reliability of magnesium alloys. The high predictability of the tensile strength of magnesium alloys represents the capability of preventing catastrophic premature failure during service, which is essential for safety and reliability assessment. PMID:29113116
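
    A compact sketch of the Weibull analysis used in such studies (strength values hypothetical): fit the linearised Weibull CDF by least squares; the slope is the Weibull modulus m.

      import numpy as np

      # Hypothetical tensile strengths (MPa) of magnesium-alloy specimens.
      s = np.sort(np.array([231., 238., 242., 247., 251., 255., 258., 263., 268., 275.]))
      n = s.size

      # Median-rank plotting positions and the linearised Weibull CDF:
      # ln(-ln(1 - F)) = m * ln(s) - m * ln(s0)
      f = (np.arange(1, n + 1) - 0.3) / (n + 0.4)       # Benard's approximation
      y = np.log(-np.log(1.0 - f))
      x = np.log(s)

      m, c = np.polyfit(x, y, 1)                        # slope = Weibull modulus
      s0 = np.exp(-c / m)                               # characteristic strength
      print(f"Weibull modulus m ~ {m:.1f}, scale s0 ~ {s0:.0f} MPa")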

  26. Confirmatory Factor Analysis of the Malay Version of the Confusion, Hubbub and Order Scale (CHAOS-6) among Myocardial Infarction Survivors in a Malaysian Cardiac Healthcare Facility.

    PubMed

    Ganasegeran, Kurubaran; Selvaraj, Kamaraj; Rashid, Abdul

    2017-08-01

    The six item Confusion, Hubbub and Order Scale (CHAOS-6) has been validated as a reliable tool to measure levels of household disorder. We aimed to investigate the goodness of fit and reliability of a new Malay version of the CHAOS-6. The original English version of the CHAOS-6 underwent forward-backward translation into the Malay language. The finalised Malay version was administered to 105 myocardial infarction survivors in a Malaysian cardiac health facility. We performed confirmatory factor analyses (CFAs) using structural equation modelling. A path diagram and fit statistics were yielded to determine the Malay version's validity. Composite reliability was tested to determine the scale's reliability. All 105 myocardial infarction survivors participated in the study. The CFA yielded a six-item, one-factor model with excellent fit statistics. Composite reliability for the single factor CHAOS-6 was 0.65, confirming that the scale is reliable for Malay speakers. The Malay version of the CHAOS-6 was reliable and showed the best fit statistics for our study sample. We thus offer a simple, brief, validated, reliable and novel instrument to measure chaos, the Skala Kecelaruan, Keriuhan & Tertib Terubahsuai (CHAOS-6) , for the Malaysian population.

  27. Confirmatory Factor Analysis of the Malay Version of the Confusion, Hubbub and Order Scale (CHAOS-6) among Myocardial Infarction Survivors in a Malaysian Cardiac Healthcare Facility

    PubMed Central

    Ganasegeran, Kurubaran; Selvaraj, Kamaraj; Rashid, Abdul

    2017-01-01

    Background The six item Confusion, Hubbub and Order Scale (CHAOS-6) has been validated as a reliable tool to measure levels of household disorder. We aimed to investigate the goodness of fit and reliability of a new Malay version of the CHAOS-6. Methods The original English version of the CHAOS-6 underwent forward-backward translation into the Malay language. The finalised Malay version was administered to 105 myocardial infarction survivors in a Malaysian cardiac health facility. We performed confirmatory factor analyses (CFAs) using structural equation modelling. A path diagram and fit statistics were yielded to determine the Malay version’s validity. Composite reliability was tested to determine the scale’s reliability. Results All 105 myocardial infarction survivors participated in the study. The CFA yielded a six-item, one-factor model with excellent fit statistics. Composite reliability for the single factor CHAOS-6 was 0.65, confirming that the scale is reliable for Malay speakers. Conclusion The Malay version of the CHAOS-6 was reliable and showed the best fit statistics for our study sample. We thus offer a simple, brief, validated, reliable and novel instrument to measure chaos, the Skala Kecelaruan, Keriuhan & Tertib Terubahsuai (CHAOS-6), for the Malaysian population. PMID:28951688

  28. Reliability of a Measure of Institutional Discrimination against Minorities

    DTIC Science & Technology

    1979-12-01

Approaches to a statistical measure of the degree of institutional discrimination are discussed. Two methods of dealing with the problem of reliability of the measure in small samples are presented. The first is based upon classical statistical theory, and the second derives from a series of computer-generated Monte Carlo trials.

  29. fMRI reliability: influences of task and experimental design.

    PubMed

    Bennett, Craig M; Miller, Michael B

    2013-12-01

    As scientists, it is imperative that we understand not only the power of our research tools to yield results, but also their ability to obtain similar results over time. This study is an investigation into how common decisions made during the design and analysis of a functional magnetic resonance imaging (fMRI) study can influence the reliability of the statistical results. To that end, we gathered back-to-back test-retest fMRI data during an experiment involving multiple cognitive tasks (episodic recognition and two-back working memory) and multiple fMRI experimental designs (block, event-related genetic sequence, and event-related m-sequence). Using these data, we were able to investigate the relative influences of task, design, statistical contrast (task vs. rest, target vs. nontarget), and statistical thresholding (unthresholded, thresholded) on fMRI reliability, as measured by the intraclass correlation (ICC) coefficient. We also utilized data from a second study to investigate test-retest reliability after an extended, six-month interval. We found that all of the factors above were statistically significant, but that they had varying levels of influence on the observed ICC values. We also found that these factors could interact, increasing or decreasing the relative reliability of certain Task × Design combinations. The results suggest that fMRI reliability is a complex construct whose value may be increased or decreased by specific combinations of factors.

  30. Statistical significance test for transition matrices of atmospheric Markov chains

    NASA Technical Reports Server (NTRS)

    Vautard, Robert; Mo, Kingtse C.; Ghil, Michael

    1990-01-01

    Low-frequency variability of large-scale atmospheric dynamics can be represented schematically by a Markov chain of multiple flow regimes. This Markov chain contains useful information for the long-range forecaster, provided that the statistical significance of the associated transition matrix can be reliably tested. Monte Carlo simulation yields a very reliable significance test for the elements of this matrix. The results of this test agree with previously used empirical formulae when each cluster of maps identified as a distinct flow regime is sufficiently large and when they all contain a comparable number of maps. Monte Carlo simulation provides a more reliable way to test the statistical significance of transitions to and from small clusters. It can determine the most likely transitions, as well as the most unlikely ones, with a prescribed level of statistical significance.
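
    A minimal version of the Monte Carlo test described above (regime sequence hypothetical): shuffling the sequence preserves each regime's frequency but destroys temporal structure, giving a null distribution for every transition count.

      import numpy as np

      rng = np.random.default_rng(2)

      def transition_counts(seq, k):
          c = np.zeros((k, k), dtype=int)
          for a, b in zip(seq[:-1], seq[1:]):
              c[a, b] += 1
          return c

      # Hypothetical daily regime labels (3 flow regimes, 500 days).
      k = 3
      seq = rng.choice(k, size=500, p=[0.5, 0.3, 0.2])
      observed = transition_counts(seq, k)

      # Null distribution: permute the sequence, recount transitions.
      n_mc = 2000
      null = np.stack([transition_counts(rng.permutation(seq), k)
                       for _ in range(n_mc)])

      # One-sided Monte Carlo p-value for each transition being unusually frequent.
      p_high = (null >= observed).mean(axis=0)
      print(np.round(p_high, 3))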

  31. Space station software reliability analysis based on failures observed during testing at the multisystem integration facility

    NASA Technical Reports Server (NTRS)

    Tamayo, Tak Chai

    1987-01-01

The quality of software is not only vital to the successful operation of the space station; it is also an important factor in establishing testing requirements, the time needed for software verification and integration, and launch schedules for the space station. Defense of management decisions can be greatly strengthened by combining engineering judgments with statistical analysis. Unlike hardware, software is characterized by no wearout and costly redundancies, making traditional statistical analysis unsuitable for evaluating the reliability of software. A statistical model was developed to provide a representation of the number as well as the types of failures that occur during software testing and verification. From this model, quantitative measures of software reliability based on failure history during testing are derived. Criteria to terminate testing based on reliability objectives and methods to estimate the expected number of fixes required are also presented.

  32. Statistical Tests of Reliability of NDE

    NASA Technical Reports Server (NTRS)

    Baaklini, George Y.; Klima, Stanley J.; Roth, Don J.; Kiser, James D.

    1987-01-01

    Capabilities of advanced material-testing techniques analyzed. Collection of four reports illustrates statistical method for characterizing flaw-detecting capabilities of sophisticated nondestructive evaluation (NDE). Method used to determine reliability of several state-of-the-art NDE techniques for detecting failure-causing flaws in advanced ceramic materials considered for use in automobiles, airplanes, and space vehicles.

  33. Evaluating Random Error in Clinician-Administered Surveys: Theoretical Considerations and Clinical Applications of Interobserver Reliability and Agreement.

    PubMed

    Bennett, Rebecca J; Taljaard, Dunay S; Olaithe, Michelle; Brennan-Jones, Chris; Eikelboom, Robert H

    2017-09-18

The purpose of this study is to raise awareness of interobserver concordance and the differences between interobserver reliability and agreement when evaluating the responsiveness of a clinician-administered survey and, specifically, to demonstrate the clinical implications of data types (nominal/categorical, ordinal, interval, or ratio) and statistical index selection (for example, Cohen's kappa, Krippendorff's alpha, or intraclass correlation). In this prospective cohort study, 3 clinical audiologists, who were masked to each other's scores, administered the Practical Hearing Aid Skills Test-Revised to 18 adult owners of hearing aids. Interobserver concordance was examined using a range of reliability and agreement statistical indices. The importance of selecting statistical measures of concordance was demonstrated with a worked example, wherein the level of interobserver concordance achieved varied from "no agreement" to "almost perfect agreement" depending on the data type and statistical index selected. This study demonstrates that the methodology used to evaluate survey score concordance can influence the statistical results obtained and thus affect clinical interpretations.

  34. Statistical methods and neural network approaches for classification of data from multiple sources

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon Atli; Swain, Philip H.

    1990-01-01

    Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A problem with using conventional multivariate statistical approaches for classification of data of multiple types is in general that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution free since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem involving how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.
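
    One simple consensus-theoretic combination rule of the kind investigated here, sketched with hypothetical numbers: a logarithmic opinion pool that raises each source's posterior to a reliability weight.

      import numpy as np

      # Hypothetical class posteriors from three data sources for one pixel
      # (rows: sources, cols: classes) and a reliability weight per source.
      posteriors = np.array([[0.70, 0.20, 0.10],    # e.g., multispectral
                             [0.40, 0.40, 0.20],    # e.g., radar
                             [0.30, 0.50, 0.20]])   # e.g., elevation data
      weights = np.array([1.0, 0.6, 0.3])           # higher = more reliable source

      # Logarithmic opinion pool: weighted geometric mean of source posteriors.
      combined = np.prod(posteriors ** weights[:, None], axis=0)
      combined /= combined.sum()
      print(f"consensus posterior = {combined.round(3)}, class = {combined.argmax()}")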

  35. Identification of reliable gridded reference data for statistical downscaling methods in Alberta

    NASA Astrophysics Data System (ADS)

    Eum, H. I.; Gupta, A.

    2017-12-01

Climate models provide essential information for assessing the impacts of climate change at regional and global scales. Statistical downscaling methods are applied to prepare climate model data for applications such as hydrologic and ecologic modelling at a watershed scale. As the reliability and (spatial and temporal) resolution of statistically downscaled climate data depend mainly on the reference data, identifying the most reliable reference data is crucial for statistical downscaling. A growing number of gridded climate products are available for key climate variables that are the main input data to regional modelling systems. However, inconsistencies in these climate products, for example, different combinations of climate variables, varying data domains and data lengths, and data accuracy varying with the physiographic characteristics of the landscape, have caused significant challenges in selecting the most suitable reference climate data for various environmental studies and modelling. Employing various observation-based daily gridded climate products available in the public domain, i.e. thin plate spline regression products (ANUSPLIN and TPS), an inverse distance method (Alberta Townships), a numerical climate model (North American Regional Reanalysis), and an optimum interpolation technique (Canadian Precipitation Analysis), this study evaluates the accuracy of the climate products at each grid point by comparing them with the Adjusted and Homogenized Canadian Climate Data (AHCCD) observations for precipitation and minimum and maximum temperature over the province of Alberta. Based on the performance of the climate products at AHCCD stations, we ranked the reliability of these publicly available climate products corresponding to the elevations of the stations, discretized into several classes. According to the rank of the climate products for each elevation class, we identified the most reliable climate products based on the elevation of target points. A web-based system was developed to allow users to easily select the most reliable reference climate data at each target point based on the elevation of the grid cell. By constructing the best combination of reference data for the study domain, statistically downscaled climate projections could be made significantly more accurate and reliable.

  36. Understanding a Widely Misunderstood Statistic: Cronbach's "Alpha"

    ERIC Educational Resources Information Center

    Ritter, Nicola L.

    2010-01-01

    It is important to explore score reliability in virtually all studies, because tests are not reliable. The present paper explains the most frequently used reliability estimate, coefficient alpha, so that the coefficient's conceptual underpinnings will be understood. Researchers need to understand score reliability because of the possible impact…
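
    For readers of this record, the coefficient itself is short to compute; a from-scratch sketch with hypothetical scores:

      import numpy as np

      def cronbach_alpha(items):
          """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
          items = np.asarray(items, dtype=float)   # respondents x items
          k = items.shape[1]
          item_vars = items.var(axis=0, ddof=1)
          total_var = items.sum(axis=1).var(ddof=1)
          return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

      # Hypothetical 5-item scale scored by 6 respondents.
      scores = np.array([[4, 5, 4, 4, 5],
                         [2, 3, 2, 3, 2],
                         [3, 3, 4, 3, 3],
                         [5, 5, 5, 4, 5],
                         [1, 2, 1, 2, 1],
                         [3, 4, 3, 3, 4]])
      print(f"alpha = {cronbach_alpha(scores):.2f}")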

  37. Acute Respiratory Distress Syndrome Measurement Error. Potential Effect on Clinical Study Results

    PubMed Central

    Cooke, Colin R.; Iwashyna, Theodore J.; Hofer, Timothy P.

    2016-01-01

    Rationale: Identifying patients with acute respiratory distress syndrome (ARDS) is a recognized challenge. Experts often have only moderate agreement when applying the clinical definition of ARDS to patients. However, no study has fully examined the implications of low reliability measurement of ARDS on clinical studies. Objectives: To investigate how the degree of variability in ARDS measurement commonly reported in clinical studies affects study power, the accuracy of treatment effect estimates, and the measured strength of risk factor associations. Methods: We examined the effect of ARDS measurement error in randomized clinical trials (RCTs) of ARDS-specific treatments and cohort studies using simulations. We varied the reliability of ARDS diagnosis, quantified as the interobserver reliability (κ-statistic) between two reviewers. In RCT simulations, patients identified as having ARDS were enrolled, and when measurement error was present, patients without ARDS could be enrolled. In cohort studies, risk factors as potential predictors were analyzed using reviewer-identified ARDS as the outcome variable. Measurements and Main Results: Lower reliability measurement of ARDS during patient enrollment in RCTs seriously degraded study power. Holding effect size constant, the sample size necessary to attain adequate statistical power increased by more than 50% as reliability declined, although the result was sensitive to ARDS prevalence. In a 1,400-patient clinical trial, the sample size necessary to maintain similar statistical power increased to over 1,900 when reliability declined from perfect to substantial (κ = 0.72). Lower reliability measurement diminished the apparent effectiveness of an ARDS-specific treatment from a 15.2% (95% confidence interval, 9.4–20.9%) absolute risk reduction in mortality to 10.9% (95% confidence interval, 4.7–16.2%) when reliability declined to moderate (κ = 0.51). In cohort studies, the effect on risk factor associations was similar. Conclusions: ARDS measurement error can seriously degrade statistical power and effect size estimates of clinical studies. The reliability of ARDS measurement warrants careful attention in future ARDS clinical studies. PMID:27159648

  38. Comment on Hall et al. (2017), "How to Choose Between Measures of Tinnitus Loudness for Clinical Research? A Report on the Reliability and Validity of an Investigator-Administered Test and a Patient-Reported Measure Using Baseline Data Collected in a Phase IIa Drug Trial".

    PubMed

    Sabour, Siamak

    2018-03-08

The purpose of this letter, in response to Hall, Mehta, and Fackrell (2017), is to provide important knowledge about methodological and statistical issues in assessing the reliability and validity of an audiologist-administered tinnitus loudness matching test and a patient-reported tinnitus loudness rating. The author uses reference textbooks and published articles regarding the scientific assessment of the validity and reliability of a clinical test to discuss the statistical tests and the methodological approach used in assessing validity and reliability in clinical research. Depending on the type of variable (qualitative or quantitative), well-known statistical tests can be applied to assess reliability and validity. For qualitative variables, sensitivity, specificity, positive predictive value, negative predictive value, false positive and false negative rates, likelihood ratio positive and likelihood ratio negative, as well as the odds ratio (i.e., the ratio of true to false results), are the most appropriate estimates for evaluating the validity of a test against a gold standard. For quantitative variables, depending on the distribution of the variable, the Pearson r or Spearman rho can be applied. Diagnostic accuracy (validity) and diagnostic precision (reliability or agreement) are two completely different methodological issues.
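
    The validity estimates listed in this record all fall out of one 2x2 table; a sketch with hypothetical counts:

      # Hypothetical 2x2 table: test result vs gold standard.
      tp, fp, fn, tn = 45, 5, 10, 140

      sensitivity = tp / (tp + fn)                  # true-positive rate
      specificity = tn / (tn + fp)                  # true-negative rate
      ppv = tp / (tp + fp)                          # positive predictive value
      npv = tn / (tn + fn)                          # negative predictive value
      lr_pos = sensitivity / (1 - specificity)      # likelihood ratio positive
      lr_neg = (1 - sensitivity) / specificity      # likelihood ratio negative
      dor = (tp * tn) / (fp * fn)                   # diagnostic odds ratio

      print(f"Se={sensitivity:.2f} Sp={specificity:.2f} PPV={ppv:.2f} "
            f"NPV={npv:.2f} LR+={lr_pos:.1f} LR-={lr_neg:.2f} DOR={dor:.0f}")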

  39. Reliability Considerations for the Operation of Large Accelerator User Facilities

    DOE PAGES

    Willeke, F. J.

    2016-01-29

The lecture provides an overview of considerations relevant to achieving highly reliable operation of accelerator-based user facilities. The article starts with an overview of the statistical reliability formalism, which is followed by high-reliability design considerations with examples. The article closes with operational aspects of high reliability, such as preventive maintenance and spares inventory.

  40. Determining Functional Reliability of Pyrotechnic Mechanical Devices

    NASA Technical Reports Server (NTRS)

    Bement, Laurence J.; Multhaup, Herbert A.

    1997-01-01

This paper describes a new approach for evaluating mechanical performance and predicting the mechanical functional reliability of pyrotechnic devices. Other possible failure modes, such as failure to initiate the pyrotechnic energy source, are not included. The generally accepted go/no-go statistical approach, which requires hundreds or thousands of consecutive successful tests on identical components for reliability predictions, routinely ignores the physics of failure. The approach described in this paper begins with measuring, understanding, and controlling mechanical performance variables. Then, the energy required to accomplish the function is compared to that delivered by the pyrotechnic energy source to determine the mechanical functional margin. Finally, the data collected in establishing functional margin are analyzed to predict mechanical functional reliability, using small-sample statistics. A careful application of this approach can provide considerable cost improvements and understanding over go/no-go statistics. Performance and the effects of variables can be defined, and reliability predictions can be made by evaluating 20 or fewer units. The application of this approach to a pin puller used on a successful NASA mission is provided as an example.

  41. An Innovative Excel Application to Improve Exam Reliability in Marketing Courses

    ERIC Educational Resources Information Center

    Keller, Christopher M.; Kros, John F.

    2011-01-01

    Measures of survey reliability are commonly addressed in marketing courses. One statistic of reliability is "Cronbach's alpha." This paper presents an application of survey reliability as a reflexive application of multiple-choice exam validation. The application provides an interactive decision support system that incorporates survey item…

  42. A Monte Carlo Simulation Study of the Reliability of Intraindividual Variability

    PubMed Central

    Estabrook, Ryne; Grimm, Kevin J.; Bowles, Ryan P.

    2012-01-01

Recent research has seen intraindividual variability (IIV) become a useful technique to incorporate trial-to-trial variability into many types of psychological studies. IIV as measured by individual standard deviations (ISDs) has shown unique prediction of several types of positive and negative outcomes (Ram, Rabbitt, Stollery, & Nesselroade, 2005). One unanswered question regarding measuring intraindividual variability is its reliability and the conditions under which optimal reliability is achieved. Monte Carlo simulation studies were conducted to determine the reliability of the ISD compared to the intraindividual mean. The results indicate that ISDs generally have poor reliability and are sensitive to insufficient measurement occasions, poor test reliability, and unfavorable amounts and distributions of variability in the population. Secondary analysis of psychological data shows that use of individual standard deviations in unfavorable conditions leads to a marked reduction in statistical power, although careful adherence to underlying statistical assumptions allows their use as a basic research tool. PMID:22268793

  43. Reliability of reference distances used in photogrammetry.

    PubMed

    Aksu, Muge; Kaya, Demet; Kocadereli, Ilken

    2010-07-01

To determine the reliability of the reference distances used for photogrammetric assessment. The sample consisted of 100 subjects with a mean age of 22.97 +/- 2.98 years. Five lateral and four frontal parameters were measured directly on the subjects' faces. For photogrammetric assessment, two reference distances for the profile view and three reference distances for the frontal view were established. Standardized photographs were taken, and all the parameters that had been measured directly on the face were measured on the photographs. The reliability of the reference distances was checked by comparing the direct and indirect values of the parameters obtained from the subjects' faces and photographs. Repeated-measures analysis of variance (ANOVA) and Bland-Altman analyses were used for statistical assessment. For profile measurements, the indirect values measured were statistically different from the direct values, except for Sn-Sto in male subjects and Prn-Sn and Sn-Sto in female subjects. The indirect values of Prn-Sn and Sn-Sto were reliable in both sexes. The poorest results were obtained for the indirect values of the N-Sn parameter in female subjects and the Sn-Me parameter in male subjects according to the Sa-Sba reference distance. For frontal measurements, the indirect values were statistically different from the direct values in both sexes, except for one in male subjects. The indirect values measured were not statistically different from the direct values for Go-Go. The indirect values of Ch-Ch were reliable in male subjects. The poorest results were obtained according to the P-P reference distance. For profile assessment, the T-Ex reference distance was reliable for Prn-Sn and Sn-Sto in both sexes. For frontal assessment, the Ex-Ex and En-En reference distances were reliable for Ch-Ch in male subjects.

  44. A criterion for establishing life limits. [for Space Shuttle Main Engine service

    NASA Technical Reports Server (NTRS)

    Skopp, G. H.; Porter, A. A.

    1990-01-01

The development of a rigorous statistical method that would utilize hardware-demonstrated reliability to evaluate hardware capability and provide ground rules for safe flight margin is discussed. A statistically based method using the Weibull/Weibayes cumulative distribution function is described, and its advantages and inadequacies are pointed out. Another, more advanced procedure, Single Flight Reliability (SFR), determines a life limit which ensures that the reliability of any single flight is never less than a stipulated value at a stipulated confidence level. Application of the SFR method is illustrated.

  5. Improving statistical inference on pathogen densities estimated by quantitative molecular methods: malaria gametocytaemia as a case study.

    PubMed

    Walker, Martin; Basáñez, María-Gloria; Ouédraogo, André Lin; Hermsen, Cornelus; Bousema, Teun; Churcher, Thomas S

    2015-01-16

    Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques are compared: a traditional method and a Bayesian mixed-model approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed-model approach assimilates information from replicate QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. Bayesian mixed-model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens.

  6. Missile Systems Maintenance, AFSC 411XOB/C.

    DTIC Science & Technology

    1988-04-01

    Senior technicians rated the difficulty of maintenance tasks; a statistical measurement of their agreement, known as the interrater reliability, was assessed through components of variance of the senior technicians' ratings.

  7. Impact of Rating Scale Categories on Reliability and Fit Statistics of the Malay Spiritual Well-Being Scale using Rasch Analysis.

    PubMed

    Daher, Aqil Mohammad; Ahmad, Syed Hassan; Winn, Than; Selamat, Mohd Ikhsan

    2015-01-01

    Few studies have employed item response theory in examining reliability. We conducted this study to examine the effect of Rating Scale Categories (RSCs) on the reliability and fit statistics of the Malay Spiritual Well-Being Scale, employing the Rasch model. The Malay Spiritual Well-Being Scale (SWBS), with the original six RSCs and with newly structured three- and four-category versions, was distributed randomly among three different samples of 50 participants each. The mean age of respondents in the three samples ranged between 36 and 39 years. The majority were female in all samples, and Islam was the most prevalent religion among the respondents. The predominant ethnic group was Malay, followed by Chinese and Indian. The original six RSCs indicated the best targeting (0.99) and the smallest model error (0.24). The infit mean square (MnSq) and standardized fit statistic (ZStd) of the six RSCs were 1.1 and -0.1, respectively. The six RSCs achieved the highest person and item reliabilities of 0.86 and 0.85, respectively. These reliabilities yielded the highest person (2.46) and item (2.38) separation indices compared with the other RSCs. The person and item reliability and, to a lesser extent, the fit statistics, were better with the six RSCs than with the four and three RSCs.

  8. Assessment of NDE reliability data

    NASA Technical Reports Server (NTRS)

    Yee, B. G. W.; Couchman, J. C.; Chang, F. H.; Packman, D. F.

    1975-01-01

    Twenty sets of relevant nondestructive test (NDT) reliability data were identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations was formulated, and a model to grade the quality and validity of the data sets was developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, were formulated for each NDE method. A comprehensive computer program was written and debugged to calculate the probability of flaw detection at several confidence limits by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. An example of the calculated reliability of crack detection in bolt holes by an automatic eddy current method is presented.
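
    The binomial POD-at-confidence computation is compact enough to sketch. The lower bound below is the standard Clopper-Pearson one-sided limit; whether it matches the exact procedure in the original program is an assumption.

```python
from scipy import stats

def pod_lower_bound(hits, trials, confidence=0.95):
    """One-sided lower confidence bound on probability of detection,
    from the binomial distribution (Clopper-Pearson)."""
    if hits == 0:
        return 0.0
    return stats.beta.ppf(1.0 - confidence, hits, trials - hits + 1)

# Classic NDE result: 29 detections in 29 trials -> ~90% POD at 95% confidence.
print(round(pod_lower_bound(29, 29), 3))   # 0.902
```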

  9. Comparing the Fit of Item Response Theory and Factor Analysis Models

    ERIC Educational Resources Information Center

    Maydeu-Olivares, Alberto; Cai, Li; Hernandez, Adolfo

    2011-01-01

    Linear factor analysis (FA) models can be reliably tested using test statistics based on residual covariances. We show that the same statistics can be used to reliably test the fit of item response theory (IRT) models for ordinal data (under some conditions). Hence, the fit of an FA model and of an IRT model to the same data set can now be…

  10. Assessing the Kansas water-level monitoring program: An example of the application of classical statistics to a geological problem

    USGS Publications Warehouse

    Davis, J.C.

    2000-01-01

    Geologists may feel that geological data are not amenable to statistical analysis, or at best require specialized approaches such as nonparametric statistics and geostatistics. However, there are many circumstances, particularly in systematic studies conducted for environmental or regulatory purposes, where traditional parametric statistical procedures can be beneficial. An example is the application of analysis of variance to data collected in an annual program of measuring groundwater levels in Kansas. Influences such as well conditions, operator effects, and use of the water can be assessed and wells that yield less reliable measurements can be identified. Such statistical studies have resulted in yearly improvements in the quality and reliability of the collected hydrologic data. Similar benefits may be achieved in other geological studies by the appropriate use of classical statistical tools.
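
    A minimal sketch of the kind of analysis described, with fabricated numbers: a one-way ANOVA testing whether repeat water-level measurement errors differ by operator.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical repeat measurement errors (ft) for three operators;
# operator B carries a bias, so the F test should flag an operator effect.
operator_a = rng.normal(0.00, 0.05, 20)
operator_b = rng.normal(0.08, 0.05, 20)
operator_c = rng.normal(0.01, 0.05, 20)

f_stat, p_value = stats.f_oneway(operator_a, operator_b, operator_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```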

  11. Statistical equivalence and test-retest reliability of delay and probability discounting using real and hypothetical rewards.

    PubMed

    Matusiewicz, Alexis K; Carter, Anne E; Landes, Reid D; Yi, Richard

    2013-11-01

    Delay discounting (DD) and probability discounting (PD) refer to the reduction in the subjective value of outcomes as a function of delay and uncertainty, respectively. Elevated measures of discounting are associated with a variety of maladaptive behaviors, and confidence in the validity of these measures is imperative. The present research examined (1) the statistical equivalence of discounting measures when rewards were hypothetical or real, and (2) their 1-week reliability. While previous research has partially explored these issues using the low threshold of nonsignificant difference, the present study fully addressed them using the more compelling threshold of statistical equivalence. DD and PD measures were collected from 28 healthy adults using real and hypothetical $50 rewards during each of two experimental sessions, one week apart. Analyses using area-under-the-curve measures revealed a general pattern of statistical equivalence, indicating equivalence of real/hypothetical conditions as well as 1-week reliability. Exceptions are identified and discussed. Copyright © 2013 Elsevier B.V. All rights reserved.
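
    The area-under-the-curve (AUC) measure used here is straightforward to compute: normalize both axes to [0, 1] and apply the trapezoid rule. The normalization scheme below follows the common Myerson-style approach, and the data are fabricated.

```python
import numpy as np

def discounting_auc(delays, indifference_points, amount):
    """Area under the discounting curve: delays and indifference
    values normalized to [0, 1], area by the trapezoid rule."""
    x = np.asarray(delays, dtype=float) / max(delays)
    y = np.asarray(indifference_points, dtype=float) / amount
    return np.trapz(y, x)

# Hypothetical indifference points for a $50 reward across delays (days).
delays = [1, 7, 30, 90, 365]
points = [48, 44, 35, 25, 12]
print(round(discounting_auc(delays, points, amount=50.0), 3))
```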

  12. Health measurement using the ICF: Test-retest reliability study of ICF codes and qualifiers in geriatric care

    PubMed Central

    Okochi, Jiro; Utsunomiya, Sakiko; Takahashi, Tai

    2005-01-01

    Background The International Classification of Functioning, Disability and Health (ICF) was published by the World Health Organization (WHO) to standardize descriptions of health and disability. Little is known about the reliability and clinical relevance of measurements using the ICF and its qualifiers. This study examines the test-retest reliability of ICF codes and the rate of immeasurability in long-term elderly-care settings, to evaluate the clinical applicability of the ICF and its qualifiers and of the ICF checklist. Methods Reliability of 85 body function (BF) items and 152 activity and participation (AP) items of the ICF was studied using a test-retest procedure with a sample of 742 elderly persons from 59 institutional and at-home care service centers. Test-retest reliability was estimated using the weighted kappa statistic. The clinical relevance of the ICF was estimated by calculating the immeasurability rate. The effects of measurement setting and evaluators' experience were analyzed by stratifying these variables. The properties of each item were evaluated using both the kappa statistic and the immeasurability rate to assess the clinical applicability of WHO's ICF checklist in the elderly-care setting. Results The median weighted kappa statistics of the 85 BF and 152 AP items were 0.46 and 0.55, respectively. The reproducibility statistics improved when the measurements were performed by experienced evaluators. Some chapters, such as genitourinary and reproductive functions in the BF domain and major life areas in the AP domain, contained more items with low test-retest reliability or rated as immeasurable than the other chapters. Some items in the ICF checklist were rated as unreliable and immeasurable. Conclusion The reliability of the ICF codes when measured with the current ICF qualifiers is relatively low. The increase in reliability with evaluators' experience suggests that proper training will raise reliability. The ICF checklist contains some items that are difficult to apply in geriatric care settings. Improvements should be achieved by selecting the most relevant items for each measurement and by developing appropriate qualifiers for each code according to the interests of the users. PMID:16050960
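
    Weighted kappa for ordinal qualifiers is easy to reproduce. A minimal sketch with fabricated test/retest qualifier ratings, using scikit-learn's implementation; the original study's exact weighting scheme is an assumption.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical test and retest ICF qualifier ratings
# (0 = no problem ... 4 = complete problem).
test   = [0, 1, 2, 2, 3, 4, 1, 0, 2, 3, 4, 1, 2, 0, 3]
retest = [0, 1, 2, 3, 3, 4, 0, 0, 2, 3, 3, 1, 2, 1, 3]

print("unweighted kappa     :", round(cohen_kappa_score(test, retest), 2))
print("linear-weighted kappa:", round(cohen_kappa_score(test, retest, weights="linear"), 2))
```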

  13. Psychometric considerations in the measurement of event-related brain potentials: Guidelines for measurement and reporting.

    PubMed

    Clayson, Peter E; Miller, Gregory A

    2017-01-01

    Failing to consider psychometric issues related to reliability and validity, differential deficits, and statistical power potentially undermines the conclusions of a study. In research using event-related brain potentials (ERPs), numerous contextual factors (population sampled, task, data recording, analysis pipeline, etc.) can impact the reliability of ERP scores. The present review considers the contextual factors that influence ERP score reliability and the downstream effects that reliability has on statistical analyses. Given the context-dependent nature of ERPs, it is recommended that ERP score reliability be formally assessed on a study-by-study basis. Recommended guidelines for ERP studies include 1) reporting the threshold of acceptable reliability and reliability estimates for observed scores, 2) specifying the approach used to estimate reliability, and 3) justifying how trial-count minima were chosen. A reliability threshold for internal consistency of at least 0.70 is recommended, and a threshold of 0.80 is preferred. The review also advocates the use of generalizability theory for estimating score dependability (the generalizability theory analog to reliability) as an improvement on classical test theory reliability estimates, suggesting that the latter is less well suited to ERP research. To facilitate the calculation and reporting of dependability estimates, an open-source Matlab program, the ERP Reliability Analysis Toolbox, is presented. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. Common pitfalls in statistical analysis: Clinical versus statistical significance

    PubMed Central

    Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc

    2015-01-01

    In clinical research, study results, which are statistically significant are often interpreted as being clinically important. While statistical significance indicates the reliability of the study results, clinical significance reflects its impact on clinical practice. The third article in this series exploring pitfalls in statistical analysis clarifies the importance of differentiating between statistical significance and clinical significance. PMID:26229754

  15. An Investigation of the Impact of Guessing on Coefficient α and Reliability

    PubMed Central

    2014-01-01

    Guessing is known to influence the test reliability of multiple-choice tests. Although there are many studies that have examined the impact of guessing, they used rather restrictive assumptions (e.g., parallel test assumptions, homogeneous inter-item correlations, homogeneous item difficulty, and homogeneous guessing levels across items) to evaluate the relation between guessing and test reliability. Based on the item response theory (IRT) framework, this study investigated the extent of the impact of guessing on reliability under more realistic conditions where item difficulty, item discrimination, and guessing levels actually vary across items with three different test lengths (TL). By accommodating multiple item characteristics simultaneously, this study also focused on examining interaction effects between guessing and other variables entered in the simulation to be more realistic. The simulation of the more realistic conditions and calculations of reliability and classical test theory (CTT) item statistics were facilitated by expressing CTT item statistics, coefficient α, and reliability in terms of IRT model parameters. In addition to the general negative impact of guessing on reliability, results showed interaction effects between TL and guessing and between guessing and test difficulty.
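
    The simulation logic is easy to illustrate. The sketch below (arbitrary parameter choices) generates 3PL responses in which difficulty, discrimination, and guessing all vary across items, then computes coefficient alpha from the classical definition; raising the guessing ceiling lowers alpha.

```python
import numpy as np

rng = np.random.default_rng(42)
n_persons, n_items = 2000, 30

theta = rng.normal(0, 1, n_persons)          # abilities
a = rng.lognormal(0, 0.3, n_items)           # discriminations
b = rng.normal(0, 1, n_items)                # difficulties

def coefficient_alpha(x):
    k = x.shape[1]
    return k / (k - 1) * (1 - x.var(axis=0, ddof=1).sum()
                          / x.sum(axis=1).var(ddof=1))

for c_max in (0.0, 0.25):                    # guessing: none vs up to .25
    c = rng.uniform(0, c_max, n_items)
    p = c + (1 - c) / (1 + np.exp(-a * (theta[:, None] - b)))  # 3PL
    responses = (rng.uniform(size=p.shape) < p).astype(int)
    print(f"max guessing {c_max}: alpha = {coefficient_alpha(responses):.3f}")
```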

  16. VHSIC Electronics and the Cost of Air Force Avionics in the 1990s

    DTIC Science & Technology

    1990-11-01

    Acronyms defined in the report include LRM (line-replaceable module), LRU (line-replaceable unit), LSI (large-scale integration), and LSTTL (low-power Schottky transistor-transistor logic). Cost-estimating relationships (CERs) statistically relate avionics costs to equipment characteristics for displays, communications/navigation/identification, electronic combat equipment, dispensers, and computers. Some of the reliability numbers were revised, and the F-15 and F-16 were added to obtain the data sample shown in Table 6, which covers both suite costs and reliability statistics.

  17. Forecasting infectious disease emergence subject to seasonal forcing.

    PubMed

    Miller, Paige B; O'Dea, Eamon B; Rohani, Pejman; Drake, John M

    2017-09-06

    Despite high vaccination coverage, many childhood infections pose a growing threat to human populations. Accurate disease forecasting would be of tremendous value to public health. Forecasting disease emergence using early warning signals (EWS) is possible in non-seasonal models of infectious diseases. Here, we assessed whether EWS also anticipate disease emergence in seasonal models. We simulated the dynamics of an immunizing infectious pathogen approaching the tipping point to disease endemicity. To explore the effect of seasonality on the reliability of early warning statistics, we varied the amplitude of fluctuations around the average transmission. We proposed and analyzed two new early warning signals based on the wavelet spectrum. We measured the reliability of the early warning signals depending on the strength of their trend preceding the tipping point and then calculated the Area Under the Curve (AUC) statistic. Early warning signals were reliable when disease transmission was subject to seasonal forcing. Wavelet-based early warning signals were as reliable as other conventional early warning signals. We found that removing seasonal trends, prior to analysis, did not improve early warning statistics uniformly. Early warning signals anticipate the onset of critical transitions for infectious diseases which are subject to seasonal forcing. Wavelet-based early warning statistics can also be used to forecast infectious disease.
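
    The rolling-window logic behind such early warning statistics can be sketched briefly. This shows only the conventional variance and autocorrelation signals on a fabricated series (the paper's wavelet-based signals are not reproduced), with trend strength summarized by Kendall's tau as in the reliability analysis described.

```python
import numpy as np
from scipy import stats

def rolling_ews(series, window=50):
    """Rolling-window variance and lag-1 autocorrelation,
    two conventional early warning signals."""
    var, ac1 = [], []
    for i in range(len(series) - window):
        w = series[i:i + window]
        var.append(np.var(w))
        ac1.append(np.corrcoef(w[:-1], w[1:])[0, 1])
    return np.array(var), np.array(ac1)

# Toy fluctuation series whose variance grows toward a tipping point.
rng = np.random.default_rng(3)
t = np.arange(500)
series = rng.normal(0, 1 + t / 250, size=500)

var, ac1 = rolling_ews(series)
tau, _ = stats.kendalltau(np.arange(len(var)), var)
print("Kendall tau of the variance trend:", round(tau, 2))
```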

  18. A reliability evaluation methodology for memory chips for space applications when sample size is small

    NASA Technical Reports Server (NTRS)

    Chen, Y.; Nguyen, D.; Guertin, S.; Berstein, J.; White, M.; Menke, R.; Kayali, S.

    2003-01-01

    This paper presents a reliability evaluation methodology to obtain the statistical reliability information of memory chips for space applications when the test sample size needs to be kept small because of the high cost of radiation-hardened memories.

  19. Research Education in Undergraduate Occupational Therapy Programs.

    ERIC Educational Resources Information Center

    Petersen, Paul; And Others

    1992-01-01

    Of 63 undergraduate occupational therapy programs surveyed, the 38 responses revealed some common areas covered: elementary descriptive statistics, validity, reliability, and measurement. Areas underrepresented include statistical analysis with or without computers, research design, and advanced statistics. (SK)

  20. Evaluation of adding item-response theory analysis for evaluation of the European Board of Ophthalmology Diploma examination.

    PubMed

    Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José

    2013-11-01

    To investigate whether the introduction of item-response theory (IRT) analysis, in parallel to the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20 ≥ 0.90) to be kept as the examination format in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has a positive influence on the statistical performance of EBOD as a whole and of its individual test items in particular. Item-response theory analysis demonstrated that item performance parameters should not be evaluated individually but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good, though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011: 0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that overall reliability remains the crucial parameter allowing comparison across examinations. While individual item performance analysis is worthwhile as a secondary analysis, drawing final conclusions from it is more difficult: performance parameters need to be related to one another, as shown by IRT analysis. Therefore, IRT analysis has proved beneficial for the statistical analysis of EBOD. The introduction of negative marking has led to a significant increase in reliability (KR-20 > 0.90), indicating that the current examination format can be kept for future EBOD examinations. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.

  1. The kappa statistic in rehabilitation research: an examination.

    PubMed

    Tooth, Leigh R; Ottenbacher, Kenneth J

    2004-08-01

    The number and sophistication of statistical procedures reported in medical rehabilitation research is increasing. Application of the principles and methods associated with evidence-based practice has contributed to the need for rehabilitation practitioners to understand quantitative methods in published articles. Outcomes measurement and determination of reliability are areas that have experienced rapid change during the past decade. In this study, distinctions between reliability and agreement are examined. Information is presented on analytical approaches for addressing reliability and agreement with the focus on the application of the kappa statistic. The following assumptions are discussed: (1) kappa should be used with data measured on a categorical scale, (2) the patients or objects categorized should be independent, and (3) the observers or raters must make their measurement decisions and judgments independently. Several issues related to using kappa in measurement studies are described, including use of weighted kappa, methods of reporting kappa, the effect of bias and prevalence on kappa, and sample size and power requirements for kappa. The kappa statistic is useful for assessing agreement among raters, and it is being used more frequently in rehabilitation research. Correct interpretation of the kappa statistic depends on meeting the required assumptions and accurate reporting.

  2. How to Characterize the Reliability of Ceramic Capacitors with Base-Metal Electrodes (BMEs)

    NASA Technical Reports Server (NTRS)

    Liu, David (Donhang)

    2015-01-01

    The reliability of an MLCC device is the product of a time-dependent part and a time-independent part: 1) the time-dependent part is a statistical distribution; 2) the time-independent part is the reliability at t = 0, the initial reliability. Initial reliability depends only on how a BME MLCC is designed and processed. Just as the minimum dielectric thickness ensured the long-term reliability of a PME MLCC, the initial reliability likewise ensures the long-term reliability of a BME MLCC. This presentation shows new discoveries regarding commonalities and differences between PME and BME capacitor technologies.

  3. How to Characterize the Reliability of Ceramic Capacitors with Base-Metal Electrodes (BMEs)

    NASA Technical Reports Server (NTRS)

    Liu, Donhang

    2015-01-01

    The reliability of an MLCC device is the product of a time-dependent part and a time-independent part: 1) the time-dependent part is a statistical distribution; 2) the time-independent part is the reliability at t = 0, the initial reliability. Initial reliability depends only on how a BME MLCC is designed and processed. Just as the minimum dielectric thickness ensured the long-term reliability of a PME MLCC, the initial reliability likewise ensures the long-term reliability of a BME MLCC. This presentation shows new discoveries regarding commonalities and differences between PME and BME capacitor technologies.

  4. Constructing Confidence Intervals for Reliability Coefficients Using Central and Noncentral Distributions.

    ERIC Educational Resources Information Center

    Weber, Deborah A.

    Greater understanding and use of confidence intervals is central to changes in statistical practice (G. Cumming and S. Finch, 2001). Reliability coefficients and confidence intervals for reliability coefficients can be computed using a variety of methods. Estimating confidence intervals includes both central and noncentral distribution approaches.…

  5. Videotape Reliability: A Method of Evaluation of a Clinical Performance Examination.

    ERIC Educational Resources Information Center

    And Others; Liu, Philip

    1980-01-01

    A method of statistically analyzing clinical performance examinations for reliability and the application of this method in determining the reliability of two examinations of skill in administering anesthesia are described. Videotaped performances for the Spinal Anesthesia Skill Examination and the Anesthesia Setup and Machine Checkout Examination…

  6. LD-SPatt: large deviations statistics for patterns on Markov chains.

    PubMed

    Nuel, G

    2004-01-01

    Statistics on Markov chains are widely used for the study of patterns in biological sequences. Statistics on these models can be computed through several approaches. Central limit theorem (CLT) approaches producing Gaussian approximations are among the most popular. Unfortunately, in order to find a pattern of interest, these methods have to deal with tail-distribution events where the CLT approximation is especially poor. In this paper, we propose a new approach based on large deviations theory to assess pattern statistics. We first recall theoretical results for empirical mean (level 1) as well as empirical distribution (level 2) large deviations on Markov chains. Then, we present applications of these results, focusing on numerical issues. LD-SPatt is the name of GPL software implementing these algorithms. We compare this approach to several existing ones in terms of complexity and reliability and show that the large deviations are more reliable than the Gaussian approximations in absolute values as well as in terms of ranking, and are at least as reliable as compound Poisson approximations. We then finally discuss some further possible improvements and applications of this new method.

  7. 28 CFR 22.1 - Purpose.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Administration DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.1... information identifiable to a private person obtained in a research or statistical program may only be used... reliability of federally-supported research and statistical findings by minimizing subject concern over...

  8. 28 CFR 22.1 - Purpose.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Administration DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.1... information identifiable to a private person obtained in a research or statistical program may only be used... reliability of federally-supported research and statistical findings by minimizing subject concern over...

  9. Binge Eating Disorder: Reliability and Validity of a New Diagnostic Category.

    ERIC Educational Resources Information Center

    Brody, Michelle L.; And Others

    1994-01-01

    Examined reliability and validity of binge eating disorder (BED), proposed for inclusion in Diagnostic and Statistical Manual of Mental Disorders (DSM), fourth edition. Interrater reliability of BED diagnosis compared favorably with that of most diagnoses in DSM revised third edition. Study comparing obese individuals with and without BED and…

  10. The Information Function for the One-Parameter Logistic Model: Is it Reliability?

    ERIC Educational Resources Information Center

    Doran, Harold C.

    2005-01-01

    The information function is an important statistic in item response theory (IRT) applications. Although the information function is often described as the IRT version of reliability, it differs from the classical notion of reliability from a critical perspective: replication. This article first explores the information function for the…

  11. Assuring reliability program effectiveness.

    NASA Technical Reports Server (NTRS)

    Ball, L. W.

    1973-01-01

    An attempt is made to provide simple identification and description of techniques that have proved to be most useful either in developing a new product or in improving reliability of an established product. The first reliability task is obtaining and organizing parts failure rate data. Other tasks are parts screening, tabulation of general failure rates, preventive maintenance, prediction of new product reliability, and statistical demonstration of achieved reliability. Five principal tasks for improving reliability involve the physics of failure research, derating of internal stresses, control of external stresses, functional redundancy, and failure effects control. A final task is the training and motivation of reliability specialist engineers.

  12. Unreliable Yet Still Replicable: A Comment on LeBel and Paunonen (2011)

    PubMed Central

    De Schryver, Maarten; Hughes, Sean; Rosseel, Yves; De Houwer, Jan

    2016-01-01

    LeBel and Paunonen (2011) highlight that despite their importance and popularity in both theoretical and applied research, many implicit measures continue to be plagued by a persistent and troublesome issue: low reliability. In their paper, they offer a conceptual analysis of the relationship between reliability, power and replicability, and then provide a series of recommendations for researchers interested in using implicit measures in an experimental setting. At the core of their account is the idea that reliability can be equated with statistical power, such that "lower levels of reliability are associated with decreasing probabilities of detecting a statistically significant effect, given one exists in the population" (p. 573). They also take the additional step of equating reliability and replicability. In our commentary, we draw attention to the fact that there is no direct, fixed, or one-to-one relation between reliability and power or replicability. More specifically, we argue that when adopting an experimental (rather than a correlational) approach, researchers strive to minimize inter-individual variation, which has a direct impact on sample-based reliability estimates. We evaluate the strengths and weaknesses of LeBel and Paunonen's recommendations and refine them where appropriate. PMID:26793150
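
    For the correlational case, the classical attenuation relation and its knock-on effect on power can be made numeric. A sketch with fabricated values (note the commentary's point is precisely that this logic does not transfer directly to experimental designs):

```python
import numpy as np
from scipy import stats

def power_corr(r, n, alpha=0.05):
    """Approximate power of a two-sided test of H0: rho = 0 (Fisher z)."""
    z = np.arctanh(r) * np.sqrt(n - 3)
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return stats.norm.sf(z_crit - z) + stats.norm.cdf(-z_crit - z)

rho_true, n, rel_criterion = 0.30, 100, 0.90
for rel_measure in (1.0, 0.8, 0.5):
    # Spearman attenuation: r_obs = r_true * sqrt(rel_x * rel_y)
    rho_obs = rho_true * np.sqrt(rel_measure * rel_criterion)
    print(f"measure reliability {rel_measure}: observed r = {rho_obs:.2f}, "
          f"power = {power_corr(rho_obs, n):.2f}")
```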

  13. Patients and medical statistics. Interest, confidence, and ability.

    PubMed

    Woloshin, Steven; Schwartz, Lisa M; Welch, H Gilbert

    2005-11-01

    People are increasingly presented with medical statistics. There are no existing measures to assess their level of interest or confidence in using medical statistics. To develop 2 new measures, the STAT-interest and STAT-confidence scales, and assess their reliability and validity. Survey with retest after approximately 2 weeks. Two hundred and twenty-four people were recruited from advertisements in local newspapers, an outpatient clinic waiting area, and a hospital open house. We developed and revised 5 items on interest in medical statistics and 3 on confidence understanding statistics. Study participants were mostly college graduates (52%); 25% had a high school education or less. The mean age was 53 (range 20 to 84) years. Most paid attention to medical statistics (6% paid no attention). The mean (SD) STAT-interest score was 68 (17) and ranged from 15 to 100. Confidence in using statistics was also high: the mean (SD) STAT-confidence score was 65 (19) and ranged from 11 to 100. STAT-interest and STAT-confidence scores were moderately correlated (r=.36, P<.001). Both scales demonstrated good test-retest repeatability (r=.60, .62, respectively), internal consistency reliability (Cronbach's alpha=0.70 and 0.78), and usability (individual item nonresponse ranged from 0% to 1.3%). Scale scores correlated only weakly with scores on a medical data interpretation test (r=.15 and .26, respectively). The STAT-interest and STAT-confidence scales are usable and reliable. Interest and confidence were only weakly related to the ability to actually use data.

  14. Patients and Medical Statistics

    PubMed Central

    Woloshin, Steven; Schwartz, Lisa M; Welch, H Gilbert

    2005-01-01

    BACKGROUND People are increasingly presented with medical statistics. There are no existing measures to assess their level of interest or confidence in using medical statistics. OBJECTIVE To develop 2 new measures, the STAT-interest and STAT-confidence scales, and assess their reliability and validity. DESIGN Survey with retest after approximately 2 weeks. SUBJECTS Two hundred and twenty-four people were recruited from advertisements in local newspapers, an outpatient clinic waiting area, and a hospital open house. MEASURES We developed and revised 5 items on interest in medical statistics and 3 on confidence understanding statistics. RESULTS Study participants were mostly college graduates (52%); 25% had a high school education or less. The mean age was 53 (range 20 to 84) years. Most paid attention to medical statistics (6% paid no attention). The mean (SD) STAT-interest score was 68 (17) and ranged from 15 to 100. Confidence in using statistics was also high: the mean (SD) STAT-confidence score was 65 (19) and ranged from 11 to 100. STAT-interest and STAT-confidence scores were moderately correlated (r=.36, P<.001). Both scales demonstrated good test–retest repeatability (r=.60, .62, respectively), internal consistency reliability (Cronbach's α=0.70 and 0.78), and usability (individual item nonresponse ranged from 0% to 1.3%). Scale scores correlated only weakly with scores on a medical data interpretation test (r=.15 and .26, respectively). CONCLUSION The STAT-interest and STAT-confidence scales are usable and reliable. Interest and confidence were only weakly related to the ability to actually use data. PMID:16307623

  15. Factors that Affect Operational Reliability of Turbojet Engines

    NASA Technical Reports Server (NTRS)

    1956-01-01

    The problem of improving operational reliability of turbojet engines is studied in a series of papers. Failure statistics for this engine are presented, the theory and experimental evidence on how engine failures occur are described, and the methods available for avoiding failure in operation are discussed. The individual papers of the series are Objectives, Failure Statistics, Foreign-Object Damage, Compressor Blades, Combustor Assembly, Nozzle Diaphragms, Turbine Buckets, Turbine Disks, Rolling Contact Bearings, Engine Fuel Controls, and Summary Discussion.

  16. Statistical complex fatigue data for SAE 4340 steel and its use in design by reliability

    NASA Technical Reports Server (NTRS)

    Kececioglu, D.; Smith, J. L.

    1970-01-01

    A brief description of the complex fatigue machines used in the test program is presented. The data generated from these machines are given and discussed. Two methods of obtaining strength distributions from the data are also discussed. Then follows a discussion of the construction of statistical fatigue diagrams and their use in designing by reliability. Finally, some of the problems encountered in the test equipment and a corrective modification are presented.

  17. Selecting statistical model and optimum maintenance policy: a case study of hydraulic pump.

    PubMed

    Ruhi, S; Karim, M R

    2016-01-01

    A proper maintenance policy can play a vital role in the effective investigation of product reliability. Every engineered object such as a product, plant or infrastructure needs preventive and corrective maintenance. In this paper we look at a real case study. It deals with the maintenance of hydraulic pumps used in excavators by a mining company. We obtain the data that the owner had collected and carry out an analysis, building models for pump failures. The data consist of both failure and censored lifetimes of the hydraulic pump. Different competing mixture models are applied to analyze a set of maintenance data of a hydraulic pump. Various characteristics of the mixture models, such as the cumulative distribution function, reliability function, mean time to failure, etc., are estimated to assess the reliability of the pump. The Akaike Information Criterion, adjusted Anderson-Darling test statistic, Kolmogorov-Smirnov test statistic and root mean square error are considered to select the most suitable among the competing models. The maximum likelihood estimation method via the EM algorithm is applied mainly for estimating the parameters of the models and reliability-related quantities. In this study, it is found that a threefold mixture model (Weibull-Normal-Exponential) fits the hydraulic pump failure data set well. This paper also illustrates how a suitable statistical model can be applied to estimate the optimum maintenance period at a minimum cost for a hydraulic pump.

  18. Precision, Reliability, and Effect Size of Slope Variance in Latent Growth Curve Models: Implications for Statistical Power Analysis

    PubMed Central

    Brandmaier, Andreas M.; von Oertzen, Timo; Ghisletta, Paolo; Lindenberger, Ulman; Hertzog, Christopher

    2018-01-01

    Latent Growth Curve Models (LGCM) have become a standard technique to model change over time. Prediction and explanation of inter-individual differences in change are major goals in lifespan research. The major determinants of statistical power to detect individual differences in change are the magnitude of true inter-individual differences in linear change (LGCM slope variance), design precision, alpha level, and sample size. Here, we show that design precision can be expressed as the inverse of effective error. Effective error is determined by instrument reliability and the temporal arrangement of measurement occasions. However, it also depends on another central LGCM component, the variance of the latent intercept and its covariance with the latent slope. We derive a new reliability index for LGCM slope variance—effective curve reliability (ECR)—by scaling slope variance against effective error. ECR is interpretable as a standardized effect size index. We demonstrate how effective error, ECR, and statistical power for a likelihood ratio test of zero slope variance formally relate to each other and how they function as indices of statistical power. We also provide a computational approach to derive ECR for arbitrary intercept-slope covariance. With practical use cases, we argue for the complementary utility of the proposed indices of a study's sensitivity to detect slope variance when making a priori longitudinal design decisions or communicating study designs. PMID:29755377

  19. Surface flaw reliability analysis of ceramic components with the SCARE finite element postprocessor program

    NASA Technical Reports Server (NTRS)

    Gyekenyesi, John P.; Nemeth, Noel N.

    1987-01-01

    The SCARE (Structural Ceramics Analysis and Reliability Evaluation) computer program on statistical fast fracture reliability analysis with quadratic elements for volume distributed imperfections is enhanced to include the use of linear finite elements and the capability of designing against concurrent surface flaw induced ceramic component failure. The SCARE code is presently coupled as a postprocessor to the MSC/NASTRAN general purpose, finite element analysis program. The improved version now includes the Weibull and Batdorf statistical failure theories for both surface and volume flaw based reliability analysis. The program uses the two-parameter Weibull fracture strength cumulative failure probability distribution model with the principle of independent action for poly-axial stress states, and Batdorf's shear-sensitive as well as shear-insensitive statistical theories. The shear-sensitive surface crack configurations include the Griffith crack and Griffith notch geometries, using the total critical coplanar strain energy release rate criterion to predict mixed-mode fracture. Weibull material parameters based on both surface and volume flaw induced fracture can also be calculated from modulus of rupture bar tests, using the least squares method with known specimen geometry and grouped fracture data. The statistical fast fracture theories for surface flaw induced failure, along with selected input and output formats and options, are summarized. An example problem to demonstrate various features of the program is included.
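
    The core fast-fracture computation is compact. Below is a uniaxial, volume-flaw-only sketch of the two-parameter Weibull model; the element stresses, volumes, and material parameters are fabricated, and SCARE additionally handles polyaxial stress states, surface flaws, and the Batdorf theories, which are omitted here.

```python
import numpy as np

def weibull_failure_probability(stresses, volumes, m, sigma_0):
    """Two-parameter Weibull volume-flaw failure probability:
        Pf = 1 - exp(-sum_i V_i * (sigma_i / sigma_0)**m)
    with Weibull modulus m and unit-volume scale parameter sigma_0.
    Compressive stresses are clipped: they do not open volume flaws."""
    s = np.clip(np.asarray(stresses, dtype=float), 0.0, None)
    v = np.asarray(volumes, dtype=float)
    return 1.0 - np.exp(-np.sum(v * (s / sigma_0) ** m))

# Hypothetical element principal stresses (MPa) and volumes (mm^3) from an FE run.
pf = weibull_failure_probability(
    stresses=[180, 220, 250, 160], volumes=[12.0, 8.0, 3.5, 20.0],
    m=10.0, sigma_0=400.0)
print(f"component failure probability: {pf:.4f}")
```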

  20. Integrating Formal Methods and Testing 2002

    NASA Technical Reports Server (NTRS)

    Cukic, Bojan

    2002-01-01

    Traditionally, qualitative program verification methodologies and program testing are studied in separate research communities. Neither of them alone is powerful and practical enough to provide sufficient confidence in ultra-high reliability assessment when used exclusively. Significant advances can be made by accounting not only for formal verification and program testing, but also for the impact of many other standard V&V techniques, in a unified software reliability assessment framework. The first year of this research resulted in a statistical framework that, given the assumptions on the success of the qualitative V&V and QA procedures, significantly reduces the amount of testing needed to confidently assess reliability at so-called high and ultra-high levels (failure rates of 10^-4 or better). The coming years shall address methodologies to realistically estimate the impacts of various V&V techniques on system reliability and include the impact of operational risk in reliability assessment. The objectives are to: A) combine formal correctness verification, process and product metrics, and other standard qualitative software assurance methods with statistical testing, with the aim of gaining higher confidence in software reliability assessment for high-assurance applications; B) quantify the impact of these methods on software reliability; C) demonstrate that accounting for the effectiveness of these methods reduces the number of tests needed to attain a certain confidence level; and D) quantify and justify the reliability estimate for systems developed using various methods.
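
    The scale of the problem is visible from the classical success-run formula: demonstrating reliability R at confidence C with no observed failures requires n = ln(1 - C) / ln(R) test executions. A quick check (values arbitrary) shows why credit from other V&V evidence matters:

```python
import math

def tests_required(reliability, confidence):
    """Failure-free test executions needed to demonstrate a reliability
    level at a given confidence (classical success-run formula)."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

for r in (0.999, 0.9999):
    print(f"R = {r}: {tests_required(r, 0.99):,} failure-free tests at 99% confidence")
```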

  1. Principle of maximum entropy for reliability analysis in the design of machine components

    NASA Astrophysics Data System (ADS)

    Zhang, Yimin

    2018-03-01

    We studied the reliability of machine components with parameters that follow an arbitrary statistical distribution using the principle of maximum entropy (PME). We used PME to select the statistical distribution that best fits the available information. We also established a probability density function (PDF) and a failure probability model for the parameters of mechanical components using the concept of entropy and the PME. We obtained the first four moments of the state function for reliability analysis and design. Furthermore, we attained an estimate of the PDF with the fewest human bias factors using the PME. This function was used to calculate the reliability of the machine components, including a connecting rod, a vehicle half-shaft, a front axle, a rear axle housing, and a leaf spring, which have parameters that typically follow a non-normal distribution. Simulations were conducted for comparison. This study provides a design methodology for the reliability of mechanical components for practical engineering projects.

  2. NDE reliability and probability of detection (POD) evolution and paradigm shift

    NASA Astrophysics Data System (ADS)

    Singh, Surendra

    2014-02-01

    The subject of NDE Reliability and POD has gone through multiple phases since its humble beginning in the late 1960s. This was followed by several programs, including an important one nicknamed "Have Cracks - Will Travel," or in short "Have Cracks," by Lockheed Georgia Company for the US Air Force during 1974-1978. This and other studies ultimately led to a series of developments in the field of reliability and POD, from the introduction of fracture mechanics and Damage Tolerant Design (DTD), to the statistical framework of Berens and Hovey in 1981 for POD estimation, to MIL-HDBK-1823 (1999) and 1823A (2009). During the last decade, various groups and researchers have further studied reliability and POD using Model Assisted POD (MAPOD), Simulation Assisted POD (SAPOD), and Bayesian statistics. Each of these developments had one objective, i.e., improving the accuracy of life prediction in components, which to a large extent depends on the reliability and capability of NDE methods. Therefore, it is essential to have reliable detection and sizing of large flaws in components. Currently, POD is used for studying the reliability and capability of NDE methods, though POD data offer no absolute truth regarding NDE reliability, i.e., system capability, effects of flaw morphology, and quantification of human factors. Furthermore, reliability and POD have often been treated as synonymous, but POD is not NDE reliability. POD is a subset of reliability, which consists of six phases: 1) sample selection using DOE, 2) NDE equipment setup and calibration, 3) System Measurement Evaluation (SME) including Gage Repeatability & Reproducibility (Gage R&R) and Analysis Of Variance (ANOVA), 4) NDE system capability and electronic and physical saturation, 5) acquiring data, fitting it to a model, and data analysis, and 6) POD estimation. This paper provides an overview of all major POD milestones of the last several decades and discusses the rationale for using Integrated Computational Materials Engineering (ICME), MAPOD, SAPOD, and Bayesian statistics for studying controllable and non-controllable variables, including human factors, when estimating POD. Another objective is to list gaps between "hoped-for" and validated capability, as revealed by fielded hardware failures.

  3. Solar-cell interconnect design for terrestrial photovoltaic modules

    NASA Technical Reports Server (NTRS)

    Mon, G. R.; Moore, D. M.; Ross, R. G., Jr.

    1984-01-01

    Useful solar cell interconnect reliability design and life prediction algorithms are presented, together with experimental data indicating that the classical strain cycle (fatigue) curve for the interconnect material does not account for the statistical scatter that is required in reliability predictions. This shortcoming is presently addressed by fitting a functional form to experimental cumulative interconnect failure rate data, which thereby yields statistical fatigue curves enabling not only the prediction of cumulative interconnect failures during the design life of an array field, but also the quantitative interpretation of data from accelerated thermal cycling tests. Optimal interconnect cost reliability design algorithms are also derived which may allow the minimization of energy cost over the design life of the array field.

  4. Solar-cell interconnect design for terrestrial photovoltaic modules

    NASA Astrophysics Data System (ADS)

    Mon, G. R.; Moore, D. M.; Ross, R. G., Jr.

    1984-11-01

    Useful solar cell interconnect reliability design and life prediction algorithms are presented, together with experimental data indicating that the classical strain cycle (fatigue) curve for the interconnect material does not account for the statistical scatter that is required in reliability predictions. This shortcoming is presently addressed by fitting a functional form to experimental cumulative interconnect failure rate data, which thereby yields statistical fatigue curves enabling not only the prediction of cumulative interconnect failures during the design life of an array field, but also the quantitative interpretation of data from accelerated thermal cycling tests. Optimal interconnect cost reliability design algorithms are also derived which may allow the minimization of energy cost over the design life of the array field.

  5. Small numbers, disclosure risk, security, and reliability issues in Web-based data query systems.

    PubMed

    Rudolph, Barbara A; Shah, Gulzar H; Love, Denise

    2006-01-01

    This article describes the process for developing consensus guidelines and tools for releasing public health data via the Web and highlights approaches leading agencies have taken to balance disclosure risk with public dissemination of reliable health statistics. An agency's choice of statistical methods for improving the reliability of released data for Web-based query systems is based upon a number of factors, including query system design (dynamic analysis vs preaggregated data and tables), population size, cell size, data use, and how data will be supplied to users. The article also describes those efforts that are necessary to reduce the risk of disclosure of an individual's protected health information.

  6. Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, paranoid personality disorder diagnosis: a unitary or a two-dimensional construct?

    PubMed

    Falkum, Erik; Pedersen, Geir; Karterud, Sigmund

    2009-01-01

    This article examines reliability and validity aspects of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) paranoid personality disorder (PPD) diagnosis. Patients with personality disorders (n = 930) from the Norwegian network of psychotherapeutic day hospitals, of whom 114 had PPD, were included in the study. Frequency distributions, chi-square tests, correlations, reliability statistics, and exploratory and confirmatory factor analyses were performed. The distribution of PPD criteria revealed no distinct boundary between patients with and without PPD. Diagnostic category membership was obtained in 37 of 64 theoretically possible ways. The PPD criteria formed a separate factor in a principal component analysis, whereas a confirmatory factor analysis indicated that the DSM-IV PPD construct consists of two separate dimensions: suspiciousness and hostility. The reliability of the unitary PPD scale was only 0.70, probably due in part to the apparent two-dimensionality of the construct. Persistent unwarranted doubts about the loyalty of friends had the highest diagnostic efficiency, whereas unwarranted accusations of infidelity of a partner had particularly poor indicator properties. The reliability and validity of the unitary PPD construct may be questioned. The two-dimensional PPD model should be further explored.

  7. Statistical Considerations in Choosing a Test Reliability Coefficient. ACT Research Report Series, 2012 (10)

    ERIC Educational Resources Information Center

    Woodruff, David; Wu, Yi-Fang

    2012-01-01

    The purpose of this paper is to illustrate alpha's robustness and usefulness, using actual and simulated educational test data. The sampling properties of alpha are compared with the sampling properties of several other reliability coefficients: Guttman's λ2, λ4, and λ6; test-retest reliability;…

  8. Final Report: Studies in Structural, Stochastic and Statistical Reliability for Communication Networks and Engineered Systems

    DTIC Science & Technology

    …to do so, and (5) three distinct versions of the problem of estimating component reliability from system failure-time data are treated, each resulting in consistent estimators with asymptotically normal distributions.

  9. Test-retest reliability of biodex system 4 pro for isometric ankle-eversion and -inversion measurement.

    PubMed

    Tankevicius, Gediminas; Lankaite, Doanata; Krisciunas, Aleksandras

    2013-08-01

    The lack of knowledge about isometric ankle testing indicates the need for research in this area. The objectives were to assess test-retest reliability and to determine the optimal position for isometric ankle-eversion and -inversion testing. Test-retest reliability study. Isometric ankle eversion and inversion were assessed in 3 different dynamometer foot-plate positions: 0°, 7°, and 14° of inversion. Two maximal repetitions were performed at each angle. Both limbs were tested (40 ankles in total). The test was performed twice, with a period of 7 d between sessions. University hospital. The study was carried out on 20 healthy athletes with no history of ankle sprains. Reliability was assessed using the intraclass correlation coefficient (ICC2,1); minimal detectable change (MDC) was calculated using a 95% confidence interval. A paired t test was used to assess changes between measurements, and P < .05 was considered statistically significant. Eversion and inversion peak torques showed high ICCs at all 3 angles (ICC values .87-.96, MDC values 3.09-6.81 Nm). Eversion peak torque was smallest when testing at the 0° angle and gradually increased, reaching maximum values at the 14° angle. The increase in eversion peak torque was statistically significant at 7° and 14° of inversion. Inversion peak torque showed the opposite pattern: it was smallest when measured at the 14° angle and increased at the other 2 angles; statistically significant changes were seen only between measures taken at 0° and 14°. Isometric eversion and inversion testing using the Biodex 4 Pro system is a reliable method. The authors suggest that the angle of 7° of inversion is best for isometric eversion and inversion testing.
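
    The reported reliability quantities chain together simply: ICC(2,1) from a two-way ANOVA decomposition, SEM = SD * sqrt(1 - ICC), and MDC95 = 1.96 * sqrt(2) * SEM. A sketch with fabricated torques:

```python
import numpy as np

def icc_2_1(y):
    """ICC(2,1): two-way random effects, absolute agreement, single measure."""
    n, k = y.shape
    grand = y.mean()
    msr = k * ((y.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # rows (subjects)
    msc = n * ((y.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # columns (sessions)
    sse = ((y - grand) ** 2).sum() - (n - 1) * msr - (k - 1) * msc
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical test/retest peak torques (Nm) for 8 ankles.
y = np.array([[21.5, 22.0], [18.2, 17.6], [25.1, 24.3], [30.4, 31.0],
              [19.9, 20.5], [27.3, 26.1], [23.0, 23.4], [16.8, 17.5]])
icc = icc_2_1(y)
sem = y.std(ddof=1) * np.sqrt(1 - icc)     # standard error of measurement
mdc95 = 1.96 * np.sqrt(2) * sem            # minimal detectable change
print(f"ICC(2,1) = {icc:.2f}, MDC95 = {mdc95:.2f} Nm")
```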

  10. 76 FR 34385 - Program Integrity: Gainful Employment-Debt Measures

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-13

    ... postsecondary education at a public institution. National Center for Education Statistics, 2004/2009 Beginning... reliable earnings information, including use of State data, survey data, or Bureau of Labor Statistics (BLS...

  11. Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial

    PubMed Central

    Hallgren, Kevin A.

    2012-01-01

    Many research designs require the assessment of inter-rater reliability (IRR) to demonstrate consistency among observational ratings provided by multiple coders. However, many studies use incorrect statistical procedures, fail to fully report the information necessary to interpret their results, or do not address how IRR affects the power of their subsequent analyses for hypothesis testing. This paper provides an overview of methodological issues related to the assessment of IRR with a focus on study design, selection of appropriate statistics, and the computation, interpretation, and reporting of some commonly-used IRR statistics. Computational examples include SPSS and R syntax for computing Cohen’s kappa and intra-class correlations to assess IRR. PMID:22833776

  12. The limits of protein sequence comparison?

    PubMed Central

    Pearson, William R; Sierk, Michael L

    2010-01-01

    Modern sequence alignment algorithms are used routinely to identify homologous proteins, proteins that share a common ancestor. Homologous proteins always share similar structures and often have similar functions. Over the past 20 years, sequence comparison has become both more sensitive, largely because of profile-based methods, and more reliable, because of more accurate statistical estimates. As sequence and structure databases become larger, and comparison methods become more powerful, reliable statistical estimates will become even more important for distinguishing similarities that are due to homology from those that are due to analogy (convergence). The newest sequence alignment methods are more sensitive than older methods, but more accurate statistical estimates are needed for their full power to be realized. PMID:15919194

  13. A reliability study on brain activation during active and passive arm movements supported by an MRI-compatible robot.

    PubMed

    Estévez, Natalia; Yu, Ningbo; Brügger, Mike; Villiger, Michael; Hepp-Reymond, Marie-Claude; Riener, Robert; Kollias, Spyros

    2014-11-01

    In neurorehabilitation, longitudinal assessment of arm movement related brain function in patients with motor disability is challenging due to variability in task performance. MRI-compatible robots monitor and control task performance, yielding more reliable evaluation of brain function over time. The main goals of the present study were first to define the brain network activated while performing active and passive elbow movements with an MRI-compatible arm robot (MaRIA) in healthy subjects, and second to test the reproducibility of this activation over time. For the fMRI analysis two models were compared. In model 1 movement onset and duration were included, whereas in model 2 force and range of motion were added to the analysis. Reliability of brain activation was tested with several statistical approaches applied on individual and group activation maps and on summary statistics. The activated network included mainly the primary motor cortex, primary and secondary somatosensory cortex, superior and inferior parietal cortex, medial and lateral premotor regions, and subcortical structures. Reliability analyses revealed robust activation for active movements with both fMRI models and all the statistical methods used. Imposed passive movements also elicited mainly robust brain activation for individual and group activation maps, and reliability was improved by including additional force and range of motion using model 2. These findings demonstrate that the use of robotic devices, such as MaRIA, can be useful to reliably assess arm movement related brain activation in longitudinal studies and may contribute in studies evaluating therapies and brain plasticity following injury in the nervous system.

  14. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model.

    PubMed

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-obtained images over plaster models for the assessment of mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data and P < 0.05 was considered statistically significant. Statistically significant results were obtained on data comparison between CBCT-derived images and plaster model; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. CBCT-derived images were less reliable as compared to data obtained directly from plaster model for mixed dentition analysis.
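
    The Tanaka-Johnston analysis used above is a simple linear prediction, so it is easy to sketch; the equations below assume the commonly cited form (half the summed mesiodistal widths of the four mandibular incisors plus 10.5 mm per mandibular quadrant, or plus 11.0 mm per maxillary quadrant), and the input measurement is hypothetical.

        def tanaka_johnston(lower_incisor_sum_mm):
            """Predicted combined mesiodistal width (mm) of the unerupted
            canine and premolars in one quadrant."""
            mandibular = lower_incisor_sum_mm / 2 + 10.5
            maxillary = lower_incisor_sum_mm / 2 + 11.0
            return mandibular, maxillary

        print(tanaka_johnston(23.0))  # hypothetical 23 mm incisor sum -> (22.0, 22.5)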

  15. Statistical summaries of fatigue data for design purposes

    NASA Technical Reports Server (NTRS)

    Wirsching, P. H.

    1983-01-01

    Two methods are discussed for constructing a design curve on the safe side of fatigue data. Both the tolerance interval and equivalent prediction interval (EPI) concepts provide such a curve while accounting for both the distribution of the estimators in small samples and the data scatter. The EPI is also useful as a mechanism for providing necessary statistics on S-N data for a full reliability analysis which includes uncertainty in all fatigue design factors. Examples of statistical analyses of the general strain life relationship are presented. The tolerance limit and EPI techniques for defining a design curve are demonstrated. Examples using WASPALOY B and RQC-100 data demonstrate that a reliability model could be constructed by considering the fatigue strength and fatigue ductility coefficients as two independent random variables. A technique given for establishing the fatigue strength for high cycle lives relies on an extrapolation technique and also accounts for "runouts." A reliability model or design value can be specified.
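
    For the tolerance-interval side of the abstract, the sketch below computes a one-sided lower tolerance bound on log fatigue life using the standard noncentral-t construction for normal data; it is a minimal illustration of the design-curve idea, not the paper's EPI procedure, and the lives are hypothetical.

        import numpy as np
        from scipy import stats

        def lower_tolerance_bound(x, coverage=0.95, confidence=0.95):
            """One-sided lower tolerance bound for normal data: with the given
            confidence, at least `coverage` of the population lies above it."""
            x = np.asarray(x, dtype=float)
            n = x.size
            nc = stats.norm.ppf(coverage) * np.sqrt(n)
            k = stats.nct.ppf(confidence, df=n - 1, nc=nc) / np.sqrt(n)
            return x.mean() - k * x.std(ddof=1)

        # hypothetical fatigue lives (cycles), analyzed on a log scale
        log_lives = np.log10([1.2e5, 2.3e5, 1.8e5, 3.1e5, 2.7e5, 1.5e5])
        print(10 ** lower_tolerance_bound(log_lives))  # design life on the safe side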

  16. Implementing the Routine Computation and Use of Roadway Performance Measures within WSDOT

    DOT National Transportation Integrated Search

    2017-08-01

    The Washington State Department of Transportation (WSDOT) is one of the nation's leaders in calculating and using reliability statistics for urban freeways. The Department currently uses reliability measures for decision making for urban freeways-whe...

  17. Computing Reliabilities Of Ceramic Components Subject To Fracture

    NASA Technical Reports Server (NTRS)

    Nemeth, N. N.; Gyekenyesi, J. P.; Manderscheid, J. M.

    1992-01-01

    CARES calculates fast-fracture reliability or failure probability of macroscopically isotropic ceramic components. Program uses results from commercial structural-analysis program (MSC/NASTRAN or ANSYS) to evaluate reliability of component in presence of inherent surface- and/or volume-type flaws. Computes measure of reliability by use of finite-element mathematical model applicable to multiple materials, in the sense that model is made a function of statistical characterizations of many ceramic materials. Reliability analysis uses element stress, temperature, area, and volume outputs, obtained from two-dimensional shell and three-dimensional solid isoparametric or axisymmetric finite elements. Written in FORTRAN 77.
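
    The statistical kernel behind such codes is the Weibull weakest-link form; below is a minimal sketch for the special case of a uniformly stressed volume (CARES itself integrates element-by-element over finite-element stress, temperature, area, and volume output). The parameter values are hypothetical.

        import math

        def weibull_failure_probability(stress, sigma0, m, volume, v0=1.0):
            """Fast-fracture probability for a uniformly stressed volume:
            Pf = 1 - exp(-(V/V0) * (sigma/sigma0)**m)."""
            return 1.0 - math.exp(-(volume / v0) * (stress / sigma0) ** m)

        # hypothetical ceramic: characteristic strength 300 MPa, Weibull modulus 10
        print(weibull_failure_probability(stress=200.0, sigma0=300.0, m=10, volume=2.0))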

  18. The influence of various test plans on mission reliability. [for Shuttle Spacelab payloads

    NASA Technical Reports Server (NTRS)

    Stahle, C. V.; Gongloff, H. R.; Young, J. P.; Keegan, W. B.

    1977-01-01

    Methods have been developed for the evaluation of cost effective vibroacoustic test plans for Shuttle Spacelab payloads. The shock and vibration environments of components have been statistically represented, and statistical decision theory has been used to evaluate the cost effectiveness of five basic test plans with structural test options for two of the plans. Component, subassembly, and payload testing have been performed for each plan along with calculations of optimum test levels and expected costs. The tests have been ranked according to both minimizing expected project costs and vibroacoustic reliability. It was found that optimum costs may vary up to $6 million with the lowest plan eliminating component testing and maintaining flight vibration reliability via subassembly tests at high acoustic levels.

  19. 7 CFR 785.8 - Reports by qualifying States receiving mediation grant funds.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... reliable statistical data as may be obtained from State statistical sources including the certified State's bar association, State Department of Agriculture, State court system or Better Business Bureau, or...

  20. Statistical Knowledge and the Over-Interpretation of Student Evaluations of Teaching

    ERIC Educational Resources Information Center

    Boysen, Guy A.

    2017-01-01

    Research shows that teachers interpret small differences in student evaluations of teaching as meaningful even when available statistical information indicates that the differences are not reliable. The current research explored the effect of statistical training on college teachers' tendency to over-interpret student evaluation differences. A…

  1. Software Reliability, Measurement, and Testing. Volume 2. Guidebook for Software Reliability Measurement and Testing

    DTIC Science & Technology

    1992-04-01

    contractor’s existing data collection, analysis and corrective action system shall be utilized, with modification only as necessary to meet the ... either from test or from analysis of field data. The procedures of MIL-STD-756B assume that the reliability of a ... to generate sufficient data to report a statistically valid reliability figure for a class of software. Casual data gathering accumulates data more

  2. Statistical validity of using ratio variables in human kinetics research.

    PubMed

    Liu, Yuanlong; Schutz, Robert W

    2003-09-01

    The purposes of this study were to investigate the validity of the simple ratio and three alternative deflation models and examine how the variation of the numerator and denominator variables affects the reliability of a ratio variable. A simple ratio and three alternative deflation models were fitted to four empirical data sets, and common criteria were applied to determine the best model for deflation. Intraclass correlation was used to examine the component effect on the reliability of a ratio variable. The results indicate that the validity of a deflation model depends on the statistical characteristics of the particular component variables used, and an optimal deflation model for all ratio variables may not exist. Therefore, it is recommended that different models be fitted to each empirical data set to determine the best deflation model. It was found that the reliability of a simple ratio is affected by the coefficients of variation and the within- and between-trial correlations between the numerator and denominator variables. It was recommended that researchers should compute the reliability of the derived ratio scores and not assume that strong reliabilities in the numerator and denominator measures automatically lead to high reliability in the ratio measures.
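
    The intraclass correlation used in that component analysis can be illustrated with the one-way random-effects ICC(1,1) computed from a subjects-by-trials score matrix; a minimal sketch with hypothetical numbers, not the authors' exact model.

        import numpy as np

        def icc_oneway(scores):
            """ICC(1,1) from a (subjects x trials) array via one-way ANOVA."""
            scores = np.asarray(scores, dtype=float)
            n, k = scores.shape
            grand = scores.mean()
            msb = k * ((scores.mean(axis=1) - grand) ** 2).sum() / (n - 1)
            msw = ((scores - scores.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
            return (msb - msw) / (msb + (k - 1) * msw)

        trials = [[10.1, 10.3], [12.0, 11.7], [9.5, 9.9], [11.2, 11.0]]  # hypothetical
        print(round(icc_oneway(trials), 3))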

  3. Inter-rater reliability of the Sødring Motor Evaluation of Stroke patients (SMES).

    PubMed

    Halsaa, K E; Sødring, K M; Bjelland, E; Finsrud, K; Bautz-Holter, E

    1999-12-01

    The Sødring Motor Evaluation of Stroke patients is an instrument for physiotherapists to evaluate motor function and activities in stroke patients. The rating reflects quality as well as quantity of the patient's unassisted performance within three domains: leg, arm and gross function. The inter-rater reliability of the method was studied in a sample of 30 patients admitted to a stroke rehabilitation unit. Three therapists were involved in the study; two therapists assessed the same patient on two consecutive days in a balanced design. Cohen's weighted kappa and McNemar's test of symmetry were used as measures of item reliability, and the intraclass correlation coefficient was used to express the reliability of the sumscores. For 24 out of 32 items the weighted kappa statistic was excellent (0.75-0.98), while 7 items had a kappa statistic within the range 0.53-0.74 (fair to good). The reliability of one item was poor (0.13). The intraclass correlation coefficient for the three sumscores was 0.97, 0.91 and 0.97. We conclude that the Sødring Motor Evaluation of Stroke patients is a reliable measure of motor function in stroke patients undergoing rehabilitation.

  4. Statistical model selection for better prediction and discovering science mechanisms that affect reliability

    DOE PAGES

    Anderson-Cook, Christine M.; Morzinski, Jerome; Blecker, Kenneth D.

    2015-08-19

    Understanding the impact of production, environmental exposure and age characteristics on the reliability of a population is frequently based on underlying science and empirical assessment. When there is incomplete science to prescribe which inputs should be included in a model of reliability to predict future trends, statistical model/variable selection techniques can be leveraged on a stockpile or population of units to improve reliability predictions as well as suggest new mechanisms affecting reliability to explore. We describe a five-step process for exploring relationships between available summaries of age, usage and environmental exposure and reliability. The process involves first identifying potential candidate inputs, then second organizing data for the analysis. Third, a variety of models with different combinations of the inputs are estimated, and fourth, flexible metrics are used to compare them. Fifth, plots of the predicted relationships are examined to distill leading model contenders into a prioritized list for subject matter experts to understand and compare. The complexity of the model, quality of prediction and cost of future data collection are all factors to be considered by the subject matter experts when selecting a final model.
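
    Steps three and four, estimating models for many input combinations and comparing them with a flexible metric, can be sketched as an all-subsets linear regression ranked by AIC. The input names and data below are hypothetical stand-ins for the age, usage and exposure summaries the abstract describes.

        from itertools import combinations
        import numpy as np

        def aic_linear(y, X):
            """AIC of an ordinary least-squares fit (Gaussian likelihood)."""
            n, p = X.shape
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            rss = ((y - X @ beta) ** 2).sum()
            return n * np.log(rss / n) + 2 * (p + 1)  # +1 for the noise variance

        rng = np.random.default_rng(0)
        names = ["age", "usage", "humidity", "thermal_cycles"]  # hypothetical inputs
        X = rng.normal(size=(40, 4))
        y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=40)

        ranked = []
        for r in range(1, len(names) + 1):
            for subset in combinations(range(len(names)), r):
                Xs = np.column_stack([np.ones(40)] + [X[:, j] for j in subset])
                ranked.append((aic_linear(y, Xs), [names[j] for j in subset]))
        for aic, subset in sorted(ranked)[:3]:  # top contenders for expert review
            print(round(aic, 1), subset)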

  5. The Future of Statistical Software. Proceedings of a Forum--Panel on Guidelines for Statistical Software (Washington, D.C., February 22, 1991).

    ERIC Educational Resources Information Center

    National Academy of Sciences - National Research Council, Washington, DC.

    The Panel on Guidelines for Statistical Software was organized in 1990 to document, assess, and prioritize problem areas regarding quality and reliability of statistical software; present prototype guidelines in high priority areas; and make recommendations for further research and discussion. This document provides the following papers presented…

  6. Modeling, implementation, and validation of arterial travel time reliability : [summary].

    DOT National Transportation Integrated Search

    2013-11-01

    Travel time reliability (TTR) has been proposed as a better measure of a facility's performance than a statistical measure like peak hour demand. TTR is based on more information about average traffic flows and longer time periods, thus inc...

  7. NDE reliability and probability of detection (POD) evolution and paradigm shift

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singh, Surendra

    2014-02-18

    The subject of NDE reliability and POD has gone through multiple phases since its humble beginning in the late 1960s. This was followed by several programs, including the important one nicknamed “Have Cracks – Will Travel”, or in short “Have Cracks”, by Lockheed Georgia Company for the US Air Force during 1974–1978. This and other studies ultimately led to a series of developments in the field of reliability and POD, from the introduction of fracture mechanics and Damage Tolerant Design (DTD), to the statistical framework of Berens and Hovey in 1981 for POD estimation, to MIL-HDBK-1823 (1999) and MIL-HDBK-1823A (2009). During the last decade, various groups and researchers have further studied reliability and POD using Model Assisted POD (MAPOD), Simulation Assisted POD (SAPOD), and applying Bayesian statistics. Each of these developments had one objective, i.e., improving the accuracy of life prediction in components, which to a large extent depends on the reliability and capability of NDE methods. Therefore, it is essential to have reliable detection and sizing of large flaws in components. Currently, POD is used for studying the reliability and capability of NDE methods, though POD data offer no absolute truth regarding NDE reliability, i.e., system capability, effects of flaw morphology, and quantification of human factors. Furthermore, reliability and POD have often been treated as alike in meaning, but POD is not NDE reliability. POD is a subset of reliability that consists of six phases: 1) sample selection using DOE, 2) NDE equipment setup and calibration, 3) System Measurement Evaluation (SME) including Gage Repeatability and Reproducibility (Gage R and R) and Analysis Of Variance (ANOVA), 4) NDE system capability and electronic and physical saturation, 5) acquiring and fitting data to a model, and data analysis, and 6) POD estimation. This paper provides an overview of all major POD milestones of the last several decades and discusses the rationale for using Integrated Computational Materials Engineering (ICME), MAPOD, SAPOD, and Bayesian statistics for studying controllable and non-controllable variables, including human factors, for estimating POD. Another objective is to list gaps between “hoped for” versus validated or fielded failed hardware.
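
    A common model-based POD formulation from the MIL-HDBK-1823A era fits a logistic curve in log flaw size to hit/miss data; the sketch below is a bare maximum-likelihood fit with hypothetical inspection results. A fielded a90/95 value would also need a confidence bound on the fitted curve (e.g., by likelihood ratio), which is omitted here.

        import numpy as np
        from scipy.optimize import minimize

        def fit_pod(sizes, hits):
            """Fit POD(a) = 1 / (1 + exp(-(b0 + b1*ln a))) by maximum likelihood."""
            x = np.log(np.asarray(sizes, dtype=float))
            y = np.asarray(hits, dtype=float)

            def nll(beta):  # negative log-likelihood of the logistic model
                z = beta[0] + beta[1] * x
                return np.sum(y * np.logaddexp(0.0, -z) + (1 - y) * np.logaddexp(0.0, z))

            return minimize(nll, x0=np.zeros(2), method="Nelder-Mead").x

        sizes = [0.5, 0.8, 1.0, 1.3, 1.6, 2.0, 2.5, 3.0, 3.5, 4.0]  # hypothetical, mm
        hits = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
        b0, b1 = fit_pod(sizes, hits)
        a90 = np.exp((np.log(9.0) - b0) / b1)  # logit(0.9) = ln(0.9/0.1)
        print(round(a90, 2))  # flaw size detected with 90% probability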

  8. Reliability-based econometrics of aerospace structural systems: Design criteria and test options. Ph.D. Thesis - Georgia Inst. of Tech.

    NASA Technical Reports Server (NTRS)

    Thomas, J. M.; Hanagud, S.

    1974-01-01

    The design criteria and test options for aerospace structural reliability were investigated. A decision methodology was developed for selecting a combination of structural tests and structural design factors. The decision method involves the use of Bayesian statistics and statistical decision theory. Procedures are discussed for obtaining and updating data-based probabilistic strength distributions for aerospace structures when test information is available and for obtaining subjective distributions when data are not available. The techniques used in developing the distributions are explained.
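
    The flavor of the Bayesian updating described here can be shown with the simplest conjugate case: a beta prior on the probability that a structure survives a proof test, updated by pass/fail outcomes. This is only a sketch with hypothetical counts, not the dissertation's decision-theoretic machinery.

        from scipy.stats import beta

        def update_reliability(prior_a, prior_b, successes, failures):
            """Conjugate beta-binomial update from a pass/fail test series."""
            return prior_a + successes, prior_b + failures

        a, b = update_reliability(prior_a=1.0, prior_b=1.0, successes=9, failures=1)
        print(beta.mean(a, b), beta.ppf(0.05, a, b))  # posterior mean, 5th percentile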

  9. Development of modelling algorithm of technological systems by statistical tests

    NASA Astrophysics Data System (ADS)

    Shemshura, E. A.; Otrokov, A. V.; Chernyh, V. G.

    2018-03-01

    The paper tackles the problem of economic assessment of design efficiency for various technological systems at the stage of their operation. A modelling algorithm for a technological system, built on statistical tests and taking account of the reliability index, allows estimating the level of machinery technical excellence and defining the efficiency of design reliability against performance. Economic feasibility of its application is determined on the basis of the service quality of the technological system, with further forecasting of the volumes and range of spare parts supply.

  10. Small sample estimation of the reliability function for technical products

    NASA Astrophysics Data System (ADS)

    Lyamets, L. L.; Yakimenko, I. V.; Kanishchev, O. A.; Bliznyuk, O. A.

    2017-12-01

    It is demonstrated that, in the absence of large statistical samples from failure testing of complex technical products, statistical estimation of the reliability function of initial elements can be made by the method of moments. A formal description of the method of moments is given and its advantages in the analysis of small censored samples are discussed. A modified algorithm is proposed for the implementation of the method with the use of only the moments at which the failures of initial elements occur.
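
    A minimal sketch of the moment-matching idea for an uncensored sample: fit a gamma life distribution by equating the sample mean and variance to their model counterparts. The paper's modified algorithm for small censored samples is more involved; the failure times are hypothetical.

        import numpy as np

        def gamma_moments_fit(times):
            """Method-of-moments gamma fit: mean = shape*scale,
            variance = shape*scale**2."""
            t = np.asarray(times, dtype=float)
            mean, var = t.mean(), t.var(ddof=1)
            return mean ** 2 / var, var / mean  # (shape, scale)

        print(gamma_moments_fit([120.0, 340.0, 95.0, 210.0, 160.0]))  # hypothetical hours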

  11. Developing Confidence Limits For Reliability Of Software

    NASA Technical Reports Server (NTRS)

    Hayhurst, Kelly J.

    1991-01-01

    Technique developed for estimating reliability of software by use of Moranda geometric de-eutrophication model. Pivotal method enables straightforward construction of exact bounds with associated degree of statistical confidence about reliability of software. Confidence limits thus derived provide precise means of assessing quality of software. Limits take into account number of bugs found while testing and effects of sampling variation associated with random order of discovering bugs.
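
    In the Moranda geometric de-eutrophication model the i-th inter-failure time is exponential with rate D*k**(i-1), so the failure rate shrinks by a factor k at each fix. The sketch below is a plain maximum-likelihood point fit of that model to hypothetical inter-failure times; it does not reproduce the pivotal exact-confidence-bound construction this brief describes.

        import numpy as np
        from scipy.optimize import minimize

        def fit_geometric_model(interfailure_times):
            """ML fit of rates D * k**(i-1) for successive exponential gaps."""
            t = np.asarray(interfailure_times, dtype=float)
            i = np.arange(t.size)

            def nll(params):  # work with logs to keep D and k positive
                log_d, log_k = params
                rate = np.exp(log_d + log_k * i)
                return -np.sum(np.log(rate) - rate * t)

            log_d, log_k = minimize(nll, x0=[0.0, -0.1], method="Nelder-Mead").x
            return np.exp(log_d), np.exp(log_k)  # (D, k)

        gaps = [5.0, 7.0, 12.0, 20.0, 28.0, 45.0]  # hypothetical, lengthening gaps
        print(fit_geometric_model(gaps))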

  12. Brief Report: Interrater Reliability of Clinical Diagnosis and DSM-IV Criteria for Autistic Disorder: Results of the DSM-IV Autism Field Trial.

    ERIC Educational Resources Information Center

    Klin, Ami; Lang, Jason; Cicchetti, Domenic V.; Volkmar, Fred R.

    2000-01-01

    This study examined the inter-rater reliability of clinician-assigned diagnosis of autism using or not using the criteria specified in the Diagnostic and Statistical Manual IV (DSM-IV). For experienced raters there was little difference in reliability in the two conditions. However, a clinically significant improvement in diagnostic reliability…

  13. Software For Computing Reliability Of Other Software

    NASA Technical Reports Server (NTRS)

    Nikora, Allen; Antczak, Thomas M.; Lyu, Michael

    1995-01-01

    Computer Aided Software Reliability Estimation (CASRE) computer program developed for use in measuring reliability of other software. Easier for non-specialists in reliability to use than many other currently available programs developed for same purpose. CASRE incorporates mathematical modeling capabilities of public-domain Statistical Modeling and Estimation of Reliability Functions for Software (SMERFS) computer program and runs in Windows software environment. Provides menu-driven command interface; enabling and disabling of menu options guides user through (1) selection of set of failure data, (2) execution of mathematical model, and (3) analysis of results from model. Written in C language.

  14. Topics in Measurement: Reliability and Validity.

    ERIC Educational Resources Information Center

    Dick, Walter; Hagerty, Nancy

    This text was developed on an autoinstructional basis to familiarize the reader with the various interpretations of reliability and validity, their measurement and evaluation, and factors influencing their measurement. The text enables those with prior knowledge of statistics to increase their understanding of variance and correlation. Review…

  15. The Attenuation of Correlation Coefficients: A Statistical Literacy Issue

    ERIC Educational Resources Information Center

    Trafimow, David

    2016-01-01

    Much of the science reported in the media depends on correlation coefficients. But the size of correlation coefficients depends, in part, on the reliability with which the correlated variables are measured. Understanding this is a statistical literacy issue.

  16. PERFORMANCE OF TRICKLING FILTER PLANTS: RELIABILITY, STABILITY, VARIABILITY

    EPA Science Inventory

    Effluent quality variability from trickling filters was examined in this study by statistically analyzing daily effluent BOD5 and suspended solids data from 11 treatment plants. Summary statistics (mean, standard deviation, etc.) were examined to determine the general characteris...

  17. Increasing the reliability of the fluid/crystallized difference score from the Kaufman Adolescent and Adult Intelligence Test with reliable component analysis.

    PubMed

    Caruso, J C

    2001-06-01

    The unreliability of difference scores is a well-documented phenomenon in the social sciences and has led researchers and practitioners to interpret differences cautiously, if at all. In the case of the Kaufman Adolescent and Adult Intelligence Test (KAIT), the unreliability of the difference between the Fluid IQ and the Crystallized IQ is due to the high correlation between the two scales. The consequences of the lack of precision with which differences are identified are wide confidence intervals and unpowerful significance tests (i.e., large differences are required to be declared statistically significant). Reliable component analysis (RCA) was performed on the subtests of the KAIT in order to address these problems. RCA is a new data reduction technique that results in uncorrelated component scores with maximum proportions of reliable variance. Results indicate that the scores defined by RCA have discriminant and convergent validity (with respect to the equally weighted scores) and that differences between the scores, derived from a single testing session, were more reliable than differences derived from equal weighting for each age group (11-14 years, 15-34 years, 35-85+ years). This reliability advantage results in narrower confidence intervals around difference scores and smaller differences required for statistical significance.
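
    The problem the abstract describes follows directly from the classical difference-score reliability formula. For equally variable scores X and Y with reliabilities r_XX' and r_YY' and intercorrelation r_XY, one standard form is, in LaTeX notation,

        r_{DD'} = \frac{\tfrac{1}{2}\,(r_{XX'} + r_{YY'}) - r_{XY}}{1 - r_{XY}}

    so as the intercorrelation approaches the average of the two reliabilities, the reliability of the difference collapses toward zero, which is precisely the situation for the highly correlated Fluid and Crystallized scales.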

  18. Investigation of improving MEMS-type VOA reliability

    NASA Astrophysics Data System (ADS)

    Hong, Seok K.; Lee, Yeong G.; Park, Moo Y.

    2003-12-01

    MEMS technologies have been applied to a lot of areas, such as optical communications, gyroscopes and bio-medical components. In the optical communication field, MEMS technologies are essential, especially in multi-dimensional optical switches and Variable Optical Attenuators (VOAs). This paper describes the process for the development of MEMS-type VOAs with good optical performance and improved reliability. Generally, MEMS VOAs are fabricated by a silicon micro-machining process, precise fibre alignment and a sophisticated packaging process. Because they are composed of many structures with various materials, it is difficult to make the devices reliable. We have developed MEMS-type VOAs with many failure mode considerations (FMEA: Failure Mode Effect Analysis) in the initial design step, predicted critical failure factors and revised the design, and confirmed the reliability by preliminary testing. The predicted failure factors were moisture, the bonding strength of the wire between the MEMS chip and the TO-CAN, and instability of supplied signals. Statistical quality control tools (ANOVA, t-test and so on) were used to control these potential failure factors and produce optimum manufacturing conditions. To sum up, we have successfully developed reliable MEMS-type VOAs with good optical performance by controlling potential failure factors and using statistical quality control tools. As a result, the developed VOAs passed international reliability standards (Telcordia GR-1221-CORE).

  20. A Comparative Evaluation of Mixed Dentition Analysis on Reliability of Cone Beam Computed Tomography Image Compared to Plaster Model

    PubMed Central

    Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam

    2017-01-01

    Aims and Objective: The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-obtained images over plaster models for the assessment of mixed dentition analysis. Materials and Methods: Thirty CBCT-derived images and thirty plaster models were derived from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data and P < 0.05 was considered statistically significant. Results: Statistically significant results were obtained on data comparison between CBCT-derived images and plaster model; the mean for Moyer's analysis in the left and right lower arch for CBCT and plaster model was 21.2 mm, 21.1 mm and 22.5 mm, 22.5 mm, respectively. Conclusion: CBCT-derived images were less reliable as compared to data obtained directly from plaster model for mixed dentition analysis. PMID:28852639

  1. Computation of the Molenaar Sijtsma Statistic

    NASA Astrophysics Data System (ADS)

    Andries van der Ark, L.

    The Molenaar Sijtsma statistic is an estimate of the reliability of a test score. In some special cases, computation of the Molenaar Sijtsma statistic requires provisional measures. These provisional measures have not been fully described in the literature, and we show that they have not been implemented in the software. We describe the required provisional measures so as to allow the computation of the Molenaar Sijtsma statistic for all data sets.

  2. CRACK GROWTH ANALYSIS OF SOLID OXIDE FUEL CELL ELECTROLYTES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    S. Bandopadhyay; N. Nagabhushana

    2003-10-01

    Defects and flaws control the structural and functional properties of ceramics. In determining the reliability and lifetime of ceramic structures it is very important to quantify the crack growth behavior of the ceramics. In addition, because of the high variability of the strength and the relatively low toughness of ceramics, a statistical design approach is necessary. The statistical nature of the strength of ceramics is currently well recognized, and is usually accounted for by utilizing Weibull or similar statistical distributions. Design tools such as CARES using a combination of strength measurements, stress analysis, and statistics are available and reasonably well developed. These design codes also incorporate material data such as elastic constants as well as flaw distributions and time-dependent properties. The fast fracture reliability for ceramics is often different from their time-dependent reliability. Further confounding the design complexity, the time-dependent reliability varies with the environment/temperature/stress combination. Therefore, it becomes important to be able to accurately determine the behavior of ceramics under simulated application conditions to provide a better prediction of the lifetime and reliability for a given component. In the present study, yttria-stabilized zirconia (YSZ) of 9.6 mol% yttria composition was procured in the form of tubes of length 100 mm. The composition is of interest for tubular electrolytes for Solid Oxide Fuel Cells. Rings cut from the tubes were characterized for microstructure, phase stability, mechanical strength (Weibull modulus) and fracture mechanisms. The strength at the operating condition of SOFCs (1000 °C) decreased to 95 MPa, as compared to a room-temperature strength of 230 MPa. However, the Weibull modulus remained relatively unchanged. The slow crack growth (SCG) parameter, n = 17, evaluated at room temperature in air, was representative of well-studied brittle materials. Based on the results, further work was planned to evaluate strength degradation, modulus and failure in a more representative SOFC environment.

  3. Reliability analysis of structural ceramics subjected to biaxial flexure

    NASA Technical Reports Server (NTRS)

    Chao, Luen-Yuan; Shetty, Dinesh K.

    1991-01-01

    The reliability of alumina disks subjected to biaxial flexure is predicted on the basis of statistical fracture theory using a critical strain energy release rate fracture criterion. Results on a sintered silicon nitride are consistent with reliability predictions based on pore-initiated penny-shaped cracks with preferred orientation normal to the maximum principal stress. Assumptions with regard to flaw types and their orientations in each ceramic can be justified by fractography. It is shown that there are no universal guidelines for selecting fracture criteria or assuming flaw orientations in reliability analyses.

  4. Statistical Analysis Tools for Learning in Engineering Laboratories.

    ERIC Educational Resources Information Center

    Maher, Carolyn A.

    1990-01-01

    Described are engineering programs that have used automated data acquisition systems to implement data collection and analyze experiments. Applications include a biochemical engineering laboratory, heat transfer performance, engineering materials testing, mechanical system reliability, statistical control laboratory, thermo-fluid laboratory, and a…

  5. A framework for conducting mechanistic based reliability assessments of components operating in complex systems

    NASA Astrophysics Data System (ADS)

    Wallace, Jon Michael

    2003-10-01

    Reliability prediction of components operating in complex systems has historically been conducted in a statistically isolated manner. Current physics-based, i.e. mechanistic, component reliability approaches focus more on component-specific attributes and mathematical algorithms and not enough on the influence of the system. The result is that significant error can be introduced into the component reliability assessment process. The objective of this study is the development of a framework that infuses the needs and influence of the system into the process of conducting mechanistic-based component reliability assessments. The formulated framework consists of six primary steps. The first three steps, identification, decomposition, and synthesis, are primarily qualitative in nature and employ system reliability and safety engineering principles to construct an appropriate starting point for the component reliability assessment. The following two steps are the most unique. They involve a step to efficiently characterize and quantify the system-driven local parameter space and a subsequent step using this information to guide the reduction of the component parameter space. The local statistical space quantification step is accomplished using two proposed multivariate probability models: Multi-Response First Order Second Moment and Taylor-Based Inverse Transformation. Where existing joint probability models require preliminary distribution and correlation information of the responses, these models combine statistical information of the input parameters with an efficient sampling of the response analyses to produce the multi-response joint probability distribution. Parameter space reduction is accomplished using Approximate Canonical Correlation Analysis (ACCA) employed as a multi-response screening technique. The novelty of this approach is that each individual local parameter and even subsets of parameters representing entire contributing analyses can now be rank ordered with respect to their contribution to not just one response, but the entire vector of component responses simultaneously. The final step of the framework is the actual probabilistic assessment of the component. Although the same multivariate probability tools employed in the characterization step can be used for the component probability assessment, variations of this final step are given to allow for the utilization of existing probabilistic methods such as response surface Monte Carlo and Fast Probability Integration. The overall framework developed in this study is implemented to assess the finite-element based reliability prediction of a gas turbine airfoil involving several failure responses. Results of this implementation are compared to results generated using the conventional 'isolated' approach as well as a validation approach conducted through large sample Monte Carlo simulations. The framework resulted in a considerable improvement to the accuracy of the part reliability assessment and an improved understanding of the component failure behavior. Considerable statistical complexity in the form of joint non-normal behavior was found and accounted for using the framework. Future applications of the framework elements are discussed.
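
    The scalar baseline behind the Multi-Response First Order Second Moment model mentioned above is easy to sketch: linearize the response at the input means and combine the input variances through the gradient, assuming independent inputs. The limit-state function and numbers below are hypothetical.

        import numpy as np

        def fosm(g, mean, sd, h=1e-6):
            """First-order second-moment estimate of the mean and standard
            deviation of g(X) from input means and standard deviations."""
            mean = np.asarray(mean, dtype=float)
            grad = np.empty(mean.size)
            for j in range(mean.size):
                step = np.zeros(mean.size)
                step[j] = h
                grad[j] = (g(mean + step) - g(mean - step)) / (2 * h)
            sd_out = float(np.sqrt(((grad * np.asarray(sd, dtype=float)) ** 2).sum()))
            return g(mean), sd_out

        # hypothetical limit state: margin = strength - 1.2 * load
        g = lambda x: x[0] - 1.2 * x[1]
        print(fosm(g, mean=[400.0, 250.0], sd=[30.0, 20.0]))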

  6. Change in blood coagulation indices as a function of the incubation period of plasma in a constant magnetic field. [considering heparin tolerance and recalcification

    NASA Technical Reports Server (NTRS)

    Yepishina, S. G.

    1974-01-01

    The influence of a constant magnetic field (CMF) with a strength of 250 and 2500 oersteds on the recalcification reaction and the tolerance of plasma to heparin was studied as a function of the exposure time of the plasma to the CMF. The maximum and reliable change in the activation of the coagulatory system of the blood was observed after a 20-hour incubation of the plasma in a CMF. As the exposure time increased, the recalcification reaction changed insignificantly; the difference between the arithmetic means of the experiment and control values was not statistically reliable. The tolerance of the plasma to heparin as a function of the exposure time of the plasma to the CMF was considerably modified, and was statistically reliable.

  7. A psychometric evaluation of the Rorschach comprehensive system's perceptual thinking index.

    PubMed

    Dao, Tam K; Prevatt, Frances

    2006-04-01

    In this study, we investigated evidence for reliability and validity of the Perceptual Thinking Index (PTI; Exner, 2000a, 2000b) among an adult inpatient population. We conducted reliability and validity analyses on 107 patients who met the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text revision; American Psychiatric Association, 2000) criteria for a schizophrenia-spectrum disorder (SSD) or mood disorder with no psychotic features (MD). Results provided support for interrater reliability as well as internal consistency of the PTI. Furthermore, the PTI was an effective index in differentiating SSD patients from patients diagnosed with an MD. Finally, the PTI demonstrated adequate diagnostic statistics that can be useful in the classification of patients diagnosed with SSD and MD. We discuss methodological issues, implications for assessment practice, and directions for future research.

  8. Assessment of NDE Reliability Data

    NASA Technical Reports Server (NTRS)

    Yee, B. G. W.; Chang, F. H.; Couchman, J. C.; Lemon, G. H.; Packman, P. F.

    1976-01-01

    Twenty sets of relevant Nondestructive Evaluation (NDE) reliability data have been identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations has been formulated. A model to grade the quality and validity of the data sets has been developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, have been formulated for each NDE method. A comprehensive computer program has been written to calculate the probability of flaw detection at several confidence levels by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. Probability of detection curves at 95 and 50 percent confidence levels have been plotted for individual sets of relevant data as well as for several sets of merged data with common sets of NDE parameters.
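
    The binomial calculation named in the abstract is easy to reproduce with a Clopper-Pearson one-sided bound; with 29 detections in 29 trials it recovers the familiar 90/95 demonstration point. A minimal sketch, not the program's pooling logic.

        from scipy.stats import beta

        def pod_lower_bound(detections, trials, confidence=0.95):
            """One-sided lower confidence bound on POD from hit/miss data
            (Clopper-Pearson, via the beta distribution)."""
            if detections == 0:
                return 0.0
            return beta.ppf(1 - confidence, detections, trials - detections + 1)

        print(pod_lower_bound(29, 29))  # ~0.90 at 95% confidence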

  9. Leveraging Code Comments to Improve Software Reliability

    ERIC Educational Resources Information Center

    Tan, Lin

    2009-01-01

    Commenting source code has long been a common practice in software development. This thesis, consisting of three pieces of work, made novel use of the code comments written in natural language to improve software reliability. Our solution combines Natural Language Processing (NLP), Machine Learning, Statistics, and Program Analysis techniques to…

  10. Monte Carlo Approach for Reliability Estimations in Generalizability Studies.

    ERIC Educational Resources Information Center

    Dimitrov, Dimiter M.

    A Monte Carlo approach is proposed, using the Statistical Analysis System (SAS) programming language, for estimating reliability coefficients in generalizability theory studies. Test scores are generated by a probabilistic model that considers the probability for a person with a given ability score to answer an item with a given difficulty…

  11. Interrater and intrarater reliability of FDI criteria applied to photographs of posterior tooth-colored restorations.

    PubMed

    Kim, Dohyun; Ahn, So-Yeon; Kim, Junyoung; Park, Sung-Ho

    2017-07-01

    Since 2007, the FDI World Dental Federation (FDI) criteria have been used for the clinical evaluation of dental restorations. However, the reliability of the FDI criteria has not been sufficiently addressed. The purpose of this study was to assess and compare the interrater and intrarater reliability of the FDI criteria by evaluating posterior tooth-colored restorations photographically. A total of 160 clinical photographs of posterior tooth-colored restorations were evaluated independently by 5 raters with 9 of the FDI criteria suitable for photographic evaluation. The raters recorded the score of each restoration by using 5 grades, and the score was dichotomized into the clinical evaluation scores. After 1 month, 2 of the raters reevaluated the same set of 160 photographs in random order. To estimate the interrater reliability among the 5 raters, the proportion of agreement was calculated, and the Fleiss multirater kappa statistic was used. For the intrarater reliability, the proportion of agreement was calculated, and the Cohen standard kappa statistic was used for each of the 2 raters. The interrater proportion of agreement was 0.41 to 0.57, and the kappa value was 0.09 to 0.39. Overall, the intrarater reliability was higher than the interrater reliability, and rater 1 demonstrated higher intrarater reliability than rater 2. The proportion of agreement and kappa values increased when the 5 scores were dichotomized. The reliability was relatively lower for the esthetic properties compared with the functional or biological properties. Within the limitations of this study, the FDI criteria presented slight to fair interrater reliability and fair to excellent intrarater reliability in the photographic evaluation of posterior tooth-colored restorations. The reliability was improved by simplifying the evaluation scores. Copyright © 2016 Editorial Council for the Journal of Prosthetic Dentistry. Published by Elsevier Inc. All rights reserved.

  12. A Statistical Testing Approach for Quantifying Software Reliability; Application to an Example System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chu, Tsong-Lun; Varuttamaseni, Athi; Baek, Joo-Seok

    The U.S. Nuclear Regulatory Commission (NRC) encourages the use of probabilistic risk assessment (PRA) technology in all regulatory matters, to the extent supported by the state-of-the-art in PRA methods and data. Although much has been accomplished in the area of risk-informed regulation, risk assessment for digital systems has not been fully developed. The NRC established a plan for research on digital systems to identify and develop methods, analytical tools, and regulatory guidance for (1) including models of digital systems in the PRAs of nuclear power plants (NPPs), and (2) incorporating digital systems in the NRC's risk-informed licensing and oversight activities. Under NRC's sponsorship, Brookhaven National Laboratory (BNL) explored approaches for addressing the failures of digital instrumentation and control (I and C) systems in the current NPP PRA framework. Specific areas investigated included PRA modeling of digital hardware, development of a philosophical basis for defining software failure, and identification of desirable attributes of quantitative software reliability methods. Based on the earlier research, statistical testing is considered a promising method for quantifying software reliability. This paper describes a statistical software testing approach for quantifying software reliability and applies it to the loop-operating control system (LOCS) of an experimental loop of the Advanced Test Reactor (ATR) at Idaho National Laboratory (INL).

  13. Comparison of the Chinese ischemic stroke subclassification and Trial of Org 10172 in acute stroke treatment systems in minor stroke.

    PubMed

    Tan, Sha; Zhang, Lei; Chen, Xiaoyu; Wang, Yanqiang; Lin, Yinyao; Cai, Wei; Shan, Yilong; Qiu, Wei; Hu, Xueqiang; Lu, Zhengqi

    2016-09-06

    The underlying causes of minor stroke are difficult to assess. Here, we evaluate the reliability of the Chinese Ischemic Stroke Subclassification (CISS) system in patients with minor stroke, and compare it to the Trial of Org 10172 in Acute Stroke Treatment (TOAST) system. A total of 320 patients with minor stroke were retrospectively registered and categorized into different subgroups of the CISS and TOAST by two neurologists. Inter- and intra-rater agreement with the two systems were assessed with kappa statistics. The percentage of undetermined etiology (UE) cases in the CISS system was 77.3 % less than that in the TOAST system, which was statistically significant (P < 0.001). The percentage of large artery atherosclerosis (LAA) in the CISS system was 79.7 % more than that in the TOAST system, which was also statistically significant (P < 0.001). The kappa values for inter-examiner agreement were 0.898 (P = 0.031) and 0.732 (P = 0.022) for the CISS and TOAST systems, respectively. The intra-observer reliability indexes were moderate (0.569 for neurologist A, and 0.487 for neurologist B). The CISS and TOAST systems are both reliable in classifying patients with minor stroke. CISS classified more patients into known etiologic categories without sacrificing reliability.

  14. Abdominal pain endpoints currently recommended by the FDA and EMA for adult patients with irritable bowel syndrome may not be reliable in children.

    PubMed

    Saps, M; Lavigne, J V

    2015-06-01

    The Food and Drug Administration (FDA) recommended that a ≥30% decrease on patient-reported outcome measures of pain be considered clinically significant in clinical trials for adults with irritable bowel syndrome. This percent-change approach may not be appropriate for children. We compared three alternate approaches to determining clinically significant reductions in pain among children. Eighty children with functional abdominal pain participated in a study of the efficacy of amitriptyline. Endpoints included patient-reported estimates of feeling better and a pain Visual Analog Scale (VAS). The minimum clinically important difference in pain report was calculated as (i) the mean change in VAS score for children reporting being 'better'; (ii) percent changes in pain (≥30% and ≥50%) on the VAS; and (iii) statistically reliable changes on the VAS for 68% and 95% confidence intervals. There was poor agreement between the three approaches. 43.6% of the children who met the FDA ≥30% criterion for clinically significant change did not achieve a reliable level of improvement (95% confidence interval). Children's self-reported ratings of being better may not be statistically reliable. A combined approach in which children must report improvement as better and achieve a statistically significant change may be more appropriate for outcomes in clinical trials. © 2015 John Wiley & Sons Ltd.

  15. Psychometrics Matter in Health Behavior: A Long-term Reliability Generalization Study.

    PubMed

    Pickett, Andrew C; Valdez, Danny; Barry, Adam E

    2017-09-01

    Despite numerous calls for increased understanding and reporting of reliability estimates, social science research, including the field of health behavior, has been slow to respond and adopt such practices. Therefore, we offer a brief overview of reliability and common reporting errors; we then perform analyses to examine and demonstrate the variability of reliability estimates by sample and over time. Using meta-analytic reliability generalization, we examined the variability of coefficient alpha scores for a well-designed, consistent, nationwide health study, covering a span of nearly 40 years. For each year and sample, reliability varied. Furthermore, reliability was predicted by a sample characteristic that differed among age groups within each administration. We demonstrated that reliability is influenced by the methods and individuals from which a given sample is drawn. Our work echoes previous calls that psychometric properties, particularly reliability of scores, are important and must be considered and reported before drawing statistical conclusions.

  16. FabricS: A user-friendly, complete and robust software for particle shape-fabric analysis

    NASA Astrophysics Data System (ADS)

    Moreno Chávez, G.; Castillo Rivera, F.; Sarocchi, D.; Borselli, L.; Rodríguez-Sedano, L. A.

    2018-06-01

    Shape-fabric is a textural parameter related to the spatial arrangement of elongated particles in geological samples. Its usefulness spans a range from sedimentary petrology to igneous and metamorphic petrology. Independently of the process being studied, when a material flows, the elongated particles are oriented with the major axis in the direction of flow. In sedimentary petrology this information has been used for studies of paleo-flow direction of turbidites, the origin of quartz sediments, and locating ignimbrite vents, among others. In addition to flow direction and its polarity, the method enables flow rheology to be inferred. The use of shape-fabric has been limited due to the difficulties of automatically measuring particles and analyzing them with reliable circular statistics programs. This has dampened interest in the method for a long time. Shape-fabric measurement has increased in popularity since the 1980s thanks to the development of new image analysis techniques and circular statistics software. However, the programs currently available are unreliable, old and are incompatible with newer operating systems, or require programming skills. The goal of our work is to develop a user-friendly program, in the MATLAB environment, with a graphical user interface, that can process images and includes editing functions, and thresholds (elongation and size) for selecting a particle population and analyzing it with reliable circular statistics algorithms. Moreover, the method also has to produce rose diagrams, orientation vectors, and a complete series of statistical parameters. All these requirements are met by our new software. In this paper, we briefly explain the methodology from collection of oriented samples in the field to the minimum number of particles needed to obtain reliable fabric data. We obtained the data using specific statistical tests and taking into account the degree of iso-orientation of the samples and the required degree of reliability. The program has been verified by means of several simulations performed using appropriately designed features and by analyzing real samples.

  17. Exponential order statistic models of software reliability growth

    NASA Technical Reports Server (NTRS)

    Miller, D. R.

    1985-01-01

    Failure times of a software reliability growth process are modeled as order statistics of independent, nonidentically distributed exponential random variables. The Jelinski-Moranda, Goel-Okumoto, Littlewood, Musa-Okumoto Logarithmic, and Power Law models are all special cases of Exponential Order Statistic Models, but the class contains many additional examples. Various characterizations, properties and examples of this class of models are developed and presented.
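
    As one concrete member of the class, the Jelinski-Moranda special case takes the hazard after i-1 fixes to be proportional to the number of remaining faults. The sketch below simulates failure times from that model; the fault count and per-fault rate are hypothetical.

        import numpy as np

        def simulate_jelinski_moranda(n_faults, phi, rng=None):
            """Successive failure times when the hazard after i-1 fixes is
            phi * (n_faults - i + 1): exponential gaps, cumulatively summed."""
            rng = rng or np.random.default_rng()
            remaining = np.arange(n_faults, 0, -1)
            gaps = rng.exponential(1.0 / (phi * remaining))
            return np.cumsum(gaps)

        print(simulate_jelinski_moranda(5, phi=0.01, rng=np.random.default_rng(1)))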

  18. Network-based statistical comparison of citation topology of bibliographic databases

    PubMed Central

    Šubelj, Lovro; Fiala, Dalibor; Bajec, Marko

    2014-01-01

    Modern bibliographic databases provide the basis for scientific research and its evaluation. While their content and structure differ substantially, there exist only informal notions on their reliability. Here we compare the topological consistency of citation networks extracted from six popular bibliographic databases including Web of Science, CiteSeer and arXiv.org. The networks are assessed through a rich set of local and global graph statistics. We first reveal statistically significant inconsistencies between some of the databases with respect to individual statistics. For example, the introduced field bow-tie decomposition of DBLP Computer Science Bibliography substantially differs from the rest due to the coverage of the database, while the citation information within arXiv.org is the most exhaustive. Finally, we compare the databases over multiple graph statistics using the critical difference diagram. The citation topology of DBLP Computer Science Bibliography is the least consistent with the rest, while, not surprisingly, Web of Science is significantly more reliable from the perspective of consistency. This work can serve either as a reference for scholars in bibliometrics and scientometrics or a scientific evaluation guideline for governments and research agencies. PMID:25263231

  19. Online incidental statistical learning of audiovisual word sequences in adults: a registered report.

    PubMed

    Kuppuraj, Sengottuvel; Duta, Mihaela; Thompson, Paul; Bishop, Dorothy

    2018-02-01

    Statistical learning has been proposed as a key mechanism in language learning. Our main goal was to examine whether adults are capable of simultaneously extracting statistical dependencies in a task where stimuli include a range of structures amenable to statistical learning within a single paradigm. We devised an online statistical learning task using real word auditory-picture sequences that vary in two dimensions: (i) predictability and (ii) adjacency of dependent elements. This task was followed by an offline recall task to probe learning of each sequence type. We registered three hypotheses with specific predictions. First, adults would extract regular patterns from continuous stream (effect of grammaticality). Second, within grammatical conditions, they would show differential speeding up for each condition as a factor of statistical complexity of the condition and exposure. Third, our novel approach to measure online statistical learning would be reliable in showing individual differences in statistical learning ability. Further, we explored the relation between statistical learning and a measure of verbal short-term memory (STM). Forty-two participants were tested and retested after an interval of at least 3 days on our novel statistical learning task. We analysed the reaction time data using a novel regression discontinuity approach. Consistent with prediction, participants showed a grammaticality effect, agreeing with the predicted order of difficulty for learning different statistical structures. Furthermore, a learning index from the task showed acceptable test-retest reliability (r = 0.67). However, STM did not correlate with statistical learning. We discuss the findings noting the benefits of online measures in tracking the learning process.

  20. Online incidental statistical learning of audiovisual word sequences in adults: a registered report

    PubMed Central

    Duta, Mihaela; Thompson, Paul

    2018-01-01

    Statistical learning has been proposed as a key mechanism in language learning. Our main goal was to examine whether adults are capable of simultaneously extracting statistical dependencies in a task where stimuli include a range of structures amenable to statistical learning within a single paradigm. We devised an online statistical learning task using real word auditory–picture sequences that vary in two dimensions: (i) predictability and (ii) adjacency of dependent elements. This task was followed by an offline recall task to probe learning of each sequence type. We registered three hypotheses with specific predictions. First, adults would extract regular patterns from continuous stream (effect of grammaticality). Second, within grammatical conditions, they would show differential speeding up for each condition as a factor of statistical complexity of the condition and exposure. Third, our novel approach to measure online statistical learning would be reliable in showing individual differences in statistical learning ability. Further, we explored the relation between statistical learning and a measure of verbal short-term memory (STM). Forty-two participants were tested and retested after an interval of at least 3 days on our novel statistical learning task. We analysed the reaction time data using a novel regression discontinuity approach. Consistent with prediction, participants showed a grammaticality effect, agreeing with the predicted order of difficulty for learning different statistical structures. Furthermore, a learning index from the task showed acceptable test–retest reliability (r = 0.67). However, STM did not correlate with statistical learning. We discuss the findings noting the benefits of online measures in tracking the learning process. PMID:29515876

  1. Reliability, reference values and predictor variables of the ulnar sensory nerve in disease free adults.

    PubMed

    Ruediger, T M; Allison, S C; Moore, J M; Wainner, R S

    2014-09-01

    The purposes of this descriptive and exploratory study were to examine electrophysiological measures of ulnar sensory nerve function in disease free adults to determine reliability, determine reference values computed with appropriate statistical methods, and examine predictive ability of anthropometric variables. Antidromic sensory nerve conduction studies of the ulnar nerve using surface electrodes were performed on 100 volunteers. Reference values were computed from optimally transformed data. Reliability was computed from 30 subjects. Multiple linear regression models were constructed from four predictor variables. Reliability was greater than 0.85 for all paired measures. Responses were elicited in all subjects; reference values for sensory nerve action potential (SNAP) amplitude from above elbow stimulation are 3.3 μV and decrement across-elbow less than 46%. No single predictor variable accounted for more than 15% of the variance in the response. Electrophysiologic measures of the ulnar sensory nerve are reliable. Absent SNAP responses are inconsistent with disease free individuals. Reference values recommended in this report are based on appropriate transformations of non-normally distributed data. No strong statistical model of prediction could be derived from the limited set of predictor variables. Reliability analyses combined with relatively low level of measurement error suggest that ulnar sensory reference values may be used with confidence. Copyright © 2014 Elsevier Masson SAS. All rights reserved.

  2. Assessment of Reliable Change Using 95% Credible Intervals for the Differences in Proportions: A Statistical Analysis for Case-Study Methodology.

    PubMed

    Unicomb, Rachael; Colyvas, Kim; Harrison, Elisabeth; Hewat, Sally

    2015-06-01

    Case-study methodology is often used to study change in the field of speech-language pathology, but it can be criticized for not being statistically robust. Yet with the heterogeneous nature of many communication disorders, case studies allow clinicians and researchers to closely observe and report on change. Such information is valuable and can further inform large-scale experimental designs. In this research note, a statistical analysis for case-study data is outlined that employs a modification to the Reliable Change Index (Jacobson & Truax, 1991). The relationship between reliable change and clinical significance is discussed. Example data are used to guide the reader through the use and application of this analysis. A method of analysis is detailed that is suitable for assessing change in measures with binary categorical outcomes. The analysis is illustrated using data from one individual, measured before and after treatment for stuttering. The application of this approach to assessing change in categorical, binary data has potential application in speech-language pathology. It enables clinicians and researchers to analyze results from case studies for their statistical and clinical significance. This new method addresses a gap in the research design literature, that is, the lack of analysis methods for noncontinuous data (such as counts, rates, and proportions of events) that may be used in case-study designs.
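
    A minimal sketch of the core idea, a 95% credible interval for a difference in proportions, is shown below; the counts are invented and the uniform Beta priors are an assumption, not the authors' exact model.

    ```python
    # Sketch: 95% credible interval for the difference in two proportions
    # (e.g., proportion of stuttered syllables before vs. after treatment),
    # using independent Beta posteriors with uniform priors. Counts are invented.
    import numpy as np

    rng = np.random.default_rng(1)
    events_pre, n_pre = 42, 300     # e.g., stuttered syllables before treatment
    events_post, n_post = 12, 300   # after treatment

    p_pre = rng.beta(events_pre + 1, n_pre - events_pre + 1, size=100_000)
    p_post = rng.beta(events_post + 1, n_post - events_post + 1, size=100_000)
    diff = p_pre - p_post

    lo, hi = np.percentile(diff, [2.5, 97.5])
    print(f"95% credible interval for the reduction: [{lo:.3f}, {hi:.3f}]")
    # If the interval excludes 0, change beyond chance is indicated.
    ```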

  3. Recent Reliability Reporting Practices in "Psychological Assessment": Recognizing the People behind the Data

    ERIC Educational Resources Information Center

    Green, Carlton E.; Chen, Cynthia E.; Helms, Janet E.; Henze, Kevin T.

    2011-01-01

    Helms, Henze, Sass, and Mifsud (2006) defined good practices for internal consistency reporting, interpretation, and analysis consistent with an alpha-as-data perspective. Their viewpoint (a) expands on previous arguments that reliability coefficients are group-level summary statistics of samples' responses rather than stable properties of scales…

  4. Impact of Measurement Error on Statistical Power: Review of an Old Paradox.

    ERIC Educational Resources Information Center

    Williams, Richard H.; And Others

    1995-01-01

    The paradox that a Student t-test based on pretest-posttest differences can attain its greatest power when the difference score reliability is zero was explained by demonstrating that power is not a mathematical function of reliability unless either true score variance or error score variance is constant. (SLD)
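    The paradox can be made concrete with a small simulation: below, every participant receives the same true gain, so difference-score reliability is zero, yet the paired t-test is highly powered. All numbers are illustrative.

    ```python
    # Simulation of the paradox: with a constant treatment gain, difference-score
    # reliability is 0 (no true-score variance in the gains), yet the paired
    # t-test can be very powerful.
    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(2)
    n, gain, err_sd = 30, 5.0, 2.0
    rejections = 0
    for _ in range(2000):
        true_score = rng.normal(50, 10, n)        # stable individual differences
        pre = true_score + rng.normal(0, err_sd, n)
        post = true_score + gain + rng.normal(0, err_sd, n)  # identical true gain
        if ttest_rel(post, pre).pvalue < 0.05:
            rejections += 1
    print(f"empirical power ≈ {rejections / 2000:.2f}")
    # The true gain is constant, so difference scores have zero reliability,
    # but power approaches 1 because error variance, not reliability, drives it.
    ```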

  5. Enhanced Component Performance Study: Turbine-Driven Pumps 1998–2014

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schroeder, John Alton

    2015-11-01

    This report presents an enhanced performance evaluation of turbine-driven pumps (TDPs) at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience failure reports from fiscal year 1998 through 2014 for component reliability as reported in the Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES). The TDP failure modes considered are failure to start (FTS), failure to run for one hour or less (FTR≤1H), failure to run more than one hour (FTR>1H), and, for normally running systems, FTS and failure to run (FTR). The component reliability estimates and the reliability data are trended for the most recent 10-year period, while yearly reliability estimates are provided for the entire active period. Statistically significant increasing trends were identified for TDP unavailability, for frequency of start demands for standby TDPs, and for run hours in the first hour after start. Statistically significant decreasing trends were identified for start demands for normally running TDPs and for run hours per reactor critical year for normally running TDPs.

  6. Reliability of horizontal and vertical tube shift techniques in the localisation of supernumerary teeth.

    PubMed

    Mallineni, S K; Anthonappa, R P; King, N M

    2016-12-01

    To assess the reliability of the vertical tube shift technique (VTST) and the horizontal tube shift technique (HTST) for the localisation of unerupted supernumerary teeth (ST) in the anterior region of the maxilla. A convenience sample of 83 patients who attended a major teaching hospital because of unerupted ST was selected. Only non-syndromic patients with ST who had complete clinical, radiographic, and surgical records were included in the study. Ten examiners independently rated the paired set of radiographs for each technique. The chi-square test, paired t test and kappa statistics were employed to assess intra- and inter-examiner reliability. Paired sets of 1660 radiographs (830 pairs for each technique) were available for the analysis. The overall sensitivity for VTST and HTST was 80.6 and 72.1% respectively, with slight inter-examiner and good intra-examiner reliability. Statistically significant differences were evident between the two localisation techniques (p < 0.05). Localisation of unerupted ST using VTST was more successful than HTST in the anterior region of the maxilla.
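
    For readers unfamiliar with the agreement statistic used here, the sketch below computes Cohen's kappa for two hypothetical rating sessions; the labels are invented and unrelated to the study's radiographs.

    ```python
    # Sketch of intra-examiner agreement with Cohen's kappa, using invented
    # localisation calls ("labial"/"palatal") from two rating sessions.
    from sklearn.metrics import cohen_kappa_score

    session1 = ["labial", "palatal", "palatal", "labial", "palatal", "labial"]
    session2 = ["labial", "palatal", "labial", "labial", "palatal", "labial"]

    kappa = cohen_kappa_score(session1, session2)
    print(f"Cohen's kappa = {kappa:.2f}")  # 1.0 = perfect, 0 = chance agreement
    ```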

  7. Interconnect fatigue design for terrestrial photovoltaic modules

    NASA Technical Reports Server (NTRS)

    Mon, G. R.; Moore, D. M.; Ross, R. G., Jr.

    1982-01-01

    The results of a comprehensive investigation of interconnect fatigue that has led to the definition of useful reliability-design and life-prediction algorithms are presented. Experimental data indicate that the classical strain-cycle (fatigue) curve for the interconnect material is a good model of mean interconnect fatigue performance, but it fails to account for the broad statistical scatter, which is critical to reliability prediction. To address this shortcoming, the classical fatigue curve is combined with experimental cumulative interconnect failure rate data to yield statistical fatigue curves (having failure probability as a parameter), which enable (1) the prediction of cumulative interconnect failures during the design life of an array field, and (2) the unambiguous--i.e., quantitative--interpretation of data from field-service qualification (accelerated thermal cycling) tests. Optimal interconnect cost-reliability design algorithms are derived based on minimizing the cost of energy over the design life of the array field.

  8. Interconnect fatigue design for terrestrial photovoltaic modules

    NASA Astrophysics Data System (ADS)

    Mon, G. R.; Moore, D. M.; Ross, R. G., Jr.

    1982-03-01

    The results of a comprehensive investigation of interconnect fatigue that has led to the definition of useful reliability-design and life-prediction algorithms are presented. Experimental data indicate that the classical strain-cycle (fatigue) curve for the interconnect material is a good model of mean interconnect fatigue performance, but it fails to account for the broad statistical scatter, which is critical to reliability prediction. To address this shortcoming, the classical fatigue curve is combined with experimental cumulative interconnect failure rate data to yield statistical fatigue curves (having failure probability as a parameter), which enable (1) the prediction of cumulative interconnect failures during the design life of an array field, and (2) the unambiguous--i.e., quantitative--interpretation of data from field-service qualification (accelerated thermal cycling) tests. Optimal interconnect cost-reliability design algorithms are derived based on minimizing the cost of energy over the design life of the array field.

  9. Are photographic records reliable for orthodontic screening?

    PubMed

    Mandall, N A

    2002-06-01

    The aim of the study was to evaluate the reliability of a panel of orthodontists in accepting new patient referrals based on clinical photographs. Eight orthodontists from Greater Manchester, Lancashire, Chester, and Derbyshire observed clinical photographs of 40 consecutive new patients attending the orthodontic department, Hope Hospital, Salford. They recorded whether or not they would accept the patient as a new patient referral in their department. Each consultant was asked to take into account factors such as oral hygiene, dental development, and severity of the malocclusion. The kappa statistic for multiple-rater agreement and the kappa statistic for intra-observer reliability were calculated. Inter-observer panel agreement for accepting new patient referrals based on photographic information was low (multiple-rater kappa score 0.37). Intra-examiner agreement was better (kappa range 0.34-0.90). Clinician agreement for screening and accepting orthodontic referrals based on clinical photographs is comparable to that previously reported for other clinical decision making.

  10. Enabling High-Energy, High-Voltage Lithium-Ion Cells: Standardization of Coin-Cell Assembly, Electrochemical Testing, and Evaluation of Full Cells

    DOE PAGES

    Long, Brandon R.; Rinaldo, Steven G.; Gallagher, Kevin G.; ...

    2016-11-09

    Coin-cells are often the test format of choice for laboratories engaged in battery research and development, as they provide a convenient platform for rapid testing of new materials on a small scale. However, obtaining reliable, reproducible data via the coin-cell format is inherently difficult, particularly in the full-cell configuration. In addition, statistical evaluation to prove the consistency and reliability of such data is often neglected. Herein we report on several studies aimed at formalizing physical process parameters and coin-cell construction related to full cells. Statistical analysis and performance benchmarking approaches are advocated as a means to more confidently track changes in cell performance. Finally, we show that trends in the electrochemical data obtained from coin-cells can be reliable and informative when standardized approaches are implemented in a consistent manner.

  11. Reliability studies of diagnostic methods in Indian traditional Ayurveda medicine: An overview

    PubMed Central

    Kurande, Vrinda Hitendra; Waagepetersen, Rasmus; Toft, Egon; Prasad, Ramjee

    2013-01-01

    Recently, a need to develop supportive new scientific evidence for contemporary Ayurveda has emerged. One of the research objectives is an assessment of the reliability of diagnoses and treatment. Reliability is a quantitative measure of consistency. It is a crucial issue in classification (such as prakriti classification), method development (pulse diagnosis), quality assurance for diagnosis and treatment, and in the conduct of clinical studies. Several reliability studies have been conducted in Western medicine. The investigation of the reliability of traditional Chinese, Japanese and Sasang medicine diagnoses is in the formative stage. However, reliability studies in Ayurveda are in the preliminary stage. In this paper, examples are provided to illustrate relevant concepts of reliability studies of diagnostic methods and their implications for practice, education, and training. An introduction to reliability estimates and different study designs and statistical analyses is given for future studies in Ayurveda. PMID:23930037

  12. Comparison between the Laser-Badal and Vernier Optometers.

    DTIC Science & Technology

    1988-09-01

    naval aviators (SNAs). We also measured dark vergence in the same sample of SNAs. THE FINDINGS: There was no statistically significant difference found...relatively inexperienced operator. 7. The difference between mean scores on the vernier and laser-Badal optometers was statistically significant...thus indicating that test results were reliable within instruments. [Table 1, Test and Retest Statistics, reports mean, SD, n, and t-value for each measure, including dark vergence.]

  13. Properties of different selection signature statistics and a new strategy for combining them.

    PubMed

    Ma, Y; Ding, X; Qanbari, S; Weigend, S; Zhang, Q; Simianer, H

    2015-11-01

    Identifying signatures of recent or ongoing selection is of high relevance in livestock population genomics. From a statistical perspective, determining a proper testing procedure and combining various test statistics is challenging. On the basis of extensive simulations in this study, we discuss the statistical properties of eight different established selection signature statistics. In the considered scenario, we show that a reasonable power to detect selection signatures is achieved with high marker density (>1 SNP/kb) as obtained from sequencing, while rather small sample sizes (~15 diploid individuals) appear to be sufficient. Most selection signature statistics, such as the composite likelihood ratio and cross-population extended haplotype homozygosity, have the highest power when fixation of the selected allele is reached, while the integrated haplotype score has the highest power when selection is ongoing. We suggest a novel strategy, called de-correlated composite of multiple signals (DCMS), to combine different statistics for detecting selection signatures while accounting for the correlation between the different selection signature statistics. When examined with simulated data, DCMS consistently has higher power than most of the single statistics and shows reliable positional resolution. We illustrate the new statistic on the established selective sweep around the lactase gene in human HapMap data, providing further evidence of the reliability of this new statistic. Then, we apply it to scan for selection signatures in two chicken samples with diverse skin colors. Our analysis suggests that a set of well-known genes, such as BCO2, MC1R, ASIP and TYR, were involved in the divergent selection for this trait.
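
    The decorrelation idea behind DCMS can be sketched as follows; this is a simplified illustration of down-weighting correlated statistics, not the paper's exact formula, and all data are simulated.

    ```python
    # Simplified, hypothetical sketch in the spirit of DCMS: combine several
    # standardized selection statistics per locus while down-weighting
    # statistics that are highly correlated with the others.
    import numpy as np

    rng = np.random.default_rng(3)
    stats = rng.normal(size=(5000, 4))                    # 5000 loci x 4 statistics
    stats[:, 1] = 0.9 * stats[:, 0] + 0.1 * stats[:, 1]   # make two correlated

    z = (stats - stats.mean(axis=0)) / stats.std(axis=0)  # standardize each statistic
    corr = np.corrcoef(z, rowvar=False)
    weights = 1.0 / np.abs(corr).sum(axis=0)              # correlated stats weigh less
    composite = z @ weights
    candidates = np.argsort(composite)[-10:]              # strongest composite signals
    print(candidates)
    ```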

  14. Methodological difficulties of conducting agroecological studies from a statistical perspective

    USDA-ARS?s Scientific Manuscript database

    Statistical methods for analysing agroecological data might not be able to help agroecologists to solve all of the current problems concerning crop and animal husbandry, but such methods could well help agroecologists to assess, tackle, and resolve several agroecological issues in a more reliable an...

  15. Reliability of intestinal temperature using an ingestible telemetry pill system during exercise in a hot environment.

    PubMed

    Ruddock, Alan D; Tew, Garry A; Purvis, Alison J

    2014-03-01

    Ingestible telemetry pill systems are increasingly used to assess intestinal temperature during exercise in hot environments. The purpose of this investigation was to assess the interday reliability of intestinal temperature during an exercise-heat challenge. Intestinal temperature was recorded as 12 physically active men (25 ± 4 years, stature 181.7 ± 7.0 cm, body mass 81.1 ± 10.6 kg) performed two 60-minute bouts of recumbent cycling (50% of peak aerobic power [watts]) in an environmental chamber set at 35 °C and 50% relative humidity, 3-10 days apart. A range of statistics was used to calculate the reliability, including a paired t-test, 95% limits of agreement (LOA), coefficient of variation (CV), standard error of measurement (SEM), Pearson's correlation coefficient (r), intraclass correlation coefficient (ICC), and Cohen's d. Statistical significance was set at p ≤ 0.05. The method indicated good overall reliability (LOA = ±0.61 °C, CV = 0.58%, SEM = 0.12 °C, Cohen's d = 0.12, r = 0.84, ICC = 0.84). Analysis revealed a statistically significant (p = 0.02) mean systematic bias of -0.07 ± 0.31 °C, and inspection of the Bland-Altman plot suggested the presence of heteroscedasticity. Further analysis revealed the minimum "likely" change in intestinal temperature to be 0.34 °C. Although the method demonstrates good reliability, researchers should be aware of heteroscedasticity. Changes in intestinal temperature >0.34 °C as a result of exercise or an intervention in a hot environment are likely real changes, less influenced by the error associated with the method.
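
    The battery of reliability statistics reported here can be reproduced on toy data; the sketch below computes the bias, limits of agreement, SEM, CV and r from invented day-to-day temperatures.

    ```python
    # Sketch of common reliability statistics (bias, LOA, SEM, CV, r),
    # computed from invented day-1/day-2 intestinal temperatures (°C).
    import numpy as np

    day1 = np.array([38.1, 38.4, 38.0, 38.6, 38.3, 38.2, 38.5, 38.1])
    day2 = np.array([38.2, 38.3, 38.1, 38.5, 38.2, 38.4, 38.4, 38.0])

    diff = day2 - day1
    bias = diff.mean()
    loa = 1.96 * diff.std(ddof=1)                 # 95% limits of agreement
    r = np.corrcoef(day1, day2)[0, 1]
    pooled = np.concatenate([day1, day2])
    sem = pooled.std(ddof=1) * np.sqrt(1 - r)     # standard error of measurement
    cv = 100 * (diff.std(ddof=1) / np.sqrt(2)) / pooled.mean()  # within-subject CV

    print(f"bias {bias:+.2f} °C, LOA ±{loa:.2f} °C, "
          f"SEM {sem:.2f} °C, CV {cv:.2f}%, r {r:.2f}")
    # A "minimum likely change" can be taken as 1.96 * sqrt(2) * SEM.
    ```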

  16. Analysis Testing of Sociocultural Factors Influence on Human Reliability within Sociotechnical Systems: The Algerian Oil Companies.

    PubMed

    Laidoune, Abdelbaki; Rahal Gharbi, Med El Hadi

    2016-09-01

    The influence of sociocultural factors on human reliability within open sociotechnical systems is highlighted. The design of such systems is enhanced by experience feedback. The study focused on a survey based on the observation of working cases and on the processing of incident/accident statistics, with semistructured interviews in the qualitative part. To consolidate the study approach, we used a questionnaire schedule for standardized statistical measurements. We aimed to avoid bias by covering an exhaustive list of worker categories, including age, sex, educational level, prescribed task, accountability level, etc. The survey was reinforced by a schedule distributed to 300 workers belonging to two oil companies. This schedule comprises 30 items related to six main factors that influence human reliability. Qualitative observations and schedule data processing showed that sociocultural factors can influence operator behaviors both negatively and positively, in both qualitative and quantitative manners. The proposed model shows how reliability can be enhanced by measures such as experience feedback based on, for example, safety improvements, training, and information, together with continuous system improvements that improve the sociocultural reality and reduce negative behaviors.

  17. Validity and reliability of bilingual English-Arabic version of Schutte self report emotional intelligence scale in an undergraduate Arab medical student sample.

    PubMed

    Naeem, Naghma; Muijtjens, Arno

    2015-04-01

    The psychological construct of emotional intelligence (EI), its theoretical models, measurement instruments and applications have been the subject of several research studies in health professions education. The objective of the current study was to investigate the factorial validity and reliability of a bilingual version of the Schutte Self Report Emotional Intelligence Scale (SSREIS) in an undergraduate Arab medical student population. The study was conducted during April-May 2012. A cross-sectional survey design was employed. A sample (n = 467) was obtained from undergraduate medical students belonging to the male and female medical colleges of King Saud University, Riyadh, Saudi Arabia. Exploratory and confirmatory factor analyses were performed using SPSS 16.0 and AMOS 4.0 statistical software to determine the factor structure. Reliability was determined using Cronbach's alpha statistics. The results obtained using an undergraduate Arab medical student sample supported a multidimensional, three-factor structure of the SSREIS. The three factors are Optimism, Awareness-of-Emotions and Use-of-Emotions. The reliability (Cronbach's alpha) for the three subscales was 0.76, 0.72 and 0.55, respectively. Emotional intelligence is a multifactorial (three-factor) construct. The bilingual version of the SSREIS is a valid and reliable measure of trait emotional intelligence in an undergraduate Arab medical student population.
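
    Cronbach's alpha, the reliability statistic used for the subscales, is easy to compute directly; the sketch below uses simulated item responses, not the study's data.

    ```python
    # Minimal sketch of Cronbach's alpha for a subscale.
    # `items` is an invented respondents x items score matrix.
    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_vars / total_var)

    rng = np.random.default_rng(4)
    latent = rng.normal(size=(467, 1))                      # shared trait
    items = latent + rng.normal(scale=1.0, size=(467, 6))   # 6 correlated items
    print(f"alpha = {cronbach_alpha(items):.2f}")
    ```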

  18. Taguchi Based Performance and Reliability Improvement of an Ion Chamber Amplifier for Enhanced Nuclear Reactor Safety

    NASA Astrophysics Data System (ADS)

    Kulkarni, R. D.; Agarwal, Vivek

    2008-08-01

    An ion chamber amplifier (ICA) is used as a safety device for neutronic power (flux) measurement in regulation and protection systems of nuclear reactors. Therefore, performance reliability of an ICA is an important issue. Appropriate quality engineering is essential to achieve a robust design and performance of the ICA circuit. It is observed that the low input bias current operational amplifiers used in the input stage of the ICA circuit are the most critical devices for proper functioning of the ICA. They are very sensitive to the gamma radiation present in their close vicinity. Therefore, the response of the ICA deteriorates with exposure to gamma radiation resulting in a decrease in the overall reliability, unless desired performance is ensured under all conditions. This paper presents a performance enhancement scheme for an ICA operated in the nuclear environment. The Taguchi method, which is a proven technique for reliability enhancement, has been used in this work. It is demonstrated that if a statistical, optimal design approach, like the Taguchi method is used, the cost of high quality and reliability may be brought down drastically. The complete methodology and statistical calculations involved are presented, as are the experimental and simulation results to arrive at a robust design of the ICA.

  19. The validation of a swimming turn wall-contact-time measurement system: a touchpad application reliability study.

    PubMed

    Brackley, Victoria; Ball, Kevin; Tor, Elaine

    2018-05-12

    The effectiveness of the swimming turn is highly influential on overall performance in competitive swimming. The push-off, or wall contact, within the turn phase directly determines the speed at which the swimmer leaves the wall. It is therefore paramount to develop reliable methods of measuring wall-contact-time during the turn phase for training and research purposes. The aim of this study was to determine the concurrent validity and reliability of the Pool Pad App for measuring wall-contact-time during the freestyle and backstroke tumble turn. The wall-contact-times of nine elite and sub-elite participants were recorded during their regular training sessions. Concurrent validity statistics included the standardised typical error estimate, linear analysis and effect sizes, while the intraclass correlation coefficient (ICC) was used for the reliability statistics. The standardised typical error estimate resulted in a moderate Cohen's d effect size with an R² value of 0.80, and the ICC between the Pool Pad and 2D video footage was 0.89. Despite these measurement differences, the results of these concurrent validity and reliability analyses demonstrated that the Pool Pad is suitable for measuring wall-contact-time during the freestyle and backstroke tumble turn within a training environment.

  20. The reliability of Cavalieri's principle of stereological method in determining volumes of enchondromas using computerized tomography tools.

    PubMed

    Acar, Nihat; Karakasli, Ahmet; Karaarslan, Ahmet; Mas, Nermin Ng; Hapa, Onur

    2017-01-01

    Volumetric measurements of benign tumors enable surgeons to trace volume changes during follow-up periods. For a volumetric measurement technique to be applicable, it should be easy, rapid, and inexpensive, and should carry high interobserver reliability. We aimed to assess the interobserver reliability of a volumetric measurement technique using Cavalieri's principle of stereological methods. The computerized tomography (CT) scans of 15 patients with a histopathologically confirmed diagnosis of enchondroma, with varying tumor sizes and localizations, were retrospectively reviewed for interobserver reliability evaluation of the volumetric stereological measurement with Cavalieri's principle, V = t × [(SU × d)/SL]² × ΣP. The volumes of the 15 tumors obtained by the observers are presented in Table 1 of the original report. There was no statistically significant difference between the first and second observers (p = 0.000, intraclass correlation coefficient = 0.970) or between the first and third observers (p = 0.000, intraclass correlation coefficient = 0.981). No statistically significant difference was detected between the second and third observers (p = 0.000, intraclass correlation coefficient = 0.976). Cavalieri's principle with the stereological technique using CT scans is an easy, rapid, and inexpensive technique for the volumetric evaluation of enchondromas, with trustworthy interobserver reliability.
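
    A hedged sketch of the Cavalieri point-counting estimator follows; the slice thickness, grid constant and point counts are hypothetical, and the area-per-point value stands in for the [(SU × d)/SL]² scale factor in the formula above.

    ```python
    # Sketch of the Cavalieri point-counting volume estimator. For each CT
    # slice, the grid points hitting the tumour are counted; slice thickness
    # and the grid's area per point are invented values.
    import numpy as np

    t = 0.3                      # slice thickness (cm)
    area_per_point = 0.04        # grid area represented by one point (cm^2)
    points_per_slice = np.array([0, 3, 8, 12, 14, 11, 6, 2, 0])

    volume = t * area_per_point * points_per_slice.sum()  # V = t * (a/p) * sum(P)
    print(f"estimated volume = {volume:.2f} cm^3")
    ```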

  1. Kuhn-Tucker optimization based reliability analysis for probabilistic finite elements

    NASA Technical Reports Server (NTRS)

    Liu, W. K.; Besterfield, G.; Lawrence, M.; Belytschko, T.

    1988-01-01

    The fusion of the probabilistic finite element method (PFEM) and reliability analysis for fracture mechanics is considered. Reliability analysis with specific application to fracture mechanics is presented, and computational procedures are discussed. Explicit expressions for the optimization procedure with regard to fracture mechanics are given. The results show that the PFEM is a very powerful tool for determining second-moment statistics. The method can determine the probability of failure or fracture subject to randomness in load, material properties, and crack length, orientation, and location.

  2. Impact of High-Reliability Education on Adverse Event Reporting by Registered Nurses.

    PubMed

    McFarland, Diane M; Doucette, Jeffrey N

    Adverse event reporting is one strategy to identify risks and improve patient safety, but, historically, adverse events are underreported by registered nurses (RNs) because of fear of retribution and blame. A program on high reliability was provided to examine whether education would impact RNs' willingness to report adverse events. Although the findings were not statistically significant, they demonstrated a positive impact on adverse event reporting and support the need to create a culture of high reliability.

  3. The Infeasibility of Experimental Quantification of Life-Critical Software Reliability

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Finelli, George B.

    1991-01-01

    This paper affirms that quantification of life-critical software reliability is infeasible using statistical methods, whether applied to standard software or fault-tolerant software. The key assumption of software fault tolerance--that separately programmed versions fail independently--is shown to be problematic. This assumption cannot be justified by experimentation in the ultra-reliability region, and subjective arguments in its favor are not sufficiently strong to justify it as an axiom. Also, the implications of the recent multi-version software experiments support this affirmation.

  4. Evaluating test-retest reliability in patient-reported outcome measures for older people: A systematic review.

    PubMed

    Park, Myung Sook; Kang, Kyung Ja; Jang, Sun Joo; Lee, Joo Yun; Chang, Sun Ju

    2018-03-01

    This study aimed to evaluate the components of test-retest reliability, including time interval, sample size, and statistical methods, used in patient-reported outcome measures in older people, and to provide suggestions on the methodology for calculating test-retest reliability for patient-reported outcomes in older people. This was a systematic literature review. MEDLINE, Embase, CINAHL, and PsycINFO were searched from January 1, 2000 to August 10, 2017 by an information specialist. This systematic review was guided by both the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist and the guideline for systematic reviews published by the National Evidence-based Healthcare Collaborating Agency in Korea. The methodological quality was assessed using the Consensus-based Standards for the selection of health Measurement Instruments checklist box B. Ninety-five out of 12,641 studies were selected for the analysis. The median time interval for test-retest reliability was 14 days, and the ratio of the number of items in each measure to the sample size for test-retest reliability ranged from 1:1 to 1:4. The statistical method most frequently used for continuous scores was the intraclass correlation coefficient (ICC). Among the 63 studies that used ICCs, 21 presented the models used for the ICC calculations and 30 reported 95% confidence intervals for the ICCs. Additional analyses of the 17 studies that reported a strong ICC (>0.9) showed that the mean time interval was 12.88 days and the mean ratio of the number of items to sample size was 1:5.37. When researchers plan to assess the test-retest reliability of patient-reported outcome measures for older people, they need to consider an adequate time interval of approximately 13 days and a sample size of about 5 times the number of items. In particular, statistical methods should not only be selected based on the types of scores of the patient-reported outcome measures, but should also be described clearly in studies that report test-retest reliability results. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Characterizing reliability in a product/process design-assurance program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kerscher, W.J. III; Booker, J.M.; Bement, T.R.

    1997-10-01

    Over the years many advancing techniques in the area of reliability engineering have surfaced in the military sphere of influence, and one of these techniques is Reliability Growth Testing (RGT). Private industry has reviewed RGT as part of the solution to its reliability concerns, but many practical considerations have slowed its implementation. Its objective is to demonstrate the reliability requirement of a new product with a specified confidence. This paper speaks directly to that objective but discusses a somewhat different approach to achieving it. Rather than conducting testing as a continuum and developing statistical confidence bands around the results, this Bayesian updating approach starts with a reliability estimate characterized by large uncertainty and then proceeds to reduce the uncertainty by folding in fresh information in a Bayesian framework.
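
    The Bayesian updating approach can be sketched in a few lines: a vague Beta prior on reliability is narrowed by folding in fresh pass/fail test data. The prior and counts below are invented.

    ```python
    # Sketch of Bayesian reliability updating: a vague Beta prior is combined
    # with fresh pass/fail test results to narrow the uncertainty.
    from scipy.stats import beta

    a, b = 1.0, 1.0               # vague prior: wide uncertainty about reliability
    successes, failures = 48, 2   # fresh test information

    posterior = beta(a + successes, b + failures)
    lo, hi = posterior.ppf([0.05, 0.95])
    print(f"posterior mean {posterior.mean():.3f}, "
          f"90% interval [{lo:.3f}, {hi:.3f}]")
    # Each new test campaign narrows the interval, replacing the classical
    # "test a continuum and band the results" approach described above.
    ```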

  6. Bootstrap study of genome-enabled prediction reliabilities using haplotype blocks across Nordic Red cattle breeds.

    PubMed

    Cuyabano, B C D; Su, G; Rosa, G J M; Lund, M S; Gianola, D

    2015-10-01

    This study compared the accuracy of genome-enabled prediction models using individual single nucleotide polymorphisms (SNP) or haplotype blocks as covariates when using either a single breed or a combined population of Nordic Red cattle. The main objective was to compare predictions of breeding values of complex traits using a combined training population with haplotype blocks, with predictions using a single breed as training population and individual SNP as predictors. To compare the prediction reliabilities, bootstrap samples were taken from the test data set. With the bootstrapped samples of prediction reliabilities, we built and graphed confidence ellipses to allow comparisons. Finally, measures of statistical distances were used to calculate the gain in predictive ability. Our analyses are innovative in the context of assessment of predictive models, allowing a better understanding of prediction reliabilities and providing a statistical basis to effectively calibrate whether one prediction scenario is indeed more accurate than another. An ANOVA indicated that use of haplotype blocks produced significant gains mainly when Bayesian mixture models were used but not when Bayesian BLUP was fitted to the data. Furthermore, when haplotype blocks were used to train prediction models in a combined Nordic Red cattle population, we obtained up to a statistically significant 5.5% average gain in prediction accuracy, over predictions using individual SNP and training the model with a single breed. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  7. Reliability of Computerized Neurocognitive Tests for Concussion Assessment: A Meta-Analysis.

    PubMed

    Farnsworth, James L; Dargo, Lucas; Ragan, Brian G; Kang, Minsoo

    2017-09-01

    Although widely used, computerized neurocognitive tests (CNTs) have been criticized because of low reliability and poor sensitivity. A systematic review was published summarizing the reliability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores; however, it was limited to a single CNT. Expansion of the previous review to include additional CNTs and a meta-analysis is needed. Therefore, our purpose was to analyze reliability data for CNTs using meta-analysis and to examine moderating factors that may influence reliability. A systematic literature search (key terms: reliability, computerized neurocognitive test, concussion) of electronic databases (MEDLINE, PubMed, Google Scholar, and SPORTDiscus) was conducted to identify relevant studies. Studies were included if they met all of the following criteria: used a test-retest design, involved at least 1 CNT, provided sufficient statistical data to allow for effect-size calculation, and were published in English. Two independent reviewers investigated each article to assess inclusion criteria. Eighteen studies involving 2674 participants were retained. Intraclass correlation coefficients were extracted to calculate effect sizes and determine overall reliability. The Fisher Z transformation adjusted for sampling error associated with averaging correlations. Moderator analyses were conducted to evaluate the effects of the length of the test-retest interval, intraclass correlation coefficient model selection, participant demographics, and study design on reliability. Heterogeneity was evaluated using the Cochran Q statistic. The proportion of acceptable outcomes was greatest for the Axon Sports CogState Test (75%) and lowest for the ImPACT (25%). Moderator analyses indicated that the type of intraclass correlation coefficient model used significantly influenced effect-size estimates, accounting for 17% of the variation in reliability. The Axon Sports CogState Test, which has a higher proportion of acceptable outcomes and a shorter test duration relative to other CNTs, may be a reliable option; however, future studies are needed to compare the diagnostic accuracy of these instruments.
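
    The Fisher Z step mentioned above, averaging correlations on a variance-stabilized scale, can be sketched as follows; the ICCs and sample sizes are invented.

    ```python
    # Sketch of meta-analytically pooling reliability coefficients via the
    # Fisher Z transformation. The ICCs and sample sizes are invented.
    import numpy as np

    icc = np.array([0.62, 0.75, 0.58, 0.81, 0.70])
    n = np.array([45, 60, 38, 52, 44])

    z = np.arctanh(icc)                 # Fisher Z transform
    w = n - 3                           # conventional inverse-variance weights
    z_mean = np.average(z, weights=w)
    pooled = np.tanh(z_mean)            # back-transform to the r metric
    print(f"pooled reliability ≈ {pooled:.2f}")
    ```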

  8. Spurious correlations and inference in landscape genetics

    Treesearch

    Samuel A. Cushman; Erin L. Landguth

    2010-01-01

    Reliable interpretation of landscape genetic analyses depends on statistical methods that have high power to identify the correct process driving gene flow while rejecting incorrect alternative hypotheses. Little is known about statistical power and inference in individual-based landscape genetics. Our objective was to evaluate the power of causal modelling with partial...

  9. Qualitative Meta-Analysis on the Hospital Task: Implications for Research

    ERIC Educational Resources Information Center

    Noll, Jennifer; Sharma, Sashi

    2014-01-01

    The "law of large numbers" indicates that as sample size increases, sample statistics become less variable and more closely estimate their corresponding population parameters. Different research studies investigating how people consider sample size when evaluating the reliability of a sample statistic have found a wide range of…

  10. Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling

    PubMed Central

    Wood, John

    2017-01-01

    Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered—some very seriously so—but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. PMID:28706080

  11. Maximizing Statistical Power When Verifying Probabilistic Forecasts of Hydrometeorological Events

    NASA Astrophysics Data System (ADS)

    DeChant, C. M.; Moradkhani, H.

    2014-12-01

    Hydrometeorological events (i.e. floods, droughts, precipitation) are increasingly being forecasted probabilistically, owing to the uncertainties in the underlying causes of these phenomena. In these forecasts, the probability of the event, over some lead time, is estimated based on some model simulations or predictive indicators. By issuing probabilistic forecasts, agencies may communicate the uncertainty in the event occurring. Assuming that the assigned probability of the event is correct, which is referred to as a reliable forecast, the end user may perform some risk management based on the potential damages resulting from the event. Alternatively, an unreliable forecast may give false impressions of the actual risk, leading to improper decision making when protecting resources from extreme events. Because reliable forecasts are a prerequisite for effective risk management, this study takes a renewed look at reliability assessment in event forecasts. Illustrative experiments will be presented, showing deficiencies in the commonly available approaches (Brier Score, Reliability Diagram). Overall, it is shown that the conventional reliability assessment techniques do not maximize the ability to distinguish between a reliable and an unreliable forecast. In this regard, a theoretical formulation of the probabilistic event forecast verification framework will be presented. From this analysis, hypothesis testing with the Poisson-Binomial distribution is the most exact model available for the verification framework, and therefore maximizes one's ability to distinguish between a reliable and an unreliable forecast. Application of this verification system was also examined within a real forecasting case study, highlighting the additional statistical power provided by the use of the Poisson-Binomial distribution.
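
    A sketch of verification against the Poisson-Binomial distribution follows: the exact distribution of the event count under the issued probabilities is built by convolution, then the observed count is checked against it. The probabilities and count are invented.

    ```python
    # Sketch of reliability verification with the Poisson-Binomial distribution:
    # build the exact PMF of the number of events under the issued per-forecast
    # probabilities by iterative convolution, then check the observed count.
    import numpy as np

    def poisson_binomial_pmf(probs):
        pmf = np.array([1.0])
        for p in probs:
            pmf = np.convolve(pmf, [1 - p, p])  # add one Bernoulli trial
        return pmf

    forecast_probs = [0.1, 0.3, 0.25, 0.6, 0.45, 0.2, 0.7, 0.15]
    pmf = poisson_binomial_pmf(forecast_probs)

    observed = 6
    p_tail = pmf[observed:].sum()   # P(X >= observed) under reliable forecasts
    print(f"P(X >= {observed}) = {p_tail:.4f}")
    # A very small tail probability suggests the issued probabilities
    # were not reliable.
    ```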

  12. Post-traumatic subtalar osteoarthritis: which grading system should we use?

    PubMed

    de Muinck Keizer, Robert-Jan O; Backes, Manouk; Dingemans, Siem A; Goslings, J Carel; Schepers, Tim

    2016-09-01

    To assess and compare post-traumatic osteoarthritis following intra-articular calcaneal fractures, one must have a reliable grading system that consistently grades the post-traumatic changes of the joint. A reliable grading system aids communication between treating physicians and improves the interpretation of research. To date, there is no consensus on which grading system to use in the evaluation of post-traumatic subtalar osteoarthritis. The objective of this study was to determine and compare the inter- and intra-rater reliability of two grading systems for post-traumatic subtalar osteoarthritis. Four observers evaluated 50 calcaneal fractures at least one year after trauma on conventional oblique lateral, internally and externally rotated views, and graded post-traumatic subtalar osteoarthritis using the Kellgren and Lawrence Grading Scale (KLGS) and the Paley Grading System (PGS). Inter- and intra-rater reliability were calculated and compared. The inter-rater reliability showed an intra-class correlation (ICC) of 0.54 (95% CI 0.40-0.67) for the KLGS and an ICC of 0.41 (95% CI 0.26-0.57) for the PGS. This difference was not statistically significant. The intra-rater reliability showed a mean weighted kappa of 0.62 for both the KLGS and the PGS. There is no statistically significant difference in reliability between the Kellgren and Lawrence Grading System (KLGS) and the Paley Grading System (PGS). The PGS offers a simple two-step approach, making it convenient for everyday clinical purposes. For research purposes, however, the more detailed and widely used KLGS seems preferable.

  13. A simulation-based evaluation of methods for inferring linear barriers to gene flow

    Treesearch

    Christopher Blair; Dana E. Weigel; Matthew Balazik; Annika T. H. Keeley; Faith M. Walker; Erin Landguth; Sam Cushman; Melanie Murphy; Lisette Waits; Niko Balkenhol

    2012-01-01

    Different analytical techniques used on the same data set may lead to different conclusions about the existence and strength of genetic structure. Therefore, reliable interpretation of the results from different methods depends on the efficacy and reliability of different statistical methods. In this paper, we evaluated the performance of multiple analytical methods to...

  14. Construct Reliability and Validity of the Shortened Version of the Information-Seeking Behavior Scale

    ERIC Educational Resources Information Center

    Lerdpornkulrat, Thanita; Poondej, Chanut; Koul, Ravinder

    2017-01-01

    This study aimed to translate the information-seeking behavior scale from English to Thai, and to ascertain the construct reliability and validity of the scale. Data were collected from 664 undergraduate students in Thailand. The descriptive statistics were explored to see the extent to which various information sources are being used by…

  15. The Use of Invariance and Bootstrap Procedures as a Method to Establish the Reliability of Research Results.

    ERIC Educational Resources Information Center

    Sandler, Andrew B.

    Statistical significance is misused in educational and psychological research when it is applied as a method to establish the reliability of research results. Other techniques have been developed which can be correctly utilized to establish the generalizability of findings. Methods that do provide such estimates are known as invariance or…

  16. Confirmatory Factor Analysis of Persian Adaptation of Multidimensional Students' Life Satisfaction Scale (MSLSS)

    ERIC Educational Resources Information Center

    Hatami, Gissou; Motamed, Niloofar; Ashrafzadeh, Mahshid

    2010-01-01

    Validity and reliability of Persian adaptation of MSLSS in the 12-18 years, middle and high school students (430 students in grades 6-12 in Bushehr port, Iran) using confirmatory factor analysis by means of LISREL statistical package were checked. Internal consistency reliability estimates (Cronbach's coefficient [alpha]) were all above the…

  17. Reliability Sampling Plans: A Review and Some New Results

    ERIC Educational Resources Information Center

    Isaic-Maniu, Alexandru; Voda, Viorel Gh.

    2009-01-01

    In this work we survey a broad range of issues related to sampling inspection in the case of reliability. First we discuss the current status of this domain, mentioning the newest technical approaches, such as HALT and HASS, and the statistical perspective. After a brief description of the general procedure in…

  18. [How reliable is the monitoring for doping?].

    PubMed

    Hüsler, J

    1990-12-01

    The reliability of dope control--the chemical analysis of urine samples in accredited laboratories and the resulting decisions--is discussed using probabilistic and statistical methods. Basically, we evaluated and estimated the positive predictive value, that is, the probability that a urine sample contains prohibited dope substances given a positive test decision. Since statistical data and evidence are lacking for some important quantities related to the predictive value, an exact evaluation is not possible; only conservative lower bounds can be given. We found that the predictive value is at least 90% or 95% with respect to the analysis and decision based on the A-sample only, and at least 99% with respect to both A- and B-samples. A more realistic assessment, though without sufficient statistical confidence, indicates that the true predictive value is considerably larger than these lower estimates.
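
    The predictive-value calculation is a direct application of Bayes' rule; the sketch below uses illustrative sensitivity, specificity and prevalence values, not the paper's estimates.

    ```python
    # Sketch of the positive predictive value via Bayes' rule.
    # Sensitivity, specificity and prevalence are illustrative assumptions.
    prevalence = 0.05    # fraction of samples actually containing dope substances
    sensitivity = 0.99   # P(positive test | doped)
    specificity = 0.995  # P(negative test | clean)

    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    )
    print(f"P(doped | positive) = {ppv:.3f}")
    # Requiring a confirming B-sample raises the effective specificity,
    # pushing the predictive value toward the 99% figure cited above.
    ```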

  19. Report on Wind Turbine Subsystem Reliability - A Survey of Various Databases (Presentation)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sheng, S.

    2013-07-01

    The wind industry has been challenged by premature subsystem/component failures. Various reliability data collection efforts have demonstrated their value in supporting wind turbine reliability and availability research and development and industrial activities. However, most information on these data collection efforts is scattered and not available in a centralized place. With the objective of obtaining updated reliability statistics of wind turbines and/or subsystems, so as to benefit future wind reliability and availability activities, this report was put together based on a survey of various reliability databases that are accessible directly or indirectly by NREL. For each database, whenever feasible, a brief description summarizing database population, life span, and data collected is given, along with its features and status. Selected results generated from each database and deemed beneficial to the industry are then highlighted. This report concludes with several observations made throughout the survey and several reliability data collection opportunities for the future.

  20. What Is SEER?

    Cancer.gov

    An infographic describing the functions of NCI’s Surveillance, Epidemiology, and End Results (SEER) program: collecting, analyzing, interpreting, and disseminating reliable population-based statistics.

  1. Ceramic component reliability with the restructured NASA/CARES computer program

    NASA Technical Reports Server (NTRS)

    Powers, Lynn M.; Starlinger, Alois; Gyekenyesi, John P.

    1992-01-01

    The Ceramics Analysis and Reliability Evaluation of Structures (CARES) integrated design program for statistical fast-fracture reliability of monolithic ceramic components is enhanced to include the use of a neutral data base, two-dimensional modeling, and variable problem size. The data base allows for the efficient transfer of element stresses, temperatures, and volumes/areas from the finite element output to the reliability analysis program. Elements are divided to ensure a direct correspondence between the subelements and the Gaussian integration points. Two-dimensional modeling is accomplished by assessing the volume-flaw reliability with shell elements. To demonstrate the improvements in the algorithm, example problems are selected from a round-robin conducted by WELFEP (WEakest Link failure probability prediction by Finite Element Postprocessors).

  2. Reliability estimation of an N-M-cold-standby redundancy system in a multicomponent stress-strength model with generalized half-logistic distribution

    NASA Astrophysics Data System (ADS)

    Liu, Yiming; Shi, Yimin; Bai, Xuchao; Zhan, Pei

    2018-01-01

    In this paper, we study the estimation of the reliability of a multicomponent system, named the N-M-cold-standby redundancy system, based on a progressive Type-II censored sample. In the system, there are N subsystems consisting of M statistically independent strength components, and only one of these subsystems works under the impact of stresses at a time while the others remain as standbys. Whenever the working subsystem fails, one of the standbys takes its place. The system fails when all of the subsystems have failed. It is supposed that the underlying distributions of random strength and stress both belong to the generalized half-logistic distribution with different shape parameters. The reliability of the system is estimated using both classical and Bayesian statistical inference. The uniformly minimum variance unbiased estimator and the maximum likelihood estimator for the reliability of the system are derived. Under a squared error loss function, the exact expression of the Bayes estimator for the reliability of the system is developed using the Gauss hypergeometric function. The asymptotic confidence interval and corresponding coverage probabilities are derived based on both the Fisher and the observed information matrices. The approximate highest probability density credible interval is constructed using the Monte Carlo method. Monte Carlo simulations are performed to compare the performances of the proposed reliability estimators. A real data set is also analyzed as an illustration of the findings.

  3. Reliability of a rating procedure to monitor industry self-regulation codes governing alcohol advertising content.

    PubMed

    Babor, Thomas F; Xuan, Ziming; Proctor, Dwayne

    2008-03-01

    The purposes of this study were to develop reliable procedures to monitor the content of alcohol advertisements broadcast on television and in other media, and to detect violations of the content guidelines of the alcohol industry's self-regulation codes. A set of rating-scale items was developed to measure the content guidelines of the 1997 version of the U.S. Beer Institute Code. Six focus groups were conducted with 60 college students to evaluate the face validity of the items and the feasibility of the procedure. A test-retest reliability study was then conducted with 74 participants, who rated five alcohol advertisements on two occasions separated by 1 week. Average correlations across all advertisements using three reliability statistics (r, rho, and kappa) were almost all statistically significant and the kappas were good for most items, which indicated high test-retest agreement. We also found high interrater reliabilities (intraclass correlations) among raters for item-level and guideline-level violations, indicating that regardless of the specific item, raters were consistent in their general evaluations of the advertisements. Naïve (untrained) raters can provide consistent (reliable) ratings of the main content guidelines proposed in the U.S. Beer Institute Code. The rating procedure may have future applications for monitoring compliance with industry self-regulation codes and for conducting research on the ways in which alcohol advertisements are perceived by young adults and other vulnerable populations.

  4. Principles of Statistics: What the Sports Medicine Professional Needs to Know.

    PubMed

    Riemann, Bryan L; Lininger, Monica R

    2018-07-01

    Understanding the results and statistics reported in original research remains a large challenge for many sports medicine practitioners and, in turn, may be among one of the biggest barriers to integrating research into sports medicine practice. The purpose of this article is to provide minimal essentials a sports medicine practitioner needs to know about interpreting statistics and research results to facilitate the incorporation of the latest evidence into practice. Topics covered include the difference between statistical significance and clinical meaningfulness; effect sizes and confidence intervals; reliability statistics, including the minimal detectable difference and minimal important difference; and statistical power. Copyright © 2018 Elsevier Inc. All rights reserved.
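
    Two of the quantities named above, the SEM and the minimal detectable change, follow from simple formulas; the sketch below uses invented values for the reliability coefficient and between-subject SD.

    ```python
    # Sketch of the standard error of measurement (SEM) and the minimal
    # detectable change (MDC95). The reliability and SD are invented.
    import numpy as np

    sd = 8.0        # between-subject standard deviation of the measure
    icc = 0.90      # test-retest reliability

    sem = sd * np.sqrt(1 - icc)
    mdc95 = 1.96 * np.sqrt(2) * sem   # smallest change beyond measurement noise
    print(f"SEM = {sem:.2f}, MDC95 = {mdc95:.2f}")
    ```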

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Unwin, Stephen D.; Lowry, Peter P.; Layton, Robert F.

    This is a working report drafted under the Risk-Informed Safety Margin Characterization pathway of the Light Water Reactor Sustainability Program, describing statistical models of passive component reliabilities.

  6. Incorporating Nonparametric Statistics into Delphi Studies in Library and Information Science

    ERIC Educational Resources Information Center

    Ju, Boryung; Jin, Tao

    2013-01-01

    Introduction: The Delphi technique is widely used in library and information science research. However, many researchers in the field fail to employ standard statistical tests when using this technique. This makes the technique vulnerable to criticisms of its reliability and validity. The general goal of this article is to explore how…

  7. The Power of 'Evidence': Reliable Science or a Set of Blunt Tools?

    ERIC Educational Resources Information Center

    Wrigley, Terry

    2018-01-01

    In response to the increasing emphasis on 'evidence-based teaching', this article examines the privileging of randomised controlled trials and their statistical synthesis (meta-analysis). It also pays particular attention to two third-level statistical syntheses: John Hattie's "Visible learning" project and the EEF's "Teaching and…

  8. The Application of Strength of Association Statistics to the Item Analysis of an In-Training Examination in Diagnostic Radiology.

    ERIC Educational Resources Information Center

    Diamond, James J.; McCormick, Janet

    1986-01-01

    Using item responses from an in-training examination in diagnostic radiology, the application of a strength of association statistic to the general problem of item analysis is illustrated. Criteria for item selection, general issues of reliability, and error of measurement are discussed. (Author/LMO)

  9. Usage Data for Electronic Resources: A Comparison between Locally Collected and Vendor-Provided Statistics.

    ERIC Educational Resources Information Center

    Duy, Joanna; Vaughan, Liwen

    2003-01-01

    Vendor-provided electronic resource usage statistics are not currently standardized across vendors. This study investigates the feasibility of using locally collected data to check the reliability of vendor-provided data. Vendor-provided data were compared with local data collected from North Carolina State University (NCSU) Libraries' Web…

  10. The impact of statistical adjustment on conditional standard errors of measurement in the assessment of physician communication skills.

    PubMed

    Raymond, Mark R; Clauser, Brian E; Furman, Gail E

    2010-10-01

    The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary least squares regression to adjust ratings and then used generalizability theory to evaluate the impact of these adjustments on score reliability and the overall standard error of measurement. In addition, conditional standard errors of measurement were computed for both observed and adjusted scores to determine whether the improvements in measurement precision were uniform across the score distribution. Results indicated that measurement was generally less precise for communication ratings toward the lower end of the score distribution, and the improvement in measurement precision afforded by statistical modeling varied slightly across the score distribution, such that the most improvement occurred in the upper-middle range of the score scale. Possible reasons for these patterns in measurement precision are discussed, as are the limitations of the statistical models used for adjusting performance ratings.
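
    The rating-adjustment idea can be caricatured with a balanced toy design: regressing ratings on rater identity (equivalent here to subtracting each rater's mean offset) removes rater severity. This is a simplification of the study's OLS model, with invented data.

    ```python
    # Simplified sketch of rater-severity adjustment on invented data:
    # remove each rater's estimated offset from the grand mean.
    import numpy as np

    rng = np.random.default_rng(5)
    n_raters, per_rater = 6, 40
    rater = np.repeat(np.arange(n_raters), per_rater)
    ability = rng.normal(70, 8, n_raters * per_rater)      # examinee ability
    severity = np.array([-4, -2, 0, 1, 2, 3])[rater]       # rater harshness
    ratings = ability + severity + rng.normal(0, 3, rater.size)

    # With a balanced design, OLS with rater dummies reduces to subtracting
    # each rater's mean offset from the grand mean.
    rater_means = np.array([ratings[rater == j].mean() for j in range(n_raters)])
    adjusted = ratings - (rater_means[rater] - ratings.mean())
    print(f"SD before {ratings.std():.2f}, after adjustment {adjusted.std():.2f}")
    ```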

  11. DoD Met Most Requirements of the Improper Payments Elimination and Recovery Act in FY 2014, but Improper Payment Estimates Were Unreliable

    DTIC Science & Technology

    2015-05-12

    Deficiencies That Affect the Reliability of Estimates: Statistical Precision Could Be Improved... statistical precision of improper payments estimates in seven of the DoD payment programs through the use of stratified sample designs. DoD improper...payments not subject to sampling, which made the results statistically invalid. We made a recommendation to correct this problem in a previous report.

  12. Effects of War Tax Collection in Honduran Society: Evaluating the Social and Economic Cost

    DTIC Science & Technology

    2015-06-12

    information gathering for this thesis relied on open electronic sources and statistical data from security and intelligence agencies. There are no...commission of this crime have identified. However, the main limitation encountered is the poor reliability of the statistical data because the high...justice systems with crime and violence. According to statistical data presented in Air & Space Power Journal, at least 60 percent of the 2,576 murders

  13. Quality of Death Rates by Race and Hispanic Origin: A Summary of Current Research, 1999. Vital and Health Statistics. Series 2: Data Evaluation and Methods Research. No. 128.

    ERIC Educational Resources Information Center

    National Center for Health Statistics (DHHS/PHS), Hyattsville, MD.

    This report summarizes current knowledge and research on the quality and reliability of death rates by race and Hispanic origin in official mortality statistics of the United States produced by the National Center for Health Statistics (NCHS). It provides a quantitative assessment of bias in death rates by race and Hispanic origin and identifies…

  14. Reliability, precision, and measurement in the context of data from ability tests, surveys, and assessments

    NASA Astrophysics Data System (ADS)

    Fisher, W. P., Jr.; Elbaum, B.; Coulter, A.

    2010-07-01

    Reliability coefficients indicate the proportion of total variance attributable to differences among measures separated along a quantitative continuum by a testing, survey, or assessment instrument. Reliability is usually considered to be influenced by both the internal consistency of a data set and the number of items, though textbooks and research papers rarely evaluate the extent to which these factors independently affect the data in question. Probabilistic formulations of the requirements for unidimensional measurement separate consistency from error by modelling individual response processes instead of group-level variation. The utility of this separation is illustrated via analyses of small sets of simulated data, and of subsets of data from a 78-item survey of over 2,500 parents of children with disabilities. Measurement reliability ultimately concerns the structural invariance specified in models requiring sufficient statistics, parameter separation, unidimensionality, and other qualities that historically have made quantification simple, practical, and convenient for end users. The paper concludes with suggestions for a research program aimed at focusing measurement research more on the calibration and wide dissemination of tools applicable to individuals, and less on the statistical study of inter-variable relations in large data sets.
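
    The textbook relationship the abstract alludes to, that a reliability coefficient reflects both internal consistency and test length, can be illustrated with Cronbach's alpha and the Spearman-Brown prophecy formula. The simulated data and factor loading below are illustrative only; the paper itself uses Rasch-type probabilistic models rather than this classical-test-theory sketch.

    ```python
    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: (n_persons, n_items) score matrix."""
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars / total_var)

    def spearman_brown(rel: float, length_factor: float) -> float:
        """Predicted reliability when test length is multiplied by length_factor."""
        return length_factor * rel / (1 + (length_factor - 1) * rel)

    rng = np.random.default_rng(1)
    n, k = 500, 10
    ability = rng.normal(size=(n, 1))
    items = 0.6 * ability + rng.normal(size=(n, k))   # common factor + noise

    a10 = cronbach_alpha(items)
    print(f"alpha with 10 items:      {a10:.3f}")
    print(f"Spearman-Brown, 20 items: {spearman_brown(a10, 2):.3f}")
    print(f"Spearman-Brown,  5 items: {spearman_brown(a10, 0.5):.3f}")
    ```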

  15. Reliability and validity of MicroScribe-3DXL system in comparison with radiographic cephalometric system: Angular measurements.

    PubMed

    Barmou, Maher M; Hussain, Saba F; Abu Hassan, Mohamed I

    2018-06-01

    The aim of the study was to assess the reliability and validity of cephalometric variables from MicroScribe-3DXL. Seven cephalometric variables (facial angle, ANB, maxillary depth, U1/FH, FMA, IMPA, FMIA) were measured by a dentist in 60 Malay subjects (30 males and 30 females) with class I occlusion and balanced faces. Two standard images were taken for each subject with conventional cephalometric radiography and MicroScribe-3DXL. All the images were traced and analysed. SPSS version 2.0 was used for statistical analysis, with significance set at P<0.05. The results revealed a statistically significant difference in four measurements (U1/FH, FMA, IMPA, FMIA), with P-values ranging from 0.00 to 0.03. The difference in the measurements was considered clinically acceptable. The overall reliability of MicroScribe-3DXL was 92.7% and its validity was 91.8%. The MicroScribe-3DXL is reliable and valid for most of the cephalometric variables, with the advantages of saving time and cost. This is a promising device to assist in diverse areas of dental practice and research. Copyright © 2018. Published by Elsevier Masson SAS.

  16. Enhanced Component Performance Study: Motor-Driven Pumps 1998–2014

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schroeder, John Alton

    2016-02-01

    This report presents an enhanced performance evaluation of motor-driven pumps at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience failure reports from fiscal year 1998 through 2014 for the component reliability as reported in the Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES). The motor-driven pump failure modes considered for standby systems are failure to start, failure to run less than or equal to one hour, and failure to run more than one hour; for normally running systems, the failure modes considered are failure to start and failure to run. An eight-hour unreliability estimate is also calculated and trended. The component reliability estimates and the reliability data are trended for the most recent 10-year period while yearly estimates for reliability are provided for the entire active period. Statistically significant increasing trends were identified in pump run hours per reactor year. Statistically significant decreasing trends were identified for standby systems industry-wide frequency of start demands, and run hours per reactor year for runs of less than or equal to one hour.

  17. Calculating system reliability with SRFYDO

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morzinski, Jerome; Anderson - Cook, Christine M; Klamann, Richard M

    2010-01-01

    SRFYDO is a process for estimating reliability of complex systems. Using information from all applicable sources, including full-system (flight) data, component test data, and expert (engineering) judgment, SRFYDO produces reliability estimates and predictions. It is appropriate for series systems with possibly several versions of the system which share some common components. It models reliability as a function of age and up to 2 other lifecycle (usage) covariates. Initial output from its Exploratory Data Analysis mode consists of plots and numerical summaries so that the user can check data entry and model assumptions, and help determine a final form for the system model. The System Reliability mode runs a complete reliability calculation using Bayesian methodology. This mode produces results that estimate reliability at the component, sub-system, and system level. The results include estimates of uncertainty, and can predict reliability at some not-too-distant time in the future. This paper presents an overview of the underlying statistical model for the analysis, discusses model assumptions, and demonstrates usage of SRFYDO.

  18. Spatial Correlations in Natural Scenes Modulate Response Reliability in Mouse Visual Cortex

    PubMed Central

    Rikhye, Rajeev V.

    2015-01-01

    Intrinsic neuronal variability significantly limits information encoding in the primary visual cortex (V1). Certain stimuli can suppress this intertrial variability to increase the reliability of neuronal responses. In particular, responses to natural scenes, which have broadband spatiotemporal statistics, are more reliable than responses to stimuli such as gratings. However, very little is known about which stimulus statistics modulate reliable coding and how this occurs at the neural ensemble level. Here, we sought to elucidate the role that spatial correlations in natural scenes play in reliable coding. We developed a novel noise-masking method to systematically alter spatial correlations in natural movies, without altering their edge structure. Using high-speed two-photon calcium imaging in vivo, we found that responses in mouse V1 were much less reliable at both the single neuron and population level when spatial correlations were removed from the image. This change in reliability was due to a reorganization of between-neuron correlations. Strongly correlated neurons formed ensembles that reliably and accurately encoded visual stimuli, whereas reducing spatial correlations reduced the activation of these ensembles, leading to an unreliable code. Together with an ensemble-specific normalization model, these results suggest that the coordinated activation of specific subsets of neurons underlies the reliable coding of natural scenes. SIGNIFICANCE STATEMENT The natural environment is rich with information. To process this information with high fidelity, V1 neurons have to be robust to noise and, consequently, must generate responses that are reliable from trial to trial. While several studies have hinted that both stimulus attributes and population coding may reduce noise, the details remain unclear. Specifically, what features of natural scenes are important and how do they modulate reliability? This study is the first to investigate the role of spatial correlations, which are a fundamental attribute of natural scenes, in shaping stimulus coding by V1 neurons. Our results provide new insights into how stimulus spatial correlations reorganize the correlated activation of specific ensembles of neurons to ensure accurate information processing in V1. PMID:26511254

  19. The effect of wall thickness distribution on mechanical reliability and strength in unidirectional porous ceramics.

    PubMed

    Seuba, Jordi; Deville, Sylvain; Guizard, Christian; Stevenson, Adam J

    2016-01-01

    Macroporous ceramics exhibit an intrinsic strength variability caused by the random distribution of defects in their structure. However, the precise role of microstructural features, other than pore volume, on reliability is still unknown. Here, we analyze the applicability of the Weibull analysis to unidirectional macroporous yttria-stabilized-zirconia (YSZ) prepared by ice-templating. First, we performed crush tests on samples with controlled microstructural features with the loading direction parallel to the porosity. The compressive strength data were fitted using two different fitting techniques, ordinary least squares and Bayesian Markov Chain Monte Carlo, to evaluate whether Weibull statistics are an adequate descriptor of the strength distribution. The statistical descriptors indicated that the strength data are well described by the Weibull statistical approach, for both fitting methods used. Furthermore, we assess the effect of different microstructural features (volume, size, densification of the walls, and morphology) on Weibull modulus and strength. We found that the key microstructural parameter controlling reliability is wall thickness. In contrast, pore volume is the main parameter controlling the strength. The highest Weibull modulus and mean strength (198.2 MPa) were obtained for the samples with the smallest and narrowest wall thickness distribution (3.1 μm) and lower pore volume (54.5%).
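
    Of the two fitting routes the authors compare, the ordinary-least-squares one is easy to sketch: order the strengths, assign median-rank failure probabilities, and fit the linearized two-parameter Weibull form ln(-ln(1-F)) = m ln σ - m ln σ₀. The strength values below are invented for illustration; they are not the paper's data.

    ```python
    import numpy as np

    # Illustrative strength data (MPa); values are made up, not the paper's.
    strength = np.sort(np.array([151., 162., 170., 178., 183., 190.,
                                 196., 201., 208., 215., 224., 238.]))
    n = strength.size

    # Median-rank estimate of failure probability for each ordered specimen.
    F = (np.arange(1, n + 1) - 0.3) / (n + 0.4)

    # Linearized two-parameter Weibull: ln(-ln(1-F)) = m*ln(sigma) - m*ln(sigma0)
    x = np.log(strength)
    y = np.log(-np.log(1.0 - F))
    m, c = np.polyfit(x, y, 1)          # slope = Weibull modulus m
    sigma0 = np.exp(-c / m)             # characteristic strength

    print(f"Weibull modulus m ~ {m:.1f}, characteristic strength ~ {sigma0:.1f} MPa")
    ```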

  20. The effect of wall thickness distribution on mechanical reliability and strength in unidirectional porous ceramics

    NASA Astrophysics Data System (ADS)

    Seuba, Jordi; Deville, Sylvain; Guizard, Christian; Stevenson, Adam J.

    2016-01-01

    Macroporous ceramics exhibit an intrinsic strength variability caused by the random distribution of defects in their structure. However, the precise role of microstructural features, other than pore volume, on reliability is still unknown. Here, we analyze the applicability of the Weibull analysis to unidirectional macroporous yttria-stabilized-zirconia (YSZ) prepared by ice-templating. First, we performed crush tests on samples with controlled microstructural features with the loading direction parallel to the porosity. The compressive strength data were fitted using two different fitting techniques, ordinary least squares and Bayesian Markov Chain Monte Carlo, to evaluate whether Weibull statistics are an adequate descriptor of the strength distribution. The statistical descriptors indicated that the strength data are well described by the Weibull statistical approach, for both fitting methods used. Furthermore, we assess the effect of different microstructural features (volume, size, densification of the walls, and morphology) on Weibull modulus and strength. We found that the key microstructural parameter controlling reliability is wall thickness. In contrast, pore volume is the main parameter controlling the strength. The highest Weibull modulus and mean strength (198.2 MPa) were obtained for the samples with the smallest and narrowest wall thickness distribution (3.1 μm) and lower pore volume (54.5%).

  1. Care 3, Phase 1, volume 1

    NASA Technical Reports Server (NTRS)

    Stiffler, J. J.; Bryant, L. A.; Guccione, L.

    1979-01-01

    A computer program to aid in assessing the reliability of fault tolerant avionics systems was developed. A simple mathematical expression was used to evaluate the reliability of any redundant configuration over any interval during which the failure rates and coverage parameters remained unaffected by configuration changes. Provision was made for convolving such expressions in order to evaluate the reliability of a dual mode system. A coverage model was also developed to determine the various relevant coverage coefficients as a function of the available hardware and software fault detector characteristics, and subsequent isolation and recovery delay statistics.

  2. Numerical solutions for patterns statistics on Markov chains.

    PubMed

    Nuel, Gregory

    2006-01-01

    We propose here a review of the methods available to compute pattern statistics on text generated by a Markov source. Theoretical, but also numerical aspects are detailed for a wide range of techniques (exact, Gaussian, large deviations, binomial and compound Poisson). The SPatt package (Statistics for Pattern, free software available at http://stat.genopole.cnrs.fr/spatt) implementing all these methods is then used to compare all these approaches in terms of computational time and reliability in the most complete pattern statistics benchmark available at the present time.

  3. Reliable probabilities through statistical post-processing of ensemble predictions

    NASA Astrophysics Data System (ADS)

    Van Schaeybroeck, Bert; Vannitsem, Stéphane

    2013-04-01

    We develop post-processing or calibration approaches based on linear regression that make ensemble forecasts more reliable. First, we enforce climatological reliability in the sense that the total variability of the prediction is equal to the variability of the observations. Second, we impose ensemble reliability such that the spread of the observation around the ensemble mean coincides with that of the ensemble members. In general the attractors of the model and reality are inhomogeneous. Therefore ensemble spread displays a variability not taken into account in standard post-processing methods. We overcome this by weighting the ensemble by a variable error. The approaches are tested in the context of the Lorenz 96 model (Lorenz 1996). The forecasts become more reliable at short lead times as reflected by a flatter rank histogram. Our best method turns out to be superior to well-established methods like EVMOS (Van Schaeybroeck and Vannitsem, 2011) and Nonhomogeneous Gaussian Regression (Gneiting et al., 2005). References [1] Gneiting, T., Raftery, A. E., Westveld, A., Goldman, T., 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Weather Rev. 133, 1098-1118. [2] Lorenz, E. N., 1996: Predictability - a problem partly solved. Proceedings, Seminar on Predictability ECMWF. 1, 1-18. [3] Van Schaeybroeck, B., and S. Vannitsem, 2011: Post-processing through linear regression, Nonlin. Processes Geophys., 18, 147.
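
    A flatter rank histogram, the diagnostic cited above, is straightforward to compute: for each forecast case, find the rank of the observation within the sorted ensemble and histogram those ranks; a reliable ensemble gives a flat histogram, an underdispersive one a U shape. The sketch below uses synthetic Gaussian data, not the Lorenz 96 experiments of the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    def rank_histogram(obs: np.ndarray, ens: np.ndarray) -> np.ndarray:
        """obs: (n_cases,); ens: (n_cases, n_members).  Returns counts of
        the rank of each observation within its ensemble (flat = reliable)."""
        ranks = (ens < obs[:, None]).sum(axis=1)
        return np.bincount(ranks, minlength=ens.shape[1] + 1)

    n_cases, n_members = 5000, 9
    truth = rng.normal(size=n_cases)
    obs = truth + rng.normal(size=n_cases)          # unit observation error

    # Exchangeable members -> flat histogram; too-small spread -> U shape.
    reliable = truth[:, None] + rng.normal(size=(n_cases, n_members))
    underdisp = truth[:, None] + 0.4 * rng.normal(size=(n_cases, n_members))

    print("reliable       :", rank_histogram(obs, reliable))
    print("underdispersive:", rank_histogram(obs, underdisp))
    ```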

  4. eHealth literacy in chronic disease patients: An item response theory analysis of the eHealth literacy scale (eHEALS).

    PubMed

    Paige, Samantha R; Krieger, Janice L; Stellefson, Michael; Alber, Julia M

    2017-02-01

    Chronic disease patients are affected by low computer and health literacy, which negatively affects their ability to benefit from access to online health information. To estimate reliability and confirm model specifications for eHealth Literacy Scale (eHEALS) scores among chronic disease patients using Classical Test Theory (CTT) and Item Response Theory techniques. A stratified sample of Black/African American (N=341) and Caucasian (N=343) adults with chronic disease completed an online survey including the eHEALS. Item discrimination was explored using bi-variate correlations and Cronbach's alpha for internal consistency. A categorical confirmatory factor analysis tested a one-factor structure of eHEALS scores. Item characteristic curves, infit/outfit statistics, omega coefficient, and item reliability and separation estimates were computed. A 1-factor structure of eHEALS was confirmed by statistically significant standardized item loadings, acceptable model fit indices (CFI/TLI>0.90), and 70% variance explained by the model. Item response categories increased with higher theta levels, and there was evidence of acceptable reliability (ω=0.94; item reliability=.89; item separation=8.54). eHEALS scores are a valid and reliable measure of self-reported eHealth literacy among Internet-using chronic disease patients. Providers can use eHEALS to help identify patients' eHealth literacy skills. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  5. The alcohol use disorder and associated disabilities interview schedule-IV (AUDADIS-IV): reliability of new psychiatric diagnostic modules and risk factors in a general population sample.

    PubMed

    Ruan, W June; Goldstein, Risë B; Chou, S Patricia; Smith, Sharon M; Saha, Tulshi D; Pickering, Roger P; Dawson, Deborah A; Huang, Boji; Stinson, Frederick S; Grant, Bridget F

    2008-01-01

    This study presents test-retest reliability statistics and information on internal consistency for new diagnostic modules and risk factors for alcohol, drug, and psychiatric disorders from the Alcohol Use Disorder and Associated Disabilities Interview Schedule-IV (AUDADIS-IV). Test-retest statistics were derived from a random sample of 1899 adults selected from 34,653 respondents who participated in the 2004-2005 Wave 2 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). Internal consistency of continuous scales was assessed using the entire Wave 2 NESARC. Both test and retest interviews were conducted face-to-face. Test-retest and internal consistency results for diagnoses and symptom scales associated with posttraumatic stress disorder, attention-deficit/hyperactivity disorder, and borderline, narcissistic, and schizotypal personality disorders were predominantly good (kappa>0.63; ICC>0.69; alpha>0.75) and reliability for risk factor measures fell within the good to excellent range (intraclass correlations=0.50-0.94; alpha=0.64-0.90). The high degree of reliability found in this study suggests that new AUDADIS-IV diagnostic measures can be useful tools in research settings. The availability of highly reliable measures of risk factors for alcohol, drug, and psychiatric disorders will contribute to the validity of conclusions drawn from future research in the domains of substance use disorder and psychiatric epidemiology.
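
    Test-retest reliability of categorical diagnoses, as reported here, is typically summarized with Cohen's kappa, i.e., agreement corrected for chance. A minimal sketch, using simulated diagnoses rather than NESARC data:

    ```python
    import numpy as np

    def cohens_kappa(a: np.ndarray, b: np.ndarray) -> float:
        """Chance-corrected agreement between two categorical ratings."""
        cats = np.union1d(a, b)
        po = np.mean(a == b)                                  # observed agreement
        pe = sum(np.mean(a == c) * np.mean(b == c) for c in cats)  # chance agreement
        return (po - pe) / (1 - pe)

    rng = np.random.default_rng(5)
    test = rng.integers(0, 2, 300)            # diagnosis present/absent at test
    flip = rng.random(300) < 0.1              # 10% of retests disagree
    retest = np.where(flip, 1 - test, test)
    print(f"kappa = {cohens_kappa(test, retest):.2f}")
    ```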

  6. Empirical methods for assessing meaningful neuropsychological change following epilepsy surgery.

    PubMed

    Sawrie, S M; Chelune, G J; Naugle, R I; Lüders, H O

    1996-11-01

    Traditional methods for assessing the neurocognitive effects of epilepsy surgery are confounded by practice effects, test-retest reliability issues, and regression to the mean. This study employs 2 methods for assessing individual change that allow direct comparison of changes across both individuals and test measures. Fifty-one medically intractable epilepsy patients completed a comprehensive neuropsychological battery twice, approximately 8 months apart, prior to any invasive monitoring or surgical intervention. First, a Reliable Change (RC) index score was computed for each test score to take into account the reliability of that measure, and a cutoff score was empirically derived to establish the limits of statistically reliable change. These indices were subsequently adjusted for expected practice effects. The second approach used a regression technique to establish "change norms" along a common metric that models both expected practice effects and regression to the mean. The RC index scores provide the clinician with a statistical means of determining whether a patient's retest performance is "significantly" changed from baseline. The regression norms for change allow the clinician to evaluate the magnitude of a given patient's change on 1 or more variables along a common metric that takes into account the reliability and stability of each test measure. Case data illustrate how these methods provide an empirically grounded means for evaluating neurocognitive outcomes following medical interventions such as epilepsy surgery.
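
    The practice-adjusted Reliable Change index described above can be written as RC = ((X₂ - X₁) - practice effect) / SE_diff, with SE_diff = √2 · SEM and SEM = SD · √(1 - r). A minimal sketch with invented numbers, not the study's battery:

    ```python
    import math

    def reliable_change(x1: float, x2: float, sd1: float, r_xx: float,
                        practice: float = 0.0, cutoff: float = 1.645):
        """Practice-adjusted Reliable Change index.  r_xx is test-retest
        reliability; cutoff 1.645 corresponds to a 90% confidence interval."""
        sem = sd1 * math.sqrt(1.0 - r_xx)
        se_diff = math.sqrt(2.0) * sem
        rci = ((x2 - x1) - practice) / se_diff
        return rci, abs(rci) > cutoff

    # Illustrative: baseline 95, retest 88, normative SD 15,
    # reliability .90, expected practice gain of +2 points.
    rci, changed = reliable_change(95, 88, sd1=15, r_xx=0.90, practice=2.0)
    print(f"RCI = {rci:.2f}; statistically reliable change: {changed}")
    ```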

  7. Improving the Validity and Reliability of a Health Promotion Survey for Physical Therapists

    PubMed Central

    Stephens, Jaca L.; Lowman, John D.; Graham, Cecilia L.; Morris, David M.; Kohler, Connie L.; Waugh, Jonathan B.

    2013-01-01

    Purpose Physical therapists (PTs) have a unique opportunity to intervene in the area of health promotion. However, no instrument has been validated to measure PTs’ views on health promotion in physical therapy practice. The purpose of this study was to evaluate the content validity and test-retest reliability of a health promotion survey designed for PTs. Methods An expert panel of PTs assessed the content validity of “The Role of Health Promotion in Physical Therapy Survey” and provided suggestions for revision. Item content validity was assessed using the content validity ratio (CVR) as well as the modified kappa statistic. Therapists then participated in the test-retest reliability assessment of the revised health promotion survey, which was assessed using a weighted kappa statistic. Results Based on feedback from the expert panelists, significant revisions were made to the original survey. The expert panel reached at least a majority consensus agreement for all items in the revised survey and the survey-CVR improved from 0.44 to 0.66. Only one item on the revised survey had substantial test-retest agreement, with 55% of the items having moderate agreement and 43% poor agreement. Conclusions All items on the revised health promotion survey demonstrated at least fair validity, but few items had reasonable test-retest reliability. Further modifications should be made to strengthen the validity and improve the reliability of this survey. PMID:23754935
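
    The content validity ratio used here is Lawshe's CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item essential and N is the panel size. A one-function sketch with an invented panel:

    ```python
    def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
        """Lawshe's CVR: +1 when all panelists rate an item essential,
        0 when exactly half do, -1 when none do."""
        half = n_panelists / 2.0
        return (n_essential - half) / half

    # Illustrative: 7 of 9 expert panelists rate an item "essential".
    print(f"CVR = {content_validity_ratio(7, 9):.2f}")   # 0.56
    ```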

  8. Enhanced definitions of intimate partner violence for DSM-5 and ICD-11 may promote improved screening and treatment.

    PubMed

    Heyman, Richard E; Slep, Amy M Smith; Foran, Heather M

    2015-03-01

    Nuanced, multifaceted, and content valid diagnostic criteria for intimate partner violence (IPV) have been created and can be used reliably in the field, even by those with little-to-no clinical training/background. The use of criteria such as these would likely lead to more reliable decision making in the field and more consistency across studies. Further, interrater agreement was higher than that usually reported for individual mental disorders. This paper will provide an overview of (a) IPV's scope and impact; (b) the reliable and valid diagnostic criteria that have been used and the adaptation of these criteria inserted in the latest Diagnostic and Statistical Manual of Mental Disorders (DSM) and another adaptation proposed for the forthcoming International Statistical Classification of Diseases and Related Health Problems (ICD); (c) suggestions for screening of IPV in primary care settings; (d) interventions for IPV; and (e) suggested steps toward globally accepted programs. © 2015 Family Process Institute.

  9. Statistical methodology: II. Reliability and validity assessment in study design, Part B.

    PubMed

    Karras, D J

    1997-02-01

    Validity measures the correspondence between a test and other purported measures of the same or similar qualities. When a reference standard exists, a criterion-based validity coefficient can be calculated. If no such standard is available, the concepts of content and construct validity may be used, but quantitative analysis may not be possible. The Pearson and Spearman tests of correlation are often used to assess the correspondence between tests, but do not account for measurement biases and may yield misleading results. Techniques that measure intertest differences may be more meaningful in validity assessment, and the kappa statistic is useful for analyzing categorical variables. Questionnaires often can be designed to allow quantitative assessment of reliability and validity, although this may be difficult. Inclusion of homogeneous questions is necessary to assess reliability. Analysis is enhanced by using Likert scales or similar techniques that yield ordinal data. Validity assessment of questionnaires requires careful definition of the scope of the test and comparison with previously validated tools.

  10. The Effects of a Peer-Delivered Social Skills Intervention for Adults with Comorbid Down Syndrome and Autism Spectrum Disorder.

    PubMed

    Davis, Matthew A Cody; Spriggs, Amy; Rodgers, Alexis; Campbell, Jonathan

    2018-06-01

    Deficits in social skills are often exhibited in individuals with comorbid Down syndrome (DS) and autism spectrum disorder (ASD), and there is a paucity of research to help guide intervention for this population. In the present study, a multiple probe study across behaviors, replicated across participants, assessed the effectiveness of peer-delivered simultaneous prompting in teaching social skills to adults with DS-ASD, using visual analysis techniques and Tau-U statistics to measure effect. Peer-mediators with DS and intellectual disability (ID) delivered simultaneous prompting sessions reliably (i.e., > 80% reliability) to teach social skills to adults with ID and a dual diagnosis of DS-ASD, with small (weighted Tau = .55, 90% CI [.29, .82]) to medium effects (weighted Tau = .75, 90% CI [.44, 1]). Statistical and visual analysis findings suggest a promising social skills intervention for individuals with DS-ASD as well as reliable delivery of simultaneous prompting procedures by individuals with DS.

  11. A new principle for the standardization of long paragraphs for reading speed analysis.

    PubMed

    Radner, Wolfgang; Radner, Stephan; Diendorfer, Gabriela

    2016-01-01

    To investigate the reliability, validity, and statistical comparability of long paragraphs that were developed to be equivalent in construction and difficulty. Seven long paragraphs were developed that were equal in syntax, morphology, and number and position of words (111), with the same number of syllables (179) and number of characters (660). For validity analyses, the paragraphs were compared with the mean reading speed of a set of seven sentence optotypes of the RADNER Reading Charts (mean of 7 × 14 = 98 words read). Reliability analyses were performed by calculating the Cronbach's alpha value and the corrected total item correlation. Sixty participants (aged 20-77 years) read the paragraphs and the sentences (distance 40 cm; font: Times New Roman 12 pt). Test items were presented randomly; reading time was measured with a stopwatch. Reliability analysis yielded a Cronbach's alpha value of 0.988. When the long paragraphs were compared in pairwise fashion, significant differences were found in 13 of the 21 pairs (p < 0.05). In two sequences of three paragraphs each and in eight pairs of paragraphs, the paragraphs did not differ significantly, and these paragraph combinations are therefore suitable for comparative research studies. The mean reading speed was 173.34 ± 24.01 words per minute (wpm) for the long paragraphs and 198.26 ± 28.60 wpm for the sentence optotypes. The maximum difference in reading speed was 5.55% for the long paragraphs and 2.95% for the short sentence optotypes. The correlation between long paragraphs and sentence optotypes was high (r = 0.9243). Despite good reliability and equivalence in construction and degree of difficulty, a statistically significant difference in reading speed can occur between long paragraphs. Since statistical significance should depend only on the persons tested, either standardizing long paragraphs for statistical equality of reading speed measurements or increasing the number of presented paragraphs is recommended for comparative investigations.

  12. A Quantitative Assessment of Utility Reporting Practices for Reporting Electric Power Distribution Events

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamachi La Commare, Kristina

    Metrics for reliability, such as the frequency and duration of power interruptions, have been reported by electric utilities for many years. This study examines current utility practices for collecting and reporting electricity reliability information and discusses challenges that arise in assessing reliability because of differences among these practices. The study is based on reliability information for year 2006 reported by 123 utilities in 37 states representing over 60 percent of total U.S. electricity sales. We quantify the effects that inconsistencies among current utility reporting practices have on comparisons of System Average Interruption Duration Index (SAIDI) and System Average Interruption Frequency Index (SAIFI) reported by utilities. We recommend immediate adoption of IEEE Std. 1366-2003 as a consistent method for measuring and reporting reliability statistics.
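
    The two indices compared in the study are defined in IEEE Std. 1366 roughly as follows: SAIFI is total customer interruptions divided by customers served, and SAIDI is total customer-minutes of interruption divided by customers served. A small worked example with invented event data:

    ```python
    # Each event: (customers_interrupted, outage_duration_minutes).
    # The events and system size below are illustrative, not from the study.
    events = [(1200, 90), (300, 45), (5000, 15), (800, 240)]
    customers_served = 100_000

    saifi = sum(c for c, _ in events) / customers_served
    saidi = sum(c * d for c, d in events) / customers_served
    caidi = saidi / saifi   # average restoration time per interruption

    print(f"SAIFI = {saifi:.3f} interruptions/customer/yr")
    print(f"SAIDI = {saidi:.1f} minutes/customer/yr")
    print(f"CAIDI = {caidi:.1f} minutes/interruption")
    ```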

  13. A General Reliability Model for Ni-BaTiO3-Based Multilayer Ceramic Capacitors

    NASA Technical Reports Server (NTRS)

    Liu, Donhang

    2014-01-01

    The evaluation of multilayer ceramic capacitors (MLCCs) with Ni electrode and BaTiO3 dielectric material for potential space project applications requires an in-depth understanding of their reliability. A general reliability model for Ni-BaTiO3 MLCCs is developed and discussed. The model consists of three parts: a statistical distribution; an acceleration function that describes how a capacitor's reliability life responds to external stresses; and an empirical function that defines the contribution of the structural and constructional characteristics of a multilayer capacitor device, such as the number of dielectric layers N, dielectric thickness d, average grain size, and capacitor chip size A. Application examples are also discussed based on the proposed reliability model for Ni-BaTiO3 MLCCs.
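
    For context, the acceleration-function part of such MLCC models is often written in a Prokopowicz-Vaskas-style form combining a voltage power law with an Arrhenius temperature term. The exponent and activation energy below are illustrative placeholders, not the calibrated values of this paper:

    ```python
    import math

    K_B = 8.617e-5  # Boltzmann constant, eV/K

    def acceleration_factor(v_use, v_test, t_use_k, t_test_k, n=3.0, ea=1.1):
        """Prokopowicz-Vaskas-style voltage/temperature acceleration factor.
        n (voltage exponent) and ea (activation energy, eV) are illustrative."""
        volt = (v_test / v_use) ** n
        temp = math.exp((ea / K_B) * (1.0 / t_use_k - 1.0 / t_test_k))
        return volt * temp

    # Illustrative stress test: 2x rated voltage at 125 C vs. use at 50 C.
    af = acceleration_factor(v_use=6.3, v_test=12.6, t_use_k=323.0, t_test_k=398.0)
    print(f"acceleration factor ~ {af:.0f}")
    ```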

  14. A General Reliability Model for Ni-BaTiO3-Based Multilayer Ceramic Capacitors

    NASA Technical Reports Server (NTRS)

    Liu, Donhang

    2014-01-01

    The evaluation of multilayer ceramic capacitors (MLCCs) with Ni electrode and BaTiO3 dielectric material for potential space project applications requires an in-depth understanding of the MLCCs' reliability. A general reliability model for Ni-BaTiO3 MLCCs is developed and discussed in this paper. The model consists of three parts: a statistical distribution; an acceleration function that describes how a capacitor's reliability life responds to external stresses; and an empirical function that defines the contribution of the structural and constructional characteristics of a multilayer capacitor device, such as the number of dielectric layers N, dielectric thickness d, average grain size r, and capacitor chip size A. Application examples are also discussed based on the proposed reliability model for Ni-BaTiO3 MLCCs.

  15. MEMS reliability: The challenge and the promise

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, W.M.; Tanner, D.M.; Miller, S.L.

    1998-05-01

    MicroElectroMechanical Systems (MEMS) that think, sense, act and communicate will open up a broad new array of cost effective solutions only if they prove to be sufficiently reliable. A valid reliability assessment of MEMS has three prerequisites: (1) statistical significance; (2) a technique for accelerating fundamental failure mechanisms, and (3) valid physical models to allow prediction of failures during actual use. These already exist for the microelectronics portion of such integrated systems. The challenge lies in the less well understood micromachine portions and its synergistic effects with microelectronics. This paper presents a methodology addressing these prerequisites and a description of the underlying physics of reliability for micromachines.

  16. Comparing Interrater reliability between eye examination and eye self-examination

    PubMed Central

    de Lima, Maria Alzete; Pagliuca, Lorita Marlena Freitag; do Nascimento, Jennara Cândido; Caetano, Joselany Áfio

    2017-01-01

    Objective: to compare inter-rater reliability between two eye assessment methods. Method: quasi-experimental study conducted with 324 college students, including eye self-examination and eye assessment performed by the researchers in a public university. The kappa coefficient was used to verify agreement. Results: inter-rater reliability coefficients ranged from 0.85 to 0.95, with statistical significance at 0.05. The exams to check for near acuity and peripheral vision presented a reasonable kappa >0.2. The remaining coefficients were higher, ranging from very reliable to totally reliable. Conclusion: comparatively, the results of both methods were similar. The virtual manual on eye self-examination can be used to screen for eye conditions. PMID:29069269

  17. IMPLEMENTATION AND VALIDATION OF STATISTICAL TESTS IN RESEARCH'S SOFTWARE HELPING DATA COLLECTION AND PROTOCOLS ANALYSIS IN SURGERY.

    PubMed

    Kuretzki, Carlos Henrique; Campos, Antônio Carlos Ligocki; Malafaia, Osvaldo; Soares, Sandramara Scandelari Kusano de Paula; Tenório, Sérgio Bernardo; Timi, Jorge Rufino Ribas

    2016-03-01

    The use of information technology is often applied in healthcare. With regard to scientific research, the SINPE(c) - Integrated Electronic Protocols was created as a tool to support researchers, offering clinical data standardization. Until then, however, SINPE(c) lacked statistical tests obtained by automatic analysis. The aim was to add to SINPE(c) features for automatic execution of the main statistical methods used in medicine. The study was divided into four topics: checking the interest of users in the implementation of the tests; surveying the frequency of their use in healthcare; carrying out the implementation; and validating the results with researchers and their protocols. It was applied in a group of users of this software working on their theses in stricto sensu master's and doctorate degrees in one postgraduate program in surgery. To assess the reliability of the statistics, the data obtained automatically by SINPE(c) were compared with those computed manually by a statistician experienced in this type of study. There was interest in the use of automatic statistical tests, with good acceptance. The chi-square, Mann-Whitney, Fisher and Student's t tests were considered to be frequently used by participants in medical studies. These methods were implemented and thereafter approved as expected. The automatic statistical analysis incorporated into SINPE(c) was shown to be reliable and equal to that done manually, validating its use as a tool for medical research.

  18. Relating design and environmental variables to reliability

    NASA Astrophysics Data System (ADS)

    Kolarik, William J.; Landers, Thomas L.

    The combination of space application and nuclear power source demands high reliability hardware. The possibilities of failure, either an inability to provide power or a catastrophic accident, must be minimized. Nuclear power experiences on the ground have led to highly sophisticated probabilistic risk assessment procedures, most of which require quantitative information to adequately assess such risks. In the area of hardware risk analysis, reliability information plays a key role. One of the lessons learned from the Three Mile Island experience is that thorough analyses of critical components are essential. Nuclear grade equipment shows some reliability advantages over commercial, but no statistically significant difference has been found. A recent study pertaining to spacecraft electronics reliability examined some 2,500 malfunctions on more than 300 aircraft. The study classified the equipment failures into seven general categories. Design deficiencies and lack of environmental protection accounted for about half of all failures. Within each class, limited reliability modeling was performed using a Weibull failure model.

  19. Statistics on Blindness in the Model Reporting Area 1969-1970.

    ERIC Educational Resources Information Center

    Kahn, Harold A.; Moorhead, Helen B.

    Presented in the form of 30 tables are statistics on blindness in 16 states which have agreed to uniform definitions and procedures to improve reliability of data regarding blind persons. The data indicates that rates of blindness were generally higher for nonwhites than for whites with the ratio ranging from almost 10 for glaucoma to minimal for…

  20. The Impact of Outliers on Cronbach's Coefficient Alpha Estimate of Reliability: Visual Analogue Scales

    ERIC Educational Resources Information Center

    Liu, Yan; Zumbo, Bruno D.

    2007-01-01

    The impact of outliers on Cronbach's coefficient [alpha] has not been documented in the psychometric or statistical literature. This is an important gap because coefficient [alpha] is the most widely used measurement statistic in all of the social, educational, and health sciences. The impact of outliers on coefficient [alpha] is investigated for…

  1. The Probability of Obtaining Two Statistically Different Test Scores as a Test Index

    ERIC Educational Resources Information Center

    Muller, Jorg M.

    2006-01-01

    A new test index is defined as the probability of obtaining two randomly selected test scores (PDTS) as statistically different. After giving a concept definition of the test index, two simulation studies are presented. The first analyzes the influence of the distribution of test scores, test reliability, and sample size on PDTS within classical…

  2. Length and Rate of Individual Participation in Various Activities on Recreation Sites and Areas

    Treesearch

    Gary L. Tyre; George A. James

    1971-01-01

    While statistically reliable methods exist for estimating recreation use on large areas, they often prove prohibitively expensive. Inexpensive alternatives involving the length and rate of individual participation in specific activities are presented, together with data and statistics on the recreational use of three large areas on the National Forests. This...

  3. Districts Take Action to Stem Violence Aimed at Teachers

    ERIC Educational Resources Information Center

    Honawar, Vaishali

    2008-01-01

    Experts caution that reliable and up-to-date statistics on student violence against teachers can be hard to acquire. National and district data, however, show a drop in such violence over the past decade. The National Center for Education Statistics' 2007 school crime and safety report, the only known source for such data nationwide, says the…

  4. Reliability of the ADI-R for the Single Case-Part II: Clinical versus Statistical Significance

    ERIC Educational Resources Information Center

    Cicchetti, Domenic V.; Lord, Catherine; Koenig, Kathy; Klin, Ami; Volkmar, Fred R.

    2014-01-01

    In an earlier investigation, the authors assessed the reliability of the ADI-R when multiple clinicians evaluated a single case, here a female 3 year old toddler suspected of having an autism spectrum disorder (Cicchetti et al. in "J Autism Dev Disord" 38:764-770, 2008). Applying the clinical criteria of Cicchetti and Sparrow ("Am J…

  5. GTest: a software tool for graphical assessment of empirical distributions' Gaussianity.

    PubMed

    Barca, E; Bruno, E; Bruno, D E; Passarella, G

    2016-03-01

    In the present paper, the novel software GTest is introduced, designed for testing the normality of a user-specified empirical distribution. It has been implemented with two unusual characteristics: the first is the user option of selecting four different versions of the normality test, each suited to a specific dataset or goal, and the second is the inferential paradigm that informs the output of such tests: it is basically graphical and intrinsically self-explanatory. The concept of inference-by-eye is an emerging inferential approach which should find successful application in the near future, given the growing need to widen the audience of users of statistical methods to people with informal statistical skills. For instance, the latest European regulations concerning environmental issues introduced strict protocols for data handling (data quality assurance, outlier detection, etc.) and information exchange (areal statistics, trend detection, etc.) between regional and central environmental agencies. Therefore, more and more frequently, laboratory and field technicians will be requested to utilize complex software applications for subjecting data coming from monitoring, surveying, or laboratory activities to specific statistical analyses. Unfortunately, inferential statistics, which actually influence the decisional processes for the correct management of environmental resources, are often implemented in a way which expresses outcomes in numerical form with brief comments in strict statistical jargon (degrees of freedom, level of significance, accepted/rejected H0, etc.). The interpretation of such outcomes is therefore often difficult for people with little statistical knowledge. In this framework, the paradigm of visual inference can help fill the gap, providing outcomes in self-explanatory graphical form with a brief comment in common language. The difficulties experienced by colleagues, and their requests for an effective tool to address them, motivated us to adopt the inference-by-eye paradigm and to implement an easy-to-use, quick, and reliable statistical tool. GTest visualizes its outcomes as a modified version of the Q-Q plot. The application has been developed in Visual Basic for Applications (VBA) within MS Excel 2010, which proved to have all the robustness and reliability needed. GTest provides true graphical normality tests which are as reliable as any quantitative statistical approach but much easier to understand. The Q-Q plots have been integrated with the outlining of an acceptance region around the representation of the theoretical distribution, defined in accordance with the alpha level of significance and the data sample size. The test decision rule is the following: if the empirical scatterplot falls completely within the acceptance region, then it can be concluded that the empirical distribution fits the theoretical one at the given alpha level. A comprehensive case study has been carried out with simulated and real-world data in order to check the robustness and reliability of the software.

  6. Evaluation of wind field statistics near and inside clouds using a coherent Doppler lidar

    NASA Astrophysics Data System (ADS)

    Lottman, Brian Todd

    1998-09-01

    This work proposes advanced techniques for measuring the spatial wind field statistics near and inside clouds using a vertically pointing solid state coherent Doppler lidar on a fixed ground based platform. The coherent Doppler lidar is an ideal instrument for high spatial and temporal resolution velocity estimates. The basic parameters of lidar are discussed, including a complete statistical description of the Doppler lidar signal. This description is extended to cases with simple functional forms for aerosol backscatter and velocity. An estimate for the mean velocity over a sensing volume is produced by estimating the mean spectra. There are many traditional spectral estimators, which are useful for conditions with slowly varying velocity and backscatter. A new class of estimators (termed 'novel' estimators) is introduced that produces reliable velocity estimates for conditions with large variations in aerosol backscatter and velocity with range, such as cloud conditions. Performance of traditional and novel estimators is computed for a variety of deterministic atmospheric conditions using computer simulated data. Wind field statistics are produced from actual data for a cloud deck and for multi-layer clouds. Unique results include detection of possible spectral signatures for rain, estimates for the structure function inside a cloud deck, reliable velocity estimation techniques near and inside thin clouds, and estimates for simple wind field statistics between cloud layers.

  7. Malaria remains a military medical problem.

    PubMed

    World, M J

    2001-10-01

    To bring military medical problems concerning malaria to the attention of the Defence Medical Services. Seven military medical problems related to malaria are illustrated by cases referred for secondary assessment over the past five years. Each is discussed in relation to published data. The cases of failure of various kinds of chemoprophylaxis, diagnosis and treatment of malaria may represent just a fraction of the magnitude of the overall problem but in the absence of reliable published military medical statistics concerning malaria cases, the situation is unclear. Present experience suggests there are a number of persisting problems affecting the military population in relation to malaria. Only publication of reliable statistics will define their magnitude. Interim remedies are proposed whose cost-effectiveness remains to be established.

  8. Pitfalls and important issues in testing reliability using intraclass correlation coefficients in orthopaedic research.

    PubMed

    Lee, Kyoung Min; Lee, Jaebong; Chung, Chin Youb; Ahn, Soyeon; Sung, Ki Hyuk; Kim, Tae Won; Lee, Hui Jong; Park, Moon Seok

    2012-06-01

    Intra-class correlation coefficients (ICCs) provide a statistical means of testing reliability. However, their interpretation is not well documented in the orthopedic field. The purpose of this study was to investigate the use of ICCs in the orthopedic literature and to demonstrate pitfalls regarding their use. First, orthopedic articles that used ICCs were retrieved from the Pubmed database, and journal demography, ICC models and concurrent statistics used were evaluated. Second, a reliability test was performed on three common physical examinations in cerebral palsy, namely, the Thomas test, the Staheli test, and popliteal angle measurement. Thirty patients were assessed by three orthopedic surgeons to explore the statistical methods for testing reliability. Third, the factors affecting the ICC values were examined by simulating data sets based on the physical examination data, where the ranges, slopes, and interobserver variability were modified. Of the 92 orthopedic articles identified, 58 articles (63%) did not clarify the ICC model used, and only 5 articles (5%) described all models, types, and measures. In reliability testing, although the popliteal angle showed a larger mean absolute difference than the Thomas test and the Staheli test, the ICC of the popliteal angle was higher, which was believed to be contrary to the context of measurement. In addition, the ICC values were affected by the model, type, and measures used. In the simulated data sets, the ICC showed higher values when the range of the data sets was larger, the slopes of the data sets were parallel, and the interobserver variability was smaller. Care should be taken when interpreting absolute ICC values, i.e., a higher ICC does not necessarily mean less variability, because ICC values can be affected by various factors. The authors recommend that researchers clarify the ICC models used and interpret ICC values in the context of measurement.
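
    The range effect reported above is easy to reproduce: holding rater error fixed, ICC(2,1) rises as between-subject variance grows. The sketch below implements the standard two-way ANOVA mean-squares formula for ICC(2,1) on simulated data; the sample sizes and variances are invented, not the study's.

    ```python
    import numpy as np

    def icc_2_1(x: np.ndarray) -> float:
        """Two-way random, absolute agreement, single-measures ICC(2,1).
        x: (n_subjects, k_raters)."""
        n, k = x.shape
        grand = x.mean()
        msr = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)
        msc = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)
        mse = ((x - x.mean(axis=1, keepdims=True)
                  - x.mean(axis=0, keepdims=True) + grand) ** 2).sum() / ((n - 1) * (k - 1))
        return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

    rng = np.random.default_rng(3)
    n, k = 30, 3
    noise = rng.normal(0, 5, (n, k))                  # identical rater error
    wide = rng.normal(0, 20, (n, 1)) + noise          # heterogeneous subjects
    narrow = rng.normal(0, 5, (n, 1)) + noise         # homogeneous subjects
    print(f"ICC, wide range:   {icc_2_1(wide):.2f}")   # high
    print(f"ICC, narrow range: {icc_2_1(narrow):.2f}") # much lower
    ```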

  9. Estimating liver cancer deaths in Thailand based on verbal autopsy study.

    PubMed

    Waeto, Salwa; Pipatjaturon, Nattakit; Tongkumchum, Phattrawan; Choonpradub, Chamnein; Saelim, Rattikan; Makaje, Nifatamah

    2014-01-01

    Liver cancer mortality is high in Thailand, but the utility of related vital statistics is limited because national vital registration (VR) data are underreported for specific causes of death. Accurate methodologies and reliable supplementary data are needed to provide worthy national vital statistics. This study aimed to model liver cancer deaths based on a verbal autopsy (VA) study in 2005 to provide more accurate estimates of liver cancer deaths than those reported. The results were used to estimate the number of liver cancer deaths during 2000-2009. A verbal autopsy was carried out in 2005 based on a sample of 9,644 deaths from nine provinces; it provided reliable information on causes of death by gender, age group, location of death in or outside hospital, and cause of death recorded in the VR database. Logistic regression was used to model liver cancer deaths and other variables. The estimated probabilities from the model were applied to liver cancer deaths in the VR database, 2000-2009. Thus, more accurate VA-estimated numbers of liver cancer deaths were obtained. The model fits the data quite well, with sensitivity 0.64. The confidence intervals from the statistical model provide the estimates and their precision. The VA-estimated numbers of liver cancer deaths were higher than the corresponding VR numbers, with inflation factors of 1.56 for males and 1.64 for females. The statistical methods used in this study can be applied to available mortality data in developing countries where national vital registration data are of low quality and reliable supplementary data are available.

  10. Reliability of Autism-Tics, AD/HD, and other Comorbidities (A-TAC) inventory in a test-retest design.

    PubMed

    Larson, Tomas; Kerekes, Nóra; Selinus, Eva Norén; Lichtenstein, Paul; Gumpert, Clara Hellner; Anckarsäter, Henrik; Nilsson, Thomas; Lundström, Sebastian

    2014-02-01

    The Autism-Tics, AD/HD, and other Comorbidities (A-TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A-TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A-TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's kappa. A-TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A-TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's kappa indicated acceptable reliability. The current study provides statistical evidence that the A-TAC yields good test-retest reliability in a population-based cohort of children.

  11. The inter and intra rater reliability of the Netball Movement Screening Tool.

    PubMed

    Reid, Duncan A; Vanweerd, Rebecca J; Larmer, Peter J; Kingstone, Rachel

    2015-05-01

    To establish the inter- and intra-rater reliability of the Netball Movement Screening Tool for screening adolescent female netball players. Inter- and intra-rater reliability study. Forty secondary school netball players were recruited to take part in the study. Twenty subjects were screened simultaneously and independently by two raters to ascertain inter-rater agreement. Twenty subjects were scored by rater one on two occasions, separated by a week, to ascertain intra-rater agreement. Inter- and intra-rater agreement was assessed utilising the two-way mixed intraclass correlation coefficient and weighted kappa statistics. No significant demographic differences were found between the inter- and intra-rater groups of subjects. Intraclass correlation coefficients demonstrated excellent inter-rater (two-way mixed intraclass correlation coefficient 0.84, standard error of measurement 0.25) and intra-rater (two-way mixed intraclass correlation coefficient 0.96, standard error of measurement 0.13) reliability for the overall Netball Movement Screening Tool score, and substantial-to-excellent inter-rater (two-way mixed intraclass correlation coefficients 1.0-0.65) and substantial-to-excellent intra-rater (two-way mixed intraclass correlation coefficients 0.96-0.79) reliability for the component scores of the Netball Movement Screening Tool. The kappa statistic showed substantial to poor inter-rater (k=0.75-0.32) and intra-rater (k=0.77-0.27) agreement for individual tests of the NMST. The Netball Movement Screening Tool may be a reliable screening tool for adolescent netball players; however, the individual test scores have low reliability. The screening tool can be administered reliably by raters with similar levels of training in the tool but variable clinical experience. On-going research needs to be undertaken to ascertain whether the Netball Movement Screening Tool is a valid tool for ascertaining increased injury risk for netball players. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  12. Reconciling Streamflow Uncertainty Estimation and River Bed Morphology Dynamics. Insights from a Probabilistic Assessment of Streamflow Uncertainties Using a Reliability Diagram

    NASA Astrophysics Data System (ADS)

    Morlot, T.; Mathevet, T.; Perret, C.; Favre Pugin, A. C.

    2014-12-01

    Streamflow uncertainty estimation has recently received considerable attention in the literature. A dynamic rating curve assessment method has been introduced (Morlot et al., 2014). This dynamic method makes it possible to compute a rating curve for each gauging and a continuous streamflow time-series, while calculating streamflow uncertainties. Streamflow uncertainty takes into account many sources of uncertainty (water level, rating curve interpolation and extrapolation, gauging aging, etc.) and produces an estimated distribution of streamflow for each day. In order to characterise streamflow uncertainty, a probabilistic framework has been applied to a large sample of hydrometric stations of the Division Technique Générale (DTG) of Électricité de France (EDF) hydrometric network (>250 stations) in France. A reliability diagram (Wilks, 1995) has been constructed for some stations, based on the streamflow distribution estimated for a given day and compared to a real streamflow observation estimated via a gauging. To build a reliability diagram, we computed the probability of an observed streamflow (gauging), given the streamflow distribution. The reliability diagram then allows us to check that the distribution of probabilities of non-exceedance of the gaugings follows a uniform law (i.e., quantiles should be equiprobable). Given the shape of the reliability diagram, the probabilistic calibration is characterised (underdispersion, overdispersion, bias) (Thyer et al., 2009). In this paper, we present case studies where reliability diagrams have different statistical properties for different periods. Compared to our knowledge of the river bed morphology dynamics of these hydrometric stations, we show how the reliability diagram gives us invaluable information on river bed movements, such as continuous digging or backfilling of the hydraulic control due to erosion or sedimentation processes. Hence, careful analysis of reliability diagrams allows us to reconcile statistics and long-term river bed morphology processes. This knowledge improves our real-time management of hydrometric stations, given a better characterisation of erosion/sedimentation processes and the stability of each hydrometric station's hydraulic control.

  13. [Reliability and validity analysis of simplified Chinese version of QOL questionnaire of olfactory disorders].

    PubMed

    Jin, X F; Wang, J; Li, Y J; Liu, J F; Ni, D F

    2016-09-20

    Objective: To cross-culturally translate the questionnaire of olfactory disorders (QOD) into a simplified Chinese version, and to evaluate its reliability and validity in clinical practice. Method: The simplified Chinese version of the QOD was evaluated for test-retest reliability, split-half reliability and internal consistency. Its validity was then evaluated, including content validity, criterion-related validity and responsiveness. Criterion-related validity was assessed using the Medical Outcome Study 36-item Short Form Health Survey (SF-36) and the World Health Organization Quality of Life-Brief (WHOQOL-BREF) for comparison. Result: A total of 239 patients with olfactory dysfunction were enrolled and tested, of whom 195 completed all three surveys (QOD, SF-36, WHOQOL-BREF). The test-retest reliabilities of the QOD-parosmia statements (QOD-P), QOD-quality of life (QOD-QoL), and QOD-visual analogue scale (QOD-VAS) sections were 0.799 (P<0.01), 0.781 (P<0.01), and 0.488 (P<0.01), respectively, and the Cronbach's α coefficients were 0.477, 0.812, and 0.889, respectively. The split-half reliability of the QOD-QoL was 0.89. There was no correlation between the QOD-P section and the SF-36, but there were statistically significant correlations between the QOD-QoL and QOD-VAS sections and the SF-36. There was no correlation between the QOD-P section and the WHOQOL-BREF, but there were statistically significant correlations between the QOD-QoL and QOD-VAS sections and the WHOQOL-BREF in most sections. Conclusion: The simplified Chinese version of the QOD was shown to be a reliable and valid questionnaire for evaluating patients with olfactory dysfunction living in mainland China. The QOD-P section needs further modification to properly adapt to patients with a Chinese cultural and knowledge background. Copyright© by the Editorial Department of Journal of Clinical Otorhinolaryngology Head and Neck Surgery.
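
    A minimal sketch of the internal-consistency statistics reported above, Cronbach's alpha and split-half reliability with the Spearman-Brown correction; the item matrix is simulated, and the item count is an arbitrary choice for illustration.

      import numpy as np

      def cronbach_alpha(items):
          """items: (n_respondents, k_items) array of item scores."""
          k = items.shape[1]
          item_vars = items.var(axis=0, ddof=1).sum()
          total_var = items.sum(axis=1).var(ddof=1)
          return k / (k - 1) * (1 - item_vars / total_var)

      # Hypothetical subscale: 239 respondents, 10 items sharing a common factor.
      rng = np.random.default_rng(3)
      trait = rng.normal(0, 1, (239, 1))
      items = trait + rng.normal(0, 1.0, (239, 10))
      print(f"alpha = {cronbach_alpha(items):.2f}")

      # Split-half reliability (odd/even split) with the Spearman-Brown step-up.
      half1, half2 = items[:, ::2].sum(axis=1), items[:, 1::2].sum(axis=1)
      r = np.corrcoef(half1, half2)[0, 1]
      print(f"split-half = {2 * r / (1 + r):.2f}")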

  14. Power-up: A Reanalysis of 'Power Failure' in Neuroscience Using Mixture Modeling.

    PubMed

    Nord, Camilla L; Valton, Vincent; Wood, John; Roiser, Jonathan P

    2017-08-23

    Recently, evidence for endemically low statistical power has cast neuroscience findings into doubt. If low statistical power plagues neuroscience, then this reduces confidence in the reported effects. However, if statistical power is not uniformly low, then such blanket mistrust might not be warranted. Here, we provide a different perspective on this issue, analyzing data from an influential study reporting a median power of 21% across 49 meta-analyses (Button et al., 2013). We demonstrate, using Gaussian mixture modeling, that the sample of 730 studies included in that analysis comprises several subcomponents, so the use of a single summary statistic is insufficient to characterize the nature of the distribution. We find that statistical power is extremely low for studies included in meta-analyses that reported a null result and that it varies substantially across subfields of neuroscience, with particularly low power in candidate gene association studies. Therefore, whereas power in neuroscience remains a critical issue, the notion that studies are systematically underpowered is not the full story: low power is far from a universal problem. SIGNIFICANCE STATEMENT Recently, researchers across the biomedical and psychological sciences have become concerned with the reliability of results. One marker for reliability is statistical power: the probability of finding a statistically significant result given that the effect exists. Previous evidence suggests that statistical power is low across the field of neuroscience. Our results present a more comprehensive picture of statistical power in neuroscience: on average, studies are indeed underpowered, some very seriously so, but many studies show acceptable or even exemplary statistical power. We show that this heterogeneity in statistical power is common across most subfields in neuroscience. This new, more nuanced picture of statistical power in neuroscience could affect not only scientific understanding, but potentially policy and funding decisions for neuroscience research. Copyright © 2017 Nord, Valton et al.
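
    A minimal sketch of the mixture-modeling idea above: fit Gaussian mixtures with an increasing number of components to study-level power estimates and let an information criterion judge whether a single summary component is adequate. The power values are simulated stand-ins, not the 730 studies' estimates.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      # Hypothetical power estimates drawn from a low-power and a well-powered
      # subcomponent rather than a single homogeneous population.
      rng = np.random.default_rng(0)
      power = np.clip(np.concatenate([rng.normal(0.12, 0.05, 500),
                                      rng.normal(0.75, 0.15, 230)]), 0.01, 0.99)

      X = power.reshape(-1, 1)
      for k in (1, 2, 3):
          gm = GaussianMixture(n_components=k, random_state=0).fit(X)
          print(f"k={k}: BIC={gm.bic(X):.1f}, means={gm.means_.ravel().round(2)}")
      # A lower BIC for k >= 2 indicates that a single summary statistic (e.g.,
      # a median) hides distinct subpopulations.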

  15. Software Reliability 2002

    NASA Technical Reports Server (NTRS)

    Wallace, Dolores R.

    2003-01-01

    In FY01 we learned that hardware reliability models need substantial changes to account for differences in software, thus making software reliability measurements more effective, accurate, and easier to apply. These reliability models are generally based on familiar distributions or parametric methods. An obvious question is: 'What new statistical and probability models can be developed using non-parametric and distribution-free methods instead of the traditional parametric methods?' Two approaches to software reliability engineering appear somewhat promising. The first study, begun in FY01, is based on hardware reliability, a very well established science that has many aspects that can be applied to software. This research effort has investigated mathematical aspects of hardware reliability and has identified those applicable to software. Currently the research effort is applying and testing these approaches to software reliability measurement. These parametric models require much project data that may be difficult to apply and interpret. Projects at GSFC are often complex in both technology and schedules. Assessing and estimating reliability of the final system is extremely difficult when various subsystems are tested and completed long before others. Parametric and distribution-free techniques may offer a new and accurate way of modeling failure time and other project data to provide earlier and more accurate estimates of system reliability.

  16. Reliability assessments in qualitative health promotion research.

    PubMed

    Cook, Kay E

    2012-03-01

    This article contributes to the debate about the use of reliability assessments in qualitative research in general, and health promotion research in particular. In this article, I examine the use of reliability assessments in qualitative health promotion research in response to health promotion researchers' commonly held misconception that reliability assessments improve the rigor of qualitative research. All qualitative articles published in the journal Health Promotion International from 2003 to 2009 employing reliability assessments were examined. In total, 31.3% (20/64) of articles employed some form of reliability assessment. The use of reliability assessments increased over the study period, ranging from <20% in 2003/2004 to 50% and above in 2008/2009, while at the same time the total number of qualitative articles decreased. The articles were then classified into four types of reliability assessment: the verification of thematic codes, the use of inter-rater reliability statistics, congruence in team coding and congruence in coding across sites. The merits of each type were discussed, with the subsequent discussion focusing on the deductive nature of reliable thematic coding, the limited depth of immediately verifiable data and the usefulness of such studies to health promotion and the advancement of the qualitative paradigm.

  17. Statistically Controlling for Confounding Constructs Is Harder than You Think

    PubMed Central

    Westfall, Jacob; Yarkoni, Tal

    2016-01-01

    Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity. PMID:27031707
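
    A minimal Monte Carlo sketch of the phenomenon described above, under assumed parameter values (not the paper's): the outcome depends only on construct T, which is measured unreliably as X, yet the correlated construct B frequently shows a 'significant' incremental effect.

      import numpy as np
      from scipy import stats

      def sim_once(n=1000, rel=0.6, r_tb=0.5, rng=None):
          """One simulated study. A significant B coefficient in Y ~ X + B is a
          false incremental-validity claim, since B has no direct effect on Y."""
          T = rng.normal(size=n)
          B = r_tb * T + np.sqrt(1 - r_tb**2) * rng.normal(size=n)
          X = np.sqrt(rel) * T + np.sqrt(1 - rel) * rng.normal(size=n)  # reliability = rel
          Y = T + rng.normal(size=n)
          Z = np.column_stack([np.ones(n), X, B])
          beta, res, *_ = np.linalg.lstsq(Z, Y, rcond=None)
          sigma2 = res[0] / (n - 3)
          se_b = np.sqrt(sigma2 * np.linalg.inv(Z.T @ Z)[2, 2])
          p = 2 * stats.t.sf(abs(beta[2] / se_b), n - 3)
          return p < 0.05

      rng = np.random.default_rng(42)
      rate = np.mean([sim_once(rng=rng) for _ in range(2000)])
      print(f"Type I error rate: {rate:.2f}")   # far above the nominal 0.05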

  18. Statistical studies of animal response data from USF toxicity screening test method

    NASA Technical Reports Server (NTRS)

    Hilado, C. J.; Machado, A. M.

    1978-01-01

    Statistical examination of animal response data obtained using Procedure B of the USF toxicity screening test method indicates that the data deviate only slightly from a normal or Gaussian distribution. This slight departure from normality is not expected to invalidate conclusions based on theoretical statistics. Comparison of times to staggering, convulsions, collapse, and death as endpoints shows that time to death appears to be the most reliable endpoint because it offers the lowest probability of missed observations and premature judgements.

  19. Validation of the World Health Organization tool for situational analysis to assess emergency and essential surgical care at district hospitals in Ghana.

    PubMed

    Osen, Hayley; Chang, David; Choo, Shelly; Perry, Henry; Hesse, Afua; Abantanga, Francis; McCord, Colin; Chrouser, Kristin; Abdullah, Fizan

    2011-03-01

    The World Health Organization (WHO) Tool for Situational Analysis to Assess Emergency and Essential Surgical Care (hereafter called the WHO Tool) has been used in more than 25 countries and is the largest effort to assess surgical care in the world. However, it has not yet been independently validated. Test-retest reliability is one way to validate the degree to which test instruments are free from random error. The aim of the present field study was to determine the test-retest reliability of the WHO Tool. The WHO Tool was mailed to 10 district hospitals in Ghana. Written instructions were provided along with a letter from the Ghana Health Services requesting the hospital administrator to complete the survey tool. After ensuring delivery and completion of the forms, the study team readministered the WHO Tool at the time of an on-site visit less than 1 month later. The results of the two tests were compared to calculate kappa statistics for each of the 152 questions in the WHO Tool. The kappa statistic is a statistical measure of the degree of agreement above what would be expected based on chance alone. Ten hospitals were surveyed twice over a short interval (i.e., less than 1 month). Weighted and unweighted kappa statistics were calculated for 152 questions. The median unweighted kappa for the entire survey was 0.43 (interquartile range 0-0.84). The infrastructure section (24 questions) had a median kappa of 0.81; the human resources section (13 questions) had a median kappa of 0.77; the surgical procedures section (67 questions) had a median kappa of 0.00; and the emergency surgical equipment section (48 questions) had a median kappa of 0.81. Hospital capacity survey questions related to infrastructure characteristics had high reliability. However, questions related to process of care had poor reliability and may benefit from supplemental data gathered by direct observation. Limitations to the study include the small sample size: 10 district hospitals in a single country. Consistent and high correlations calculated from the field testing within the present analysis suggest that the WHO Tool for Situational Analysis is a reliable tool where it measures structure and setting, but it should be revised for measuring process of care.
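
    A minimal sketch of the per-question agreement statistic used above; the test-retest answers for the ten hospitals are hypothetical.

      from sklearn.metrics import cohen_kappa_score

      # Hypothetical yes/no answers (1 = available, 0 = not) to one survey
      # question, from the mailed administration and the on-site re-test.
      test1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
      test2 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
      print(f"unweighted kappa: {cohen_kappa_score(test1, test2):.2f}")
      # For ordinal items a weighted kappa penalises near-misses less:
      # cohen_kappa_score(test1, test2, weights="linear")

    Note that kappa can be near zero even with high raw agreement when almost every hospital gives the same answer, since the agreement expected by chance is then also high; this is one reason sections with skewed responses, such as the surgical procedures section, can show very low kappas.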

  20. NDE detectability of fatigue type cracks in high strength alloys

    NASA Technical Reports Server (NTRS)

    Christner, B. K.; Rummel, W. D.

    1983-01-01

    Specimens suitable for investigating the reliability of production nondestructive evaluation (NDE) to detect tightly closed fatigue cracks in high strength alloys representative of those materials used in spacecraft engine/booster construction were produced. Inconel 718 was selected as representative of nickel base alloys and Haynes 188 was selected as representative of cobalt base alloys used in this application. Cleaning procedures were developed to ensure the reusability of the test specimens, and a flaw detection reliability assessment of the fluorescent penetrant inspection method was performed using the test specimens produced, both to characterize their use for future reliability assessments and to provide additional NDE flaw detection reliability data for high strength alloys. The statistical analysis of the fluorescent penetrant inspection data was performed to determine the detection reliabilities for each inspection at a 90% probability/95% confidence level.

  1. Space flight risk data collection and analysis project: Risk and reliability database

    NASA Technical Reports Server (NTRS)

    1994-01-01

    The focus of the NASA 'Space Flight Risk Data Collection and Analysis' project was to acquire and evaluate space flight data with the express purpose of establishing a database containing measurements of specific risk assessment, reliability, availability, maintainability, and supportability (RRAMS) parameters. The developed comprehensive RRAMS database will support the performance of future NASA and aerospace industry risk and reliability studies. One of the primary goals has been to acquire unprocessed information relating to the reliability and availability of launch vehicles and the subsystems and components thereof from the 45th Space Wing (formerly the Eastern Space and Missile Center, ESMC) at Patrick Air Force Base. After evaluating and analyzing this information, it was encoded in terms of parameters pertinent to ascertaining reliability and availability statistics, and then assembled into an appropriate database structure.

  2. The reliability and reproducibility of cephalometric measurements: a comparison of conventional and digital methods

    PubMed Central

    AlBarakati, SF; Kula, KS; Ghoneima, AA

    2012-01-01

    Objective The aim of this study was to assess the reliability and reproducibility of angular and linear measurements of conventional and digital cephalometric methods. Methods A total of 13 landmarks and 16 skeletal and dental parameters were defined and measured on pre-treatment cephalometric radiographs of 30 patients. The conventional and digital tracings and measurements were performed twice by the same examiner with a 6 week interval between measurements. The reliability within each method was determined using Pearson's correlation coefficient (r2). The reproducibility between methods was calculated by paired t-test. The level of statistical significance was set at p < 0.05. Results All measurements for each method had r2 values above 0.90 (strong correlation) except maxillary length, which had a correlation of 0.82 for conventional tracing. Significant differences between the two methods were observed in most angular and linear measurements except for ANB angle (p = 0.5), angle of convexity (p = 0.09), anterior cranial base (p = 0.3) and the lower anterior facial height (p = 0.6). Conclusion In general, both methods of conventional and digital cephalometric analysis are highly reliable. Although the reproducibility of the two methods showed some statistically significant differences, most differences were not clinically significant. PMID:22184624

  3. The Impact of Statistical Adjustment on Conditional Standard Errors of Measurement in the Assessment of Physician Communication Skills

    ERIC Educational Resources Information Center

    Raymond, Mark R.; Clauser, Brian E.; Furman, Gail E.

    2010-01-01

    The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary…

  4. The influence of the Tribulus terrestris extract on the parameters of the functional preparedness and athletes' organism homeostasis.

    PubMed

    Milasius, K; Dadeliene, R; Skernevicius, Ju

    2009-01-01

    The influence of Tribulus terrestris extract on parameters of functional preparedness and athletes' organism homeostasis was investigated. A positive impact of the dietary supplement "Tribulus" (Optimum Nutrition, U.S.A.), taken as 1 capsule 3 times a day for 20 days, was established on athletes' physical power in various energy-producing zones: anaerobic alactic muscular power and anaerobic glycolytic power increased to a statistically reliable degree. After 20 days of consumption, Tribulus terrestris extract did not have an essential effect on erythrocyte, haemoglobin and thrombocyte indices. During the experimental period the percentage of granulocytes increased and the percentage of leucocytes decreased to a statistically significant degree, showing a negative impact of this food supplement on the leucocyte formula in athletes' blood. Creatine kinase concentration in athletes' blood increased to a statistically significant degree, and creatinine showed a tendency to decline over the 20-day period of consuming Tribulus terrestris extract. Declining tendencies in urea, cholesterol and bilirubin concentrations also appeared. Blood testosterone concentration increased to a statistically reliable degree during the first half (10 days) of the experiment; it did not increase further during the next 10 days despite continued consumption.

  5. Statistical Characterization of the Mechanical Parameters of Intact Rock Under Triaxial Compression: An Experimental Proof of the Jinping Marble

    NASA Astrophysics Data System (ADS)

    Jiang, Quan; Zhong, Shan; Cui, Jie; Feng, Xia-Ting; Song, Leibo

    2016-12-01

    We investigated the statistical characteristics and probability distributions of the mechanical parameters of natural rock using triaxial compression tests. Twenty cores of Jinping marble were tested at each of five levels of confining stress (5, 10, 20, 30, and 40 MPa). From these full stress-strain data, we summarized the numerical characteristics and determined the probability distribution forms of several important mechanical parameters, including deformational parameters, characteristic strengths, characteristic strains, and failure angle. The statistical evidence relating to the mechanical parameters of rock provided new information about the marble's probabilistic distribution characteristics: the normal and log-normal distributions were appropriate for describing random strengths of rock; the coefficients of variation of the peak strengths had no relationship to the confining stress; the only acceptable random distribution for both Young's modulus and Poisson's ratio was the log-normal function; and the cohesive strength had a different probability distribution pattern than the frictional angle. The triaxial tests and statistical analysis also provided experimental evidence for deciding the minimum reliable number of experimental samples and for picking appropriate parameter distributions for use in reliability calculations for rock engineering.
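
    A minimal sketch of the distribution-selection step described above, comparing normal and log-normal fits to a batch of simulated peak strengths; sample values and sample size are illustrative only.

      import numpy as np
      from scipy import stats

      # Hypothetical peak strengths (MPa) for 20 cores at one confining stress.
      rng = np.random.default_rng(5)
      strength = rng.lognormal(mean=np.log(160), sigma=0.08, size=20)

      for name, dist in [("normal", stats.norm), ("log-normal", stats.lognorm)]:
          params = dist.fit(strength)
          ks = stats.kstest(strength, dist.cdf, args=params)
          print(f"{name}: KS p-value = {ks.pvalue:.2f}")

      # Coefficient of variation of the peak strengths, as tabulated per
      # confining-stress level in the paper.
      print(f"CoV = {strength.std(ddof=1) / strength.mean():.3f}")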

  6. Estimating predictive hydrological uncertainty by dressing deterministic and ensemble forecasts; a comparison, with application to Meuse and Rhine

    NASA Astrophysics Data System (ADS)

    Verkade, J. S.; Brown, J. D.; Davids, F.; Reggiani, P.; Weerts, A. H.

    2017-12-01

    Two statistical post-processing approaches for estimation of predictive hydrological uncertainty are compared: (i) 'dressing' of a deterministic forecast by adding a single, combined estimate of both hydrological and meteorological uncertainty and (ii) 'dressing' of an ensemble streamflow forecast by adding an estimate of hydrological uncertainty to each individual streamflow ensemble member. Both approaches aim to produce an estimate of the 'total uncertainty' that captures both the meteorological and hydrological uncertainties. They differ in the degree to which they make use of statistical post-processing techniques. In the 'lumped' approach, both sources of uncertainty are lumped by post-processing deterministic forecasts using their verifying observations. In the 'source-specific' approach, the meteorological uncertainties are estimated by an ensemble of weather forecasts. These ensemble members are routed through a hydrological model and a realization of the probability distribution of hydrological uncertainties (only) is then added to each ensemble member to arrive at an estimate of the total uncertainty. The techniques are applied to one location in the Meuse basin and three locations in the Rhine basin. Resulting forecasts are assessed for their reliability and sharpness, as well as compared in terms of multiple verification scores including the relative mean error, Brier Skill Score, Mean Continuous Ranked Probability Skill Score, Relative Operating Characteristic Score and Relative Economic Value. The dressed deterministic forecasts are generally more reliable than the dressed ensemble forecasts, but the latter are sharper. On balance, however, they show similar quality across a range of verification metrics, with the dressed ensembles coming out slightly better. Some additional analyses are suggested. Notably, these include statistical post-processing of the meteorological forecasts in order to increase their reliability, thus increasing the reliability of the streamflow forecasts produced with ensemble meteorological forcings.

  7. The Alcohol Use Disorder and Associated Disabilities Interview Schedule-IV (AUDADIS-IV): Reliability of New Psychiatric Diagnostic Modules and Risk Factors in a General Population Sample

    PubMed Central

    Ruan, W. June; Goldstein, Risë B.; Chou, S. Patricia; Smith, Sharon M.; Saha, Tulshi D.; Pickering, Roger P.; Dawson, Deborah A.; Huang, Boji; Stinson, Frederick S.; Grant, Bridget F.

    2008-01-01

    This study presents test-retest reliability statistics and information on internal consistency for the new diagnostic modules and risk factor measures for alcohol, drug, and psychiatric disorders in the Alcohol Use Disorder and Associated Disabilities Interview Schedule-IV (AUDADIS-IV). Test-retest statistics were derived from a random sample of 1,899 adults selected from 34,653 respondents who participated in the 2004–2005 Wave 2 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). Internal consistency of continuous scales was assessed using the entire Wave 2 NESARC. Both test and retest interviews were conducted face-to-face. Test-retest and internal consistency results for diagnoses and symptom scales associated with posttraumatic stress disorder, attention-deficit/hyperactivity disorder, and borderline, narcissistic, and schizotypal personality disorders were predominantly good (kappa > 0.63; ICC > 0.69; alpha > 0.75), and reliability for risk factor measures fell within the good to excellent range (intraclass correlations = 0.50–0.94; alpha = 0.64–0.90). The high degree of reliability found in this study suggests that the new AUDADIS-IV diagnostic measures can be useful tools in research settings. The availability of highly reliable measures of risk factors for alcohol, drug, and psychiatric disorders will contribute to the validity of conclusions drawn from future research in the domains of substance use disorder and psychiatric epidemiology. PMID:17706375

  8. Reliability and mode of failure of bonded monolithic and multilayer ceramics.

    PubMed

    Alessandretti, Rodrigo; Borba, Marcia; Benetti, Paula; Corazza, Pedro Henrique; Ribeiro, Raissa; Della Bona, Alvaro

    2017-02-01

    To evaluate the reliability of monolithic and multilayer ceramic structures used in the CAD-on technique (Ivoclar), and the mode of failure produced in ceramic structures bonded to a dentin analog material (NEMA-G10). Ceramic specimens were fabricated as follows (n=30): CAD-on, trilayer structure (IPS e.max ZirCAD/IPS e.max Crystall./Connect/IPS e.max CAD); YLD, bilayer structure (IPS e.max ZirCAD/IPS e.max Ceram); LDC, monolithic structure (IPS e.max CAD); and YZW, monolithic structure (Zenostar Zr Translucent). All ceramic specimens were bonded to G10 and subjected to compressive load in 37°C distilled water until the sound of the first crack, monitored acoustically. Failure load (Lf) values were recorded (N) and statistically analyzed using Weibull distribution, Kruskal-Wallis test, and Student-Newman-Keuls test (α=0.05). Lf values of CAD-on and YZW structures were statistically similar (p=0.917), but higher than those of YLD and LDC (p<0.01). Weibull modulus (m) values were statistically similar for all experimental groups. Monolithic structures (LDC and YZW) failed from radial cracks. Failures in the CAD-on and YLD groups showed, predominantly, both radial and cone cracks. Monolithic zirconia (YZW) and CAD-on structures showed similar failure resistance and reliability, but a different fracture behavior. Copyright © 2016 The Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.
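
    A minimal sketch of a Weibull analysis of failure loads like the one above, using the common linearised median-rank fit; the loads are simulated and the shape/scale values are arbitrary.

      import numpy as np

      # Hypothetical failure loads (N) for one group of n = 30 specimens.
      rng = np.random.default_rng(2)
      lf = np.sort(rng.weibull(8.0, 30) * 1200.0)

      # Linearised two-parameter Weibull fit: ln(-ln(1 - F)) = m*ln(L) - m*ln(L0),
      # with F estimated by median ranks F_i = (i - 0.3) / (n + 0.4).
      n = lf.size
      F = (np.arange(1, n + 1) - 0.3) / (n + 0.4)
      slope, intercept = np.polyfit(np.log(lf), np.log(-np.log(1 - F)), 1)
      print(f"Weibull modulus m = {slope:.1f}")
      print(f"characteristic load L0 = {np.exp(-intercept / slope):.0f} N")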

  9. Touch Precision Modulates Visual Bias.

    PubMed

    Misceo, Giovanni F; Jones, Maurice D

    2018-01-01

    The sensory precision hypothesis holds that different seen and felt cues about the size of an object resolve themselves in favor of the more reliable modality. To examine this precision hypothesis, 60 college students were asked to look at one size while manually exploring another unseen size either with their bare fingers or, to lessen the reliability of touch, with their fingers sleeved in rigid tubes. Afterwards, the participants estimated either the seen size or the felt size by finding a match from a visual display of various sizes. Results showed that the seen size biased the estimates of the felt size when the reliability of touch decreased. This finding supports the interaction between touch reliability and visual bias predicted by statistically optimal models of sensory integration.

  10. The Infeasibility of Quantifying the Reliability of Life-Critical Real-Time Software

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Finelli, George B.

    1991-01-01

    This paper affirms that the quantification of life-critical software reliability is infeasible using statistical methods, whether applied to standard software or fault-tolerant software. The classical methods of estimating reliability are shown to lead to exorbitant amounts of testing when applied to life-critical software. Reliability growth models are examined and also shown to be incapable of overcoming the need for excessive amounts of testing. The key assumption of software fault tolerance, that separately programmed versions fail independently, is shown to be problematic. This assumption cannot be justified by experimentation in the ultrareliability region, and subjective arguments in its favor are not sufficiently strong to justify it as an axiom. Also, the implications of the recent multiversion software experiments support this affirmation.

  11. [A retrospective study on the assessment of dysphagia after partial laryngectomy].

    PubMed

    Su, T T; Sun, Z F

    2017-11-07

    Objective: To retrospectively investigate the long-term swallowing function of patients with laryngeal carcinoma who underwent partial laryngectomy, to discuss the effectiveness and reliability of the Kubota drinking test in the assessment of dysphagia after partial laryngectomy, and to analyze the influence of different surgical techniques on swallowing function. Methods: Clinical data were retrospectively analyzed for 83 patients with laryngeal carcinoma who underwent partial laryngectomy between September 2012 and August 2015. A questionnaire survey, the Kubota drinking test and a video fluoroscopic swallowing study (VFSS) were conducted for patients during a scheduled interview. Patients were grouped in two ways: by whether the epiglottis was retained, and by whether one or both arytenoids were preserved. The influence of the different surgical techniques on swallowing function was analyzed according to the results of the Kubota drinking test. The agreement and reliability of the Kubota drinking test were statistically analyzed with VFSS treated as the gold standard. SPSS 23.0 software was used to analyze the data. Results: Questionnaire results revealed that among the 83 patients who underwent partial laryngectomy, 32.53% suffered from eating disorder and 43.37% experienced painful swallowing. The incidence of dysphagia was 40.96% according to the results of the Kubota drinking test. There was a statistically significant difference between the group with the epiglottis retained and the group with the epiglottis removed in terms of the absence and severity of dysphagia; the statistical values for normal, moderate and severe dysphagia were 18.160, 7.229 and 12.344, respectively (P<0.05). A statistically significant difference also existed between the groups with one and with both arytenoids preserved in terms of the absence of dysphagia and dysphagia of intermediate severity; the statistical values were 4.790 and 9.110 (P<0.05). A certain degree of agreement and reliability was present between the results of the Kubota drinking test and VFSS (kappa=0.551, r=0.810). Conclusions: Preserving the epiglottis and arytenoids is of considerable significance for the retention of swallowing function in patients after partial laryngectomy. There is a certain degree of agreement and reliability between the results of the Kubota drinking test and VFSS. The test, therefore, could be used as a tool for screening patients suffering from dysphagia after partial laryngectomy.

  12. A Unified Statistical Rain-Attenuation Model for Communication Link Fade Predictions and Optimal Stochastic Fade Control Design Using a Location-Dependent Rain-Statistic Database

    NASA Technical Reports Server (NTRS)

    Manning, Robert M.

    1990-01-01

    A static and dynamic rain-attenuation model is presented which describes the statistics of attenuation on an arbitrarily specified satellite link for any location for which there are long-term rainfall statistics. The model may be used in the design of optimal stochastic control algorithms to mitigate the effects of attenuation and maintain link reliability. A rain-statistics database is compiled, which makes it possible to apply the model to any location in the continental U.S. with a resolution of 0.5 degrees in latitude and longitude. The model predictions are compared with experimental observations, showing good agreement.

  13. Probability of detection of internal voids in structural ceramics using microfocus radiography

    NASA Technical Reports Server (NTRS)

    Baaklini, G. Y.; Roth, D. J.

    1986-01-01

    The reliability of microfocus X-radiography for detecting subsurface voids in structural ceramic test specimens was statistically evaluated. The microfocus system was operated in the projection mode using low X-ray photon energies (20 keV) and a 10 μm focal spot. The statistics were developed for implanted subsurface voids in green and sintered silicon carbide and silicon nitride test specimens. These statistics were compared with previously-obtained statistics for implanted surface voids in similar specimens. Problems associated with void implantation are discussed. Statistical results are given as probability-of-detection curves at a 95 percent confidence level for voids ranging in size from 20 to 528 μm in diameter.

  14. Probability of detection of internal voids in structural ceramics using microfocus radiography

    NASA Technical Reports Server (NTRS)

    Baaklini, G. Y.; Roth, D. J.

    1985-01-01

    The reliability of microfocus x-radiography for detecting subsurface voids in structural ceramic test specimens was statistically evaluated. The microfocus system was operated in the projection mode using low X-ray photon energies (20 keV) and a 10 μm focal spot. The statistics were developed for implanted subsurface voids in green and sintered silicon carbide and silicon nitride test specimens. These statistics were compared with previously-obtained statistics for implanted surface voids in similar specimens. Problems associated with void implantation are discussed. Statistical results are given as probability-of-detection curves at a 95 percent confidence level for voids ranging in size from 20 to 528 μm in diameter.
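
    A minimal sketch of the 90% probability/95% confidence criterion used in both records above, via the Clopper-Pearson lower confidence bound on the probability of detection.

      from scipy import stats

      def pod_lower_bound(d, n, conf=0.95):
          """One-sided lower confidence bound on POD after d detections in n
          trials (Clopper-Pearson): the (1 - conf) quantile of Beta(d, n - d + 1)."""
          return stats.beta.ppf(1 - conf, d, n - d + 1)

      # The classic "90/95" point: 29 consecutive detections with no misses.
      print(f"{pod_lower_bound(29, 29):.3f}")   # ~0.902 -> 90% POD at 95% confidence
      print(f"{pod_lower_bound(28, 29):.3f}")   # a single miss drops below 0.90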

  15. Interrater reliability of the mind map assessment rubric in a cohort of medical students.

    PubMed

    D'Antoni, Anthony V; Zipp, Genevieve Pinto; Olson, Valerie G

    2009-04-28

    Learning strategies are thinking tools that students can use to actively acquire information. Examples of learning strategies include mnemonics, charts, and maps. One strategy that may help students master the tsunami of information presented in medical school is the mind map learning strategy. Currently, there is no valid and reliable rubric to grade mind maps and this may contribute to their underutilization in medicine. Because concept maps and mind maps engage learners similarly at a metacognitive level, a valid and reliable concept map assessment scoring system was adapted to form the mind map assessment rubric (MMAR). The MMAR can assess mind map depth based upon concept-links, cross-links, hierarchies, examples, pictures, and colors. The purpose of this study was to examine interrater reliability of the MMAR. This exploratory study was conducted at a US medical school as part of a larger investigation on learning strategies. Sixty-six (N = 66) first-year medical students were given a 394-word text passage followed by a 30-minute presentation on mind mapping. After the presentation, subjects were again given the text passage and instructed to create mind maps based upon the passage. The mind maps were collected and independently scored using the MMAR by 3 examiners. Interrater reliability was measured using the intraclass correlation coefficient (ICC) statistic. Statistics were calculated using SPSS version 12.0 (Chicago, IL). Analysis of the mind maps revealed the following: concept-links ICC = .05 (95% CI, -.42 to .38), cross-links ICC = .58 (95% CI, .37 to .73), hierarchies ICC = .23 (95% CI, -.15 to .50), examples ICC = .53 (95% CI, .29 to .69), pictures ICC = .86 (95% CI, .79 to .91), colors ICC = .73 (95% CI, .59 to .82), and total score ICC = .86 (95% CI, .79 to .91). The high ICC value for total mind map score indicates strong MMAR interrater reliability. Pictures and colors demonstrated moderate to strong interrater reliability. We conclude that the MMAR may be a valid and reliable tool to assess mind maps in medicine. However, further research on the validity and reliability of the MMAR is necessary.

  16. The better way to determine the validity, reliability, objectivity and accuracy of measuring devices.

    PubMed

    Pazira, Parvin; Rostami Haji-Abadi, Mahdi; Zolaktaf, Vahid; Sabahi, Mohammadfarzan; Pazira, Toomaj

    2016-06-08

    In relation to statistical analysis, studies to determine the validity, reliability, objectivity and precision of new measuring devices are usually incomplete, due in part to using only correlation coefficients and ignoring data dispersion. The aim of this study was to demonstrate the best way to determine the validity, reliability, objectivity and accuracy of an electro-inclinometer or other measuring devices. Another purpose of this study was to answer the question of whether reliability and objectivity represent the accuracy of measuring devices. The validity of an electro-inclinometer was examined by mechanical and geometric methods. The objectivity and reliability of the device were assessed by calculating Cronbach's alpha for repeated measurements by three raters and for measurements on the same person by a mechanical goniometer and the electro-inclinometer. Measurements were performed on "hip flexion with the extended knee" and "shoulder abduction with the extended elbow." The raters measured every angle three times within an interval of two hours. A three-way ANOVA was used to determine accuracy. The results of the mechanical and geometric analysis showed that the validity of the electro-inclinometer was 1.00 and the level of error was less than one degree. The objectivity and reliability of the electro-inclinometer were 0.999, while the objectivity of the mechanical goniometer was in the range of 0.802 to 0.966 and its reliability was 0.760 to 0.961. For hip flexion, the difference between raters in joint angle measurement by electro-inclinometer and mechanical goniometer was 1.74 and 16.33 degrees (P<0.05), respectively. The differences for shoulder abduction measured by electro-inclinometer and goniometer were 0.35 and 4.40 degrees (P<0.05). Although both the objectivity and reliability were acceptable, the results showed that measurement error was very high for the mechanical goniometer. Therefore, it can be concluded that objectivity and reliability alone cannot determine the accuracy of a device, and it is preferable to use other statistical methods to compare and evaluate the accuracy of these two devices.

  17. Interrater reliability of the mind map assessment rubric in a cohort of medical students

    PubMed Central

    D'Antoni, Anthony V; Zipp, Genevieve Pinto; Olson, Valerie G

    2009-01-01

    Background Learning strategies are thinking tools that students can use to actively acquire information. Examples of learning strategies include mnemonics, charts, and maps. One strategy that may help students master the tsunami of information presented in medical school is the mind map learning strategy. Currently, there is no valid and reliable rubric to grade mind maps and this may contribute to their underutilization in medicine. Because concept maps and mind maps engage learners similarly at a metacognitive level, a valid and reliable concept map assessment scoring system was adapted to form the mind map assessment rubric (MMAR). The MMAR can assess mind map depth based upon concept-links, cross-links, hierarchies, examples, pictures, and colors. The purpose of this study was to examine interrater reliability of the MMAR. Methods This exploratory study was conducted at a US medical school as part of a larger investigation on learning strategies. Sixty-six (N = 66) first-year medical students were given a 394-word text passage followed by a 30-minute presentation on mind mapping. After the presentation, subjects were again given the text passage and instructed to create mind maps based upon the passage. The mind maps were collected and independently scored using the MMAR by 3 examiners. Interrater reliability was measured using the intraclass correlation coefficient (ICC) statistic. Statistics were calculated using SPSS version 12.0 (Chicago, IL). Results Analysis of the mind maps revealed the following: concept-links ICC = .05 (95% CI, -.42 to .38), cross-links ICC = .58 (95% CI, .37 to .73), hierarchies ICC = .23 (95% CI, -.15 to .50), examples ICC = .53 (95% CI, .29 to .69), pictures ICC = .86 (95% CI, .79 to .91), colors ICC = .73 (95% CI, .59 to .82), and total score ICC = .86 (95% CI, .79 to .91). Conclusion The high ICC value for total mind map score indicates strong MMAR interrater reliability. Pictures and colors demonstrated moderate to strong interrater reliability. We conclude that the MMAR may be a valid and reliable tool to assess mind maps in medicine. However, further research on the validity and reliability of the MMAR is necessary. PMID:19400964

  18. The application of the statistical theory of extreme values to gust-load problems

    NASA Technical Reports Server (NTRS)

    Press, Harry

    1950-01-01

    An analysis is presented which indicates that the statistical theory of extreme values is applicable to the problems of predicting the frequency of encountering the larger gust loads and gust velocities for both specific test conditions as well as commercial transport operations. The extreme-value theory provides an analytic form for the distributions of maximum values of gust load and velocity. Methods of fitting the distribution are given along with a method of estimating the reliability of the predictions. The theory of extreme values is applied to available load data from commercial transport operations. The results indicate that the estimates of the frequency of encountering the larger loads are more consistent with the data and more reliable than those obtained in previous analyses. (author)
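
    A minimal sketch of the extreme-value fitting described above, assuming the Gumbel form of the distribution of maxima and simulated block-maximum gust loads; the parameter values are arbitrary.

      from scipy import stats

      # Hypothetical maximum gust loads (in g) from 100 equal blocks of flight time.
      max_load = stats.gumbel_r.rvs(loc=0.8, scale=0.15, size=100, random_state=11)

      # Fit the extreme-value distribution, then extrapolate: the load exceeded
      # on average once per 1000 blocks is the 1 - 1/1000 quantile.
      loc, scale = stats.gumbel_r.fit(max_load)
      print(f"location = {loc:.2f} g, scale = {scale:.2f} g")
      print(f"1-in-1000-block load: {stats.gumbel_r.ppf(1 - 1/1000, loc, scale):.2f} g")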

  19. Effects of momentary self-monitoring on empowerment in a randomized controlled trial in patients with depression.

    PubMed

    Simons, C J P; Hartmann, J A; Kramer, I; Menne-Lothmann, C; Höhn, P; van Bemmel, A L; Myin-Germeys, I; Delespaul, P; van Os, J; Wichers, M

    2015-11-01

    Interventions based on the experience sampling method (ESM) are ideally suited to provide insight into personal, contextualized affective patterns in the flow of daily life. Recently, we showed that an ESM-intervention focusing on positive affect was associated with a decrease in symptoms in patients with depression. The aim of the present study was to examine whether the ESM-intervention increased patient empowerment. Depressed out-patients (n=102) receiving psychopharmacological treatment participated in a randomized controlled trial with three arms: (i) an experimental group receiving six weeks of ESM self-monitoring combined with weekly feedback sessions, (ii) a pseudo-experimental group participating in six weeks of ESM self-monitoring without feedback, and (iii) a control group (treatment as usual only). Patients were recruited in the Netherlands between January 2010 and February 2012. Self-report empowerment scores were obtained pre- and post-intervention. There was an effect of group×assessment period, indicating that the experimental (B=7.26, P=0.061, d=0.44, statistically imprecise) and pseudo-experimental group (B=11.19, P=0.003, d=0.76) increased more in reported empowerment compared to the control group. In the pseudo-experimental group, 29% of the participants showed a statistically reliable increase in empowerment score and 0% a reliable decrease, compared to 17% reliable increase and 21% reliable decrease in the control group. The experimental group showed 19% reliable increase and 4% reliable decrease. These findings tentatively suggest that self-monitoring to complement standard antidepressant treatment may increase patients' feelings of empowerment. Further research is necessary to investigate long-term empowering effects of self-monitoring in combination with person-tailored feedback. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
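
    A minimal sketch of the reliable-change classification used above, assuming the Jacobson-Truax Reliable Change Index; the scale SD, test-retest reliability and scores are hypothetical.

      import numpy as np

      def reliable_change_index(pre, post, sd_pre, r_xx):
          """RCI = change / S_diff, with S_diff = sqrt(2) * SEM and
          SEM = SD * sqrt(1 - reliability)."""
          s_diff = np.sqrt(2) * sd_pre * np.sqrt(1 - r_xx)
          return (post - pre) / s_diff

      # Hypothetical empowerment scale: SD 10, test-retest reliability 0.80.
      rci = reliable_change_index(pre=52.0, post=61.0, sd_pre=10.0, r_xx=0.80)
      print(f"RCI = {rci:.2f}")   # |RCI| > 1.96 counts as statistically reliable change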

  20. The challenge of mapping between two medical coding systems.

    PubMed

    Wojcik, Barbara E; Stein, Catherine R; Devore, Raymond B; Hassell, L Harrison

    2006-11-01

    Deployable medical systems patient conditions (PCs) designate groups of patients with similar medical conditions and, therefore, similar treatment requirements. PCs are used by the U.S. military to estimate the field medical resources needed in combat operations. Information associated with each of the 389 PCs is based on subject matter expert opinion rather than direct derivation from standard medical codes. Currently, no mechanism exists to tie current or historical medical data to PCs. Our study objective was to determine whether reliable conversion between PC codes and International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) diagnosis codes is possible. Data were analyzed for three professional coders assigning all applicable ICD-9-CM diagnosis codes to each PC code. Inter-rater reliability was measured using Cohen's kappa statistic and percent agreement. Methods were developed to calculate kappa statistics when multiple responses could be selected from many possible categories. Overall, we found moderate support for the possibility of reliable conversion between PCs and ICD-9-CM diagnoses (mean kappa = 0.61). Current PCs should be modified into a system that is verifiable with real data.

  1. Statistical issues in reporting quality data: small samples and casemix variation.

    PubMed

    Zaslavsky, A M

    2001-12-01

    To present two key statistical issues that arise in analysis and reporting of quality data. Casemix variation is relevant to quality reporting when the units being measured have differing distributions of patient characteristics that also affect the quality outcome. When this is the case, adjustment using stratification or regression may be appropriate. Such adjustments may be controversial when the patient characteristic does not have an obvious relationship to the outcome. Stratified reporting poses problems for sample size and reporting format, but may be useful when casemix effects vary across units. Although there are no absolute standards of reliability, high reliabilities (interunit F ≥ 10 or reliability ≥ 0.9) are desirable for distinguishing above- and below-average units. When small or unequal sample sizes complicate reporting, precision may be improved using indirect estimation techniques that incorporate auxiliary information, and 'shrinkage' estimation can help to summarize the strength of evidence about units with small samples. With broader understanding of casemix adjustment and methods for analyzing small samples, quality data can be analysed and reported more accurately.
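
    A minimal sketch of the 'shrinkage' idea described above, using the standard empirical-Bayes weighting in which the weight on a unit's own mean is its reliability; the variance components and scores are hypothetical. (For balanced data this weight relates to the interunit F statistic as reliability ≈ 1 - 1/F.)

      import numpy as np

      def shrink(unit_mean, unit_n, var_within, var_between, grand_mean):
          """Reliability-weighted compromise between a unit's own mean and the
          grand mean; units with small samples are pulled hardest to the middle."""
          reliability = var_between / (var_between + var_within / unit_n)
          return reliability * unit_mean + (1 - reliability) * grand_mean

      means = np.array([78.0, 78.0])      # same observed score...
      n = np.array([400, 12])             # ...but very different sample sizes
      print(shrink(means, n, var_within=225.0, var_between=9.0, grand_mean=70.0))
      # -> roughly [77.5, 72.6]: the small-sample unit is shrunk toward 70.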

  2. Validation of the Fatigue Impact Scale in Hungarian patients with multiple sclerosis.

    PubMed

    Losonczi, Erika; Bencsik, Krisztina; Rajda, Cecília; Lencsés, Gyula; Török, Margit; Vécsei, László

    2011-03-01

    Fatigue is one of the most frequent complaints of patients with multiple sclerosis (MS). The Fatigue Impact Scale (FIS), one of the 30 available fatigue questionnaires, is commonly applied because it evaluates multidimensional aspects of fatigue. The main purposes of this study were to test the validity, test-retest reliability, and internal consistency of the Hungarian version of the FIS. One hundred and eleven MS patients and 85 healthy control (HC) subjects completed the FIS and the Beck Depression Inventory, a large majority of them on two occasions, 3 months apart. The total FIS score and subscale scores differed significantly between the MS patients and the HC subjects in both FIS sessions. In the test-retest reliability assessment, the intraclass correlation coefficients were high in both the MS and HC groups. Cronbach's alpha values were also notably high. The results of this study indicate that the FIS can be regarded as a valid and reliable scale with which to improve our understanding of the impact of fatigue on the health-related quality of life in MS patients without severe disability.

  3. SCARE: A post-processor program to MSC/NASTRAN for the reliability analysis of structural ceramic components

    NASA Technical Reports Server (NTRS)

    Gyekenyesi, J. P.

    1985-01-01

    A computer program was developed for calculating the statistical fast fracture reliability and failure probability of ceramic components. The program includes the two-parameter Weibull material fracture strength distribution model, using the principle of independent action for polyaxial stress states and Batdorf's shear-sensitive as well as shear-insensitive crack theories, all for volume distributed flaws in macroscopically isotropic solids. Both penny-shaped cracks and Griffith cracks are included in the Batdorf shear-sensitive crack response calculations, using Griffith's maximum tensile stress or critical coplanar strain energy release rate criteria to predict mixed mode fracture. Weibull material parameters can also be calculated from modulus of rupture bar tests, using the least squares method with known specimen geometry and fracture data. The reliability prediction analysis uses MSC/NASTRAN stress, temperature and volume output, obtained from the use of three-dimensional, quadratic, isoparametric, or axisymmetric finite elements. The statistical fast fracture theories employed, along with selected input and output formats and options, are summarized. An example problem to demonstrate various features of the program is included.

  4. Reliability demonstration test for load-sharing systems with exponential and Weibull components

    PubMed Central

    Hu, Qingpei; Yu, Dan; Xie, Min

    2017-01-01

    Conducting a Reliability Demonstration Test (RDT) is a crucial step in production. Products are tested under certain schemes to demonstrate whether their reliability indices reach pre-specified thresholds. Test schemes for RDT have been studied in different situations, e.g., lifetime testing, degradation testing and accelerated testing. Systems designed with several structures are also investigated in many RDT plans. Despite the availability of a range of test plans for different systems, RDT planning for load-sharing systems hasn’t yet received the attention it deserves. In this paper, we propose a demonstration method for two specific types of load-sharing systems with components subject to two distributions: exponential and Weibull. Based on the assumptions and interpretations made in several previous works on such load-sharing systems, we set the mean time to failure (MTTF) of the total system as the demonstration target. We represent the MTTF as a summation of mean time between successive component failures. Next, we introduce generalized test statistics for both the underlying distributions. Finally, RDT plans for the two types of systems are established on the basis of these test statistics. PMID:29284030

  5. Reliability demonstration test for load-sharing systems with exponential and Weibull components.

    PubMed

    Xu, Jianyu; Hu, Qingpei; Yu, Dan; Xie, Min

    2017-01-01

    Conducting a Reliability Demonstration Test (RDT) is a crucial step in production. Products are tested under certain schemes to demonstrate whether their reliability indices reach pre-specified thresholds. Test schemes for RDT have been studied in different situations, e.g., lifetime testing, degradation testing and accelerated testing. Systems designed with several structures are also investigated in many RDT plans. Despite the availability of a range of test plans for different systems, RDT planning for load-sharing systems hasn't yet received the attention it deserves. In this paper, we propose a demonstration method for two specific types of load-sharing systems with components subject to two distributions: exponential and Weibull. Based on the assumptions and interpretations made in several previous works on such load-sharing systems, we set the mean time to failure (MTTF) of the total system as the demonstration target. We represent the MTTF as a summation of mean time between successive component failures. Next, we introduce generalized test statistics for both the underlying distributions. Finally, RDT plans for the two types of systems are established on the basis of these test statistics.
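
    A minimal sketch of the MTTF-as-a-sum representation described above, under a simple equal-load-sharing model with exponential components whose failure rate is proportional to the load each one carries; this rate model is an illustrative assumption, not the paper's specification.

      def mttf_load_sharing(n, total_load, rate_per_unit_load):
          """System of n exponential components sharing a constant total load.
          With k survivors each carries total_load/k and fails at rate lam_k,
          so the stage with k survivors lasts 1/(k * lam_k) on average; the MTTF
          is the sum of expected stage durations (the system dies at k = 0)."""
          mttf = 0.0
          for k in range(n, 0, -1):
              lam_k = rate_per_unit_load * (total_load / k)  # per-component rate
              mttf += 1.0 / (k * lam_k)                      # expected stage length
          return mttf

      # With this linear rate model every stage has total rate total_load * c,
      # so MTTF = n / (total_load * c):
      print(mttf_load_sharing(n=4, total_load=2.0, rate_per_unit_load=0.5))  # 4.0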

  6. Proceedings of the Biennial EO/EEO Research Symposium (2nd) Held in Cocoa Beach, Florida on December 2-4, 1997

    DTIC Science & Technology

    1998-04-01

    ...1 presents the descriptive statistics, reliabilities (in parentheses), and correlations for the data. The reliabilities for sexism and racism are... Opportunity Management Institute. Niebuhr, R. E., Knouse, S. B., Dansby, M. R., & Niebuhr, K. E. (1996). The relationship between racism/sexism and... subjects to evaluate the relationships between discriminatory climates (sexism, racism) and both group cohesion and group performance. Structural equation...

  7. Reliability study of refractory gate gallium arsenide MESFETS

    NASA Technical Reports Server (NTRS)

    Yin, J. C. W.; Portnoy, W. M.

    1981-01-01

    Refractory gate MESFET's were fabricated as an alternative to aluminum gate devices, which have been found to be unreliable as RF power amplifiers. In order to determine the reliability of the new structures, statistics of failure and information about mechanisms of failure in refractory gate MESFET's are given. Test transistors were stressed under conditions of high temperature and forward gate current to enhance failure. Results of work at 150 C and 275 C are reported.

  8. Reliability study of refractory gate gallium arsenide MESFETS

    NASA Astrophysics Data System (ADS)

    Yin, J. C. W.; Portnoy, W. M.

    Refractory gate MESFET's were fabricated as an alternative to aluminum gate devices, which have been found to be unreliable as RF power amplifiers. In order to determine the reliability of the new structures, statistics of failure and information about mechanisms of failure in refractory gate MESFET's are given. Test transistors were stressed under conditions of high temperature and forward gate current to enhance failure. Results of work at 150 C and 275 C are reported.

  9. The Value of Certainty (Invited)

    NASA Astrophysics Data System (ADS)

    Barkstrom, B. R.

    2009-12-01

    It is clear that Earth science data are valued, in part, for their ability to provide some certainty about the past state of the Earth and about its probable future states. We can sharpen this notion by using seven categories of value:
    ● Warning Service, requiring latency of three hours or less, as well as uninterrupted service
    ● Information Service, requiring latency less than about two weeks, as well as uninterrupted service
    ● Process Information, requiring the ability to distinguish between alternative processes
    ● Short-term Statistics, requiring the ability to construct a reliable record of the statistics of a parameter for an interval of five years or less, e.g. crop insurance
    ● Mid-term Statistics, requiring the ability to construct a reliable record of the statistics of a parameter for an interval of twenty-five years or less, e.g. power plant siting
    ● Long-term Statistics, requiring the ability to construct a reliable record of the statistics of a parameter for an interval of a century or less, e.g. one-hundred-year flood planning
    ● Doomsday Statistics, requiring the ability to construct a reliable statistical record that is useful for reducing the impact of 'doomsday' scenarios
    While the first two of these categories place high value on having an uninterrupted flow of information, and the third places value on contributing to our understanding of physical processes, it is notable that the last four may be placed on a common footing by considering the ability of observations to reduce uncertainty. Quantitatively, we can often identify metrics for parameters of interest that are fairly simple. For example:
    ● Detection of change in the average value of a single parameter, such as global temperature
    ● Detection of a trend, whether linear or nonlinear, such as the trend in cloud forcing known as cloud feedback
    ● Detection of a change in extreme value statistics, such as flood frequency or drought severity
    For such quantities, we can quantify uncertainty in terms of the entropy, which is calculated by creating a set of discrete bins for the value and then using error estimates to assign probabilities, p_i, to each bin. The entropy, H, is simply H = Σ_i p_i log2(1/p_i). The value of a new set of observations is the information gain, I, which is I = H_prior - H_posterior. The probability distributions that appear in this calculation depend on rigorous evaluation of errors in the observations. While direct estimates of the monetary value of data that could be used in budget prioritizations may not capture the value of data to the scientific community, it appears that the information gain may be a useful start in providing a 'common currency' for evaluating projects that serve very different communities. In addition, from the standpoint of governmental accounting, it appears reasonable to assume that much of the expense for scientific data becomes a sunk cost shortly after operations begin, and that the real, long-term value is created by the effort scientists expend in creating the software that interprets the data and in the effort expended in calibration and validation. These efforts are the ones that directly contribute to the information gain that provides the value of these data.
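
    A minimal numeric sketch of the entropy and information-gain bookkeeping defined above; the bin probabilities are hypothetical.

      import numpy as np

      def entropy_bits(p):
          """H = sum_i p_i * log2(1 / p_i), ignoring empty bins."""
          p = np.asarray(p, dtype=float)
          p = p[p > 0]
          return float(np.sum(p * np.log2(1.0 / p)))

      # Prior knowledge spreads a parameter evenly over 4 bins; new observations
      # concentrate it mostly into one.
      prior = [0.25, 0.25, 0.25, 0.25]
      posterior = [0.05, 0.85, 0.05, 0.05]
      print(f"I = {entropy_bits(prior) - entropy_bits(posterior):.2f} bits")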

  10. Establishing Inter- and Intrarater Reliability for High-Stakes Testing Using Simulation.

    PubMed

    Kardong-Edgren, Suzan; Oermann, Marilyn H; Rizzolo, Mary Anne; Odom-Maryon, Tamara

    This article reports a method for developing standardized rater training to establish the inter- and intrarater reliability of a group of raters for high-stakes testing. Simulation is used increasingly for high-stakes testing, but little research has addressed how to establish inter- and intrarater reliability among raters. Eleven raters were trained using a standardized methodology. Raters scored 28 student videos over a six-week period. Raters then rescored all videos over a two-day period to establish both intra- and interrater reliability. One rater demonstrated poor intrarater reliability; a second rater failed all students. Kappa statistics improved from the moderate to the substantial agreement range with the exclusion of the two outlier raters' scores. There may be faculty who, for different reasons, should not be included in high-stakes testing evaluations. All faculty are content experts, but not all are expert evaluators.

  11. Test-retest reliability of an fMRI paradigm for studies of cardiovascular reactivity.

    PubMed

    Sheu, Lei K; Jennings, J Richard; Gianaros, Peter J

    2012-07-01

    We examined the reliability of measures of fMRI, subjective, and cardiovascular reactions to standardized versions of a Stroop color-word task and a multisource interference task. A sample of 14 men and 12 women (30-49 years old) completed the tasks on two occasions, separated by a median of 88 days. The reliability of fMRI BOLD signal changes in brain areas engaged by the tasks was moderate, and aggregating fMRI BOLD signal changes across the tasks improved test-retest reliability metrics. These metrics included voxel-wise intraclass correlation coefficients (ICCs) and overlap ratio statistics. Task-aggregated ratings of subjective arousal, valence, and control, as well as cardiovascular reactions evoked by the tasks showed ICCs of 0.57 to 0.87 (ps < .001), indicating moderate-to-strong reliability. These findings support using these tasks as a battery for fMRI studies of cardiovascular reactivity. Copyright © 2012 Society for Psychophysiological Research.

  12. Probability interpretations of intraclass reliabilities.

    PubMed

    Ellis, Jules L

    2013-11-20

    Research where many organizations are rated by different samples of individuals such as clients, patients, or employees frequently uses reliabilities computed from intraclass correlations. Consumers of statistical information, such as patients and policy makers, may not have sufficient background for deciding which levels of reliability are acceptable. It is shown that the reliability is related to various probabilities that may be easier to understand, for example, the proportion of organizations that will be classed significantly above (or below) the mean and the probability that an organization is classed correctly given that it is classed significantly above (or below) the mean. One can view these probabilities as the amount of information of the classification and the correctness of the classification. These probabilities have an inverse relationship: given a reliability, one can 'buy' correctness at the cost of informativeness and conversely. This article discusses how this can be used to make judgments about the required level of reliabilities. Copyright © 2013 John Wiley & Sons, Ltd.
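
    A Monte Carlo sketch of the kind of probabilities the abstract describes, under an assumed normal true-score model; this is an illustration of the idea, not Ellis's actual derivation, and all numbers are invented:

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    rho = 0.8                      # assumed reliability of the organization means
    n_orgs = 100_000

    # True organization effects and measurement error, scaled so that
    # reliability = var(true) / (var(true) + var(error)) = rho.
    true = rng.normal(0.0, np.sqrt(rho), n_orgs)
    obs = true + rng.normal(0.0, np.sqrt(1.0 - rho), n_orgs)

    # "Classed significantly above the mean": one-sided z-test on the observed
    # score, where the standard error is the error SD, sqrt(1 - rho).
    flagged = obs > 1.645 * np.sqrt(1.0 - rho)

    informativeness = flagged.mean()            # proportion of organizations flagged
    correctness = (true[flagged] > 0).mean()    # P(truly above mean | flagged)
    print(f"flagged: {informativeness:.3f}, correct among flagged: {correctness:.3f}")
    ```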

  13. AMOVA ["Accumulative Manifold Validation Analysis"]: An Advanced Statistical Methodology Designed to Measure and Test the Validity, Reliability, and Overall Efficacy of Inquiry-Based Psychometric Instruments

    ERIC Educational Resources Information Center

    Osler, James Edward, II

    2015-01-01

    This monograph provides an epistemological rationale for the Accumulative Manifold Validation Analysis [also referred to by the acronym "AMOVA"] statistical methodology designed to test psychometric instruments. This form of inquiry is a type of mathematical optimization in the discipline of linear stochastic modelling. AMOVA is an in-depth…

  14. Assessment of Reliable Change Using 95% Credible Intervals for the Differences in Proportions: A Statistical Analysis for Case-Study Methodology

    ERIC Educational Resources Information Center

    Unicomb, Rachael; Colyvas, Kim; Harrison, Elisabeth; Hewat, Sally

    2015-01-01

    Purpose: Case-study methodology studying change is often used in the field of speech-language pathology, but it can be criticized for not being statistically robust. Yet with the heterogeneous nature of many communication disorders, case studies allow clinicians and researchers to closely observe and report on change. Such information is valuable…

  15. Operation Reliability Assessment for Cutting Tools by Applying a Proportional Covariate Model to Condition Monitoring Information

    PubMed Central

    Cai, Gaigai; Chen, Xuefeng; Li, Bing; Chen, Baojia; He, Zhengjia

    2012-01-01

    The reliability of cutting tools is critical to machining precision and production efficiency. The conventional statistic-based reliability assessment method aims at providing a general and overall estimation of reliability for a large population of identical units under given and fixed conditions. However, it has limited effectiveness in depicting the operational characteristics of a cutting tool. To overcome this limitation, this paper proposes an approach to assess the operation reliability of cutting tools. A proportional covariate model is introduced to construct the relationship between operation reliability and condition monitoring information. The wavelet packet transform and an improved distance evaluation technique are used to extract sensitive features from vibration signals, and a covariate function is constructed based on the proportional covariate model. Ultimately, the failure rate function of the cutting tool being assessed is calculated using the baseline covariate function obtained from a small sample of historical data. Experimental results and a comparative study show that the proposed method is effective for assessing the operation reliability of cutting tools. PMID:23201980

  16. Time-to-pregnancy and pregnancy outcomes in a South African population

    PubMed Central

    2010-01-01

    Background Time-to-pregnancy (TTP) has never been studied in an African setting and there are no data on the rates of adverse pregnancy outcomes in South Africa. The study objectives were to measure TTP and the rates of adverse pregnancy outcomes in South Africa, and to determine the reliability of the questionnaire tool. Methods The study was cross-sectional and applied systematic stratified sampling to obtain a representative sample of reproductive age women for a South African population. Data on socio-demographic, work, health and reproductive variables were collected on 1121 women using a standardized questionnaire. A small number (n = 73) of randomly selected questionnaires was repeated to determine the reliability of the questionnaire. Data were described using simple summary statistics, while Kappa and intra-class correlation statistics were calculated for reliability. Results Of the 1121 women, 47 (4.2%) had never been pregnant. Mean gravidity was 2.3 while mean parity was 2.0. There were a total of 2467 pregnancies; most (87%) resulted in live births, 9.5% in spontaneous abortion and 2.2% in stillbirths. The proportion of planned pregnancies was 39% and the median TTP was 6 months. The reliability of the questionnaire for TTP data was good; 63% for all participants and 97% when censored at 14 months. Overall reliability of reporting adverse pregnancy outcomes was very high, ranging from 90 - 98% for most outcomes. Conclusion This is the first comprehensive population-based reproductive health study in South Africa to describe the biologic fertility of the population, and it provides rates for planned pregnancies and adverse pregnancy outcomes. The reliability of the study questionnaire was substantial, with most outcomes within a 70 - 100% reliability index. The study provides important public information for health practitioners and researchers in reproductive health. It also highlights the need for public health intervention programmes and epidemiological research on biologic fertility and adverse pregnancy outcomes in the population. PMID:20858279

  17. Psychometrics of the MHSIP Adult Consumer Survey.

    PubMed

    Jerrell, Jeanette M

    2006-10-01

    The reliability and validity of the Mental Health Statistics Improvement Program (MHSIP) Adult Consumer Survey were assessed in a statewide convenience sample of 459 persons with severe mental illness served through a public mental health system. Consistent with previous findings and the intent of its developers, three factors were identified that demonstrate good internal consistency, moderate test-retest reliability, and good convergent validity with consumer perceptions of other aspects of their care. The reliability and validity of the MHSIP Adult Consumer Survey documented in this study underscore its scientific and practical utility as an abbreviated tool for assessing access, quality and appropriateness, and outcome in mental health service systems.

  18. High reliability and high performance of 9xx-nm single emitter laser diodes

    NASA Astrophysics Data System (ADS)

    Bao, L.; Leisher, P.; Wang, J.; Devito, M.; Xu, D.; Grimshaw, M.; Dong, W.; Guan, X.; Zhang, S.; Bai, C.; Bai, J. G.; Wise, D.; Martinsen, R.

    2011-03-01

    Improved performance and reliability of 9xx nm single emitter laser diodes are presented. To date, over 15,000 hours of accelerated multi-cell lifetest reliability data have been collected, with drive currents from 14 A to 18 A and junction temperatures ranging from 60 °C to 110 °C. Out of 208 devices, 14 failures have been observed so far. Using established accelerated lifetest analysis techniques, the effects of temperature and power acceleration are assessed. The Mean Time to Failure (MTTF) is determined to be >30 years, with 90% statistical confidence, for a use condition of 10 W and a junction temperature of 353 K (80 °C).
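
    The abstract does not give the analysis details. The sketch below shows one standard way such accelerated-test numbers are combined: an Arrhenius acceleration factor plus a chi-square lower confidence bound on MTTF. The activation energy and cumulative device-hours are assumptions for illustration, not the paper's values:

    ```python
    import math
    from scipy.stats import chi2

    k_B = 8.617e-5                    # Boltzmann constant, eV/K
    E_a = 0.7                         # assumed activation energy, eV
    T_stress, T_use = 383.0, 353.0    # junction temperatures, K (110 C stress, 80 C use)

    # Arrhenius acceleration factor from stress to use temperature.
    AF = math.exp((E_a / k_B) * (1.0 / T_use - 1.0 / T_stress))

    device_hours = 3.0e6              # illustrative cumulative stressed device-hours
    failures = 14

    # One-sided lower confidence bound on MTTF for a time-censored test:
    # MTTF >= 2*T / chi2.ppf(CL, 2r + 2), then translated to use conditions via AF.
    mttf_stress = 2.0 * device_hours / chi2.ppf(0.90, 2 * failures + 2)
    mttf_use_years = mttf_stress * AF / 8766.0
    print(f"AF = {AF:.1f}, MTTF at use >= {mttf_use_years:.0f} years (90% confidence)")
    ```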

  19. Field reliability of Ricor microcoolers

    NASA Astrophysics Data System (ADS)

    Pundak, N.; Porat, Z.; Barak, M.; Zur, Y.; Pasternak, G.

    2009-05-01

    Over the past 25 years Ricor has fielded in excess of 50,000 Stirling cryocoolers, among which approximately 30,000 units are of the micro integral rotary-driven type. The statistical population of the fielded units is counted in the hundreds to thousands per application category. In contrast to MTTF values gathered and presented from standard reliability demonstration tests, where the failure of the weakest component dictates the end of product life, field reliability, where design and workmanship failures are counted and considered, is usually reported as the number of failures per million hours of operation. These values are important and relevant to the prediction of service capabilities and planning.

  20. Validation of Brief Multidimensional Spirituality/Religiousness Inventory (BMMRS) in Italian Adult Participants and in Participants with Medical Diseases.

    PubMed

    Vespa, Anna; Giulietti, Maria Velia; Spatuzzi, Roberta; Fabbietti, Paolo; Meloni, Cristina; Gattafoni, Pisana; Ottaviani, Marica

    2017-06-01

    This study aimed at assessing the reliability and construct validity of the Brief Multidimensional Measure of Religiousness/Spirituality (BMMRS) in an Italian sample of 353 participants: 58.9% affected by various diseases and 41.1% healthy subjects. Descriptive statistics of internal consistency reliability (Cronbach's coefficient) for the BMMRS revealed remarkable consistency and reliability across the scales (DSE, SpC, SC, CSC, VB, SPY-WELL), and good inter-class correlations (≥.70) indicated good stability of the measures over time. The BMMRS is a useful inventory for the evaluation of the principal spiritual dimensions.

  1. An Analysis of Operational Suitability for Test and Evaluation of Highly Reliable Systems

    DTIC Science & Technology

    1994-03-04

    Exposition," Journal of the American Statistical Association, 59: 353-375 (June 1964). 17. SYS 229, Test and Evaluation Management Coursebook, School of Systems...in hours, θ is the desired MTBCF in hours, R is the number of critical failures, and α is the P[type-I error] of the χ² statistic with 2*R+2...design of experiments (DOE) tables and the use of Bayesian statistics to increase the confidence level of the test results that will be obtained from
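
    The fragment appears to reference the standard chi-square relation for fixed-duration reliability demonstration tests. A sketch of that relation; the 500-hour example is invented:

    ```python
    from scipy.stats import chi2

    def required_test_hours(theta, r_allowed, alpha):
        """Fixed-duration demonstration test: test time needed to show
        MTBCF >= theta with confidence 1 - alpha, allowing r_allowed
        critical failures (chi-square with 2*R + 2 degrees of freedom)."""
        return theta * chi2.ppf(1.0 - alpha, 2 * r_allowed + 2) / 2.0

    # Example: demonstrate a 500-hour MTBCF at 80% confidence with at most 1 failure.
    print(f"{required_test_hours(500.0, 1, 0.20):.0f} test hours")
    ```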

  2. Reliability analysis of composite structures

    NASA Technical Reports Server (NTRS)

    Kan, Han-Pin

    1992-01-01

    A probabilistic static stress analysis methodology has been developed to estimate the reliability of a composite structure. Closed form stress analysis methods are the primary analytical tools used in this methodology. These structural mechanics methods are used to identify independent variables whose variations significantly affect the performance of the structure. Once these variables are identified, scatter in their values is evaluated and statistically characterized. The scatter in applied loads and the structural parameters are then fitted to appropriate probabilistic distribution functions. Numerical integration techniques are applied to compute the structural reliability. The predicted reliability accounts for scatter due to variability in material strength, applied load, fabrication and assembly processes. The influence of structural geometry and mode of failure are also considerations in the evaluation. Example problems are given to illustrate various levels of analytical complexity.
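
    The numerical-integration step the abstract describes can be illustrated with the classic stress-strength interference integral, R = P(strength > load). The distributions and parameters below are invented stand-ins, not the paper's data:

    ```python
    import numpy as np
    from scipy import stats

    # Stress-strength interference: R = integral of f_load(x) * P(strength > x) dx.
    load = stats.norm(loc=300.0, scale=30.0)           # applied stress, MPa (illustrative)
    strength = stats.weibull_min(c=10.0, scale=500.0)  # material strength, MPa (illustrative)

    x = np.linspace(0.0, 800.0, 20_001)
    reliability = np.trapz(load.pdf(x) * strength.sf(x), x)  # numerical integration
    print(f"R = {reliability:.6f}")
    ```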

  3. Agreement, the F-Measure, and Reliability in Information Retrieval

    PubMed Central

    Hripcsak, George; Rothschild, Adam S.

    2005-01-01

    Information retrieval studies that involve searching the Internet or marking phrases usually lack a well-defined number of negative cases. This prevents the use of traditional interrater reliability metrics like the κ statistic to assess the quality of expert-generated gold standards. Such studies often quantify system performance as precision, recall, and F-measure, or as agreement. It can be shown that the average F-measure among pairs of experts is numerically identical to the average positive specific agreement among experts and that κ approaches these measures as the number of negative cases grows large. Positive specific agreement—or the equivalent F-measure—may be an appropriate way to quantify interrater reliability and therefore to assess the reliability of a gold standard in these studies. PMID:15684123
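
    The claimed identity is easy to verify numerically: with a = both raters positive, b and c the discordant counts, F = 2a/(2a + b + c) regardless of the number of negatives. A minimal sketch with illustrative counts:

    ```python
    def f_measure(a, b, c):
        """F-measure when one rater is treated as the gold standard:
        a = both positive, b = only rater 1 positive, c = only rater 2 positive."""
        precision = a / (a + b)
        recall = a / (a + c)
        return 2 * precision * recall / (precision + recall)

    def positive_specific_agreement(a, b, c):
        """Proportion of specific positive agreement between two raters."""
        return 2 * a / (2 * a + b + c)

    # Illustrative counts; note that the number of negative cases never enters.
    a, b, c = 40, 10, 6
    assert abs(f_measure(a, b, c) - positive_specific_agreement(a, b, c)) < 1e-12
    print(f_measure(a, b, c))  # 0.8333...
    ```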

  4. General motor function assessment scale--reliability of a Norwegian version.

    PubMed

    Langhammer, Birgitta; Lindmark, Birgitta

    2014-01-01

    The General Motor Function assessment scale (GMF) measures activity-related dependence, pain and insecurity among older people in frail health. The aim of the present study was to translate the GMF into a Norwegian version (N-GMF) and establish its reliability and clinical feasibility. The procedure used in translating the GMF was a forward and backward process, followed by testing of a convenience sample of 30 frail elderly people. The intra-rater reliability tests were performed by three physiotherapists, and the inter-rater reliability test was done by the same three plus nine independent colleagues. The statistical analyses were performed with a pairwise analysis for intra- and inter-rater reliability, using Cronbach's α, Percentage Agreement (PA), Svensson's rank transformable method and Cohen's κ. The Cronbach's α coefficients for the different subscales of the N-GMF were 0.68 for Dependency, 0.73 for Pain and 0.75 for Insecurity. Intra-rater reliability: The variation in the PA for the total score was 40-70% in Dependence, 30-40% in Pain and 30-60% in Insecurity. The Relative Rank Variant (RV) indicated a modest individual bias and an augmented rank-order agreement coefficient r(a) of 0.96, 0.96 and 0.99, respectively. The variation in the κ statistics was 0.27-0.62 for Dependence, 0.17-0.35 for Pain and 0.13-0.47 for Insecurity. Inter-rater reliability: The PA between different testers in Dependence, Pain and Insecurity was 74%, 89% and 74%, respectively. The augmented rank-order agreement coefficients were: for Dependence, r(a) = 0.97; for Pain, r(a) = 0.99; and for Insecurity, r(a) = 0.99. The N-GMF is a fairly reliable instrument for use with frail elderly people, with intra-rater and inter-rater reliability moderate in Dependence and slight to fair in Pain and Insecurity. Its clinical usefulness was stressed in regard to its main focus, the frail elderly, and to communication within a multidisciplinary team. Implications for Rehabilitation: The Norwegian General Motor Function Assessment Scale (N-GMF) is a reliable instrument. The N-GMF is an instrument for screening and assessment of activity-related dependence, pain and insecurity in frail older people. The N-GMF may be used as a tool of communication in a multidisciplinary team.

  5. Chicanoizing Drug Abuse Programs

    ERIC Educational Resources Information Center

    Aron, William S.; And Others

    1974-01-01

    Focus is on the community of La Colonia in Oxnard, California because it has (1) a high concentration of Mexican Americans, (2) a high incidence of drug addiction, and (3) reliable statistical data available. (NQ)

  6. Tutorial on use of intraclass correlation coefficients for assessing intertest reliability and its application in functional near-infrared spectroscopy-based brain imaging

    NASA Astrophysics Data System (ADS)

    Li, Lin; Zeng, Li; Lin, Zi-Jing; Cazzell, Mary; Liu, Hanli

    2015-05-01

    Test-retest reliability of neuroimaging measurements is an important concern in the investigation of cognitive functions in the human brain. To date, intraclass correlation coefficients (ICCs), originally used in inter-rater reliability studies in behavioral sciences, have become commonly used metrics in reliability studies on neuroimaging and functional near-infrared spectroscopy (fNIRS). However, as there are six popular forms of ICC, an adequate and comprehensive understanding of ICCs will determine how appropriately one selects, uses, and interprets them in a reliability study. We first offer a brief review and tutorial on the statistical rationale of ICCs, including their underlying analysis of variance models and technical definitions, in the context of assessment on intertest reliability. Second, we provide general guidelines on the selection and interpretation of ICCs. Third, we illustrate the proposed approach by using an actual research study to assess intertest reliability of fNIRS-based, volumetric diffuse optical tomography of brain activities stimulated by a risk decision-making protocol. Last, special issues that may arise in reliability assessment using ICCs are discussed and solutions are suggested.
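
    As a concrete companion to the tutorial's subject, here is a sketch of one common form, ICC(2,1) (two-way random effects, absolute agreement, single measure), computed directly from the ANOVA mean squares; the data are invented:

    ```python
    import numpy as np

    def icc_2_1(ratings):
        """ICC(2,1): two-way random effects, absolute agreement, single measure.
        `ratings` is an (n targets x k raters/sessions) array."""
        ratings = np.asarray(ratings, dtype=float)
        n, k = ratings.shape
        grand = ratings.mean()
        ms_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum() / (n - 1)
        ms_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum() / (k - 1)
        resid = (ratings - ratings.mean(axis=1, keepdims=True)
                 - ratings.mean(axis=0, keepdims=True) + grand)
        ms_err = (resid ** 2).sum() / ((n - 1) * (k - 1))
        return (ms_rows - ms_err) / (
            ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

    # Illustrative test-retest data: 5 subjects measured on 2 occasions.
    data = [[9, 10], [6, 7], [8, 8], [4, 5], [7, 9]]
    print(f"ICC(2,1) = {icc_2_1(data):.3f}")
    ```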

  7. The psychometric testing of the diabetes health promotion self-care scale.

    PubMed

    Wang, Ruey-Hsia; Lin, Li-Ying; Cheng, Chung-Ping; Hsu, Min-Tao; Kao, Chia-Chan

    2012-06-01

    Health-promoting behavior is an important strategy to maintain and enhance the health of patients with Type 2 diabetes. Few instruments have been developed to measure the health promotion self-care behavior of patients with Type 2 diabetes. The aims were the development and psychometric testing of the Chinese version of the Diabetes Health Promotion Self-Care Scale (DHPSC) for patients with Type 2 diabetes. Four hundred and eighty-nine patients with Type 2 diabetes were recruited from endocrine clinics in four hospitals in Kaohsiung City in southern Taiwan. Exploratory and confirmatory factor analyses were used to assess the construct validity of the scale. Correlations between the DHPSC and the satisfaction subscale of Diabetes Quality of Life, Diabetes Empowerment Scale, and HbA1c were calculated to evaluate concurrent validity. Internal consistency and test-retest reliability were used to assess the reliability of the scale. The study was conducted in 2007 and 2008. A proposed second-order factor model with seven subscales and 26 items fit the data well. The seven subscales were interpersonal relationships, diet, blood glucose self-monitoring, personal health responsibility, exercise, adherence to the recommended regimens, and foot care. The DHPSC statistically significantly correlated with the satisfaction subscale of Diabetes Quality of Life and the Diabetes Empowerment Scale. HbA1c only statistically significantly correlated with the subscale of health responsibility. Reliability was supported by acceptable Cronbach's alpha (range, .78-.94) and test-retest reliability (range, .76-.95). The DHPSC has satisfactory reliability and validity. Healthcare providers can use the DHPSC to comprehensively assess the health promotion self-care behaviors of patients with Type 2 diabetes.

  8. Rasch Analysis of the General Self-Efficacy Scale in Workers with Traumatic Limb Injuries.

    PubMed

    Wu, Tzu-Yi; Yu, Wan-Hui; Huang, Chien-Yu; Hou, Wen-Hsuan; Hsieh, Ching-Lin

    2016-09-01

    Purpose The purpose of this study was to apply Rasch analysis to examine the unidimensionality and reliability of the General Self-Efficacy Scale (GSE) in workers with traumatic limb injuries. Furthermore, if the items of the GSE fitted the Rasch model's assumptions, we transformed the raw sum ordinal scores of the GSE into Rasch interval scores. Methods A total of 1076 participants completed the GSE at 1 month post injury. Rasch analysis was used to examine the unidimensionality and person reliability of the GSE. The unidimensionality of the GSE was verified by determining whether the items fit the Rasch model's assumptions: (1) item fit indices: infit and outfit mean square (MNSQ) ranged from 0.6 to 1.4; and (2) the eigenvalue of the first factor extracted from principal component analysis (PCA) for residuals was <2. Person reliability was calculated. Results The unidimensionality of the 10-item GSE was supported in terms of good item fit statistics (infit and outfit MNSQ ranging from 0.92 to 1.32) and acceptable eigenvalues (1.6) of the first factor of the PCA, with person reliability = 0.89. Consequently, the raw sum scores of the GSE were transformed into Rasch scores. Conclusions The results indicated that the items of GSE are unidimensional and have acceptable person reliability in workers with traumatic limb injuries. Additionally, the raw sum scores of the GSE can be transformed into Rasch interval scores for prospective users to quantify workers' levels of self-efficacy and to conduct further statistical analyses.
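
    The GSE is a polytomous scale, but the infit and outfit statistics named in the abstract are easiest to see in the dichotomous Rasch case. A sketch with invented data, computing infit/outfit mean squares as information-weighted and unweighted means of squared standardized residuals:

    ```python
    import numpy as np

    def rasch_fit_stats(responses, theta, b):
        """Infit/outfit mean-square statistics for dichotomous Rasch items.
        responses: (persons x items) 0/1 array; theta: person abilities;
        b: item difficulties (both in logits)."""
        responses = np.asarray(responses, dtype=float)
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))   # expected scores
        var = p * (1.0 - p)
        z2 = (responses - p) ** 2 / var                 # squared standardized residuals
        outfit = z2.mean(axis=0)                        # unweighted mean square
        infit = ((responses - p) ** 2).sum(axis=0) / var.sum(axis=0)  # info-weighted
        return infit, outfit

    # Toy data: 4 persons x 2 items, with hypothetical ability/difficulty estimates.
    responses = np.array([[0, 0], [1, 0], [1, 1], [1, 1]])
    theta = np.array([-1.0, 0.0, 1.0, 2.0])
    b = np.array([-0.5, 0.5])
    infit, outfit = rasch_fit_stats(responses, theta, b)
    print(infit, outfit)   # values near 1.0 indicate good fit to the Rasch model
    ```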

  9. Measurement of patient safety: a systematic review of the reliability and validity of adverse event detection with record review

    PubMed Central

    Hanskamp-Sebregts, Mirelle; Zegers, Marieke; Vincent, Charles; van Gurp, Petra J; de Vet, Henrica C W; Wollersheim, Hub

    2016-01-01

    Objectives Record review is the most used method to quantify patient safety. We systematically reviewed the reliability and validity of adverse event detection with record review. Design A systematic review of the literature. Methods We searched PubMed, EMBASE, CINAHL, PsycINFO and the Cochrane Library from their inception through February 2015. We included all studies that aimed to describe the reliability and/or validity of record review. Two reviewers conducted data extraction. We pooled kappa values (κ) and analysed the differences in subgroups according to number of reviewers, reviewer experience and training level, adjusted for the prevalence of adverse events. Results In 25 studies, the psychometric data of the Global Trigger Tool (GTT) and the Harvard Medical Practice Study (HMPS) were reported and 24 studies were included for statistical pooling. The inter-rater reliability of the GTT and HMPS showed a pooled κ of 0.65 and 0.55, respectively. The inter-rater agreement was statistically significantly higher when the group of reviewers within a study consisted of a maximum of five reviewers. We found no studies reporting on the validity of the GTT and HMPS. Conclusions The reliability of record review is moderate to substantial and improved when a small group of reviewers carried out record review. The validity of the record review method has never been evaluated, while clinical data registries, autopsy or direct observations of patient care are potential reference methods that can be used to test concurrent validity. PMID:27550650

  10. Adaptation and Assessment of Reliability and Validity of the Greek Version of the Ohkuma Questionnaire for Dysphagia Screening

    PubMed Central

    Papadopoulou, Soultana L.; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam

    2016-01-01

    Introduction The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way Anova test. Results According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way Anova test in the two measurement times showed statistically significant correlation in both measurements (p = 0.02 and p = 0.016). Conclusion The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients. PMID:28050209

  11. Adaptation and Assessment of Reliability and Validity of the Greek Version of the Ohkuma Questionnaire for Dysphagia Screening.

    PubMed

    Papadopoulou, Soultana L; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam

    2017-01-01

    Introduction The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way Anova test. Results According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way Anova test in the two measurement times showed statistically significant correlation in both measurements (p = 0.02 and p = 0.016). Conclusion The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients.

  12. System and Software Reliability (C103)

    NASA Technical Reports Server (NTRS)

    Wallace, Dolores

    2003-01-01

    Within the last decade better reliability models (hardware, software, system) than those currently used have been theorized and developed but not implemented in practice. Previous research on software reliability has shown that while some existing software reliability models are practical, they are not accurate enough. New paradigms of development (e.g., OO) have appeared and associated reliability models have been proposed but not investigated. Hardware models have been extensively investigated but not integrated into a system framework. System reliability modeling is the weakest of the three. NASA engineers need better methods and tools to demonstrate that the products meet NASA requirements for reliability measurement. For the new models for the software component of the last decade, there is a great need to bring them into a form in which they can be used on software intensive systems. The Statistical Modeling and Estimation of Reliability Functions for Systems (SMERFS'3) tool is an existing vehicle that may be used to incorporate these new modeling advances. Adapting some existing software reliability modeling changes to accommodate major changes in software development technology may also show substantial improvement in prediction accuracy. With some additional research, the next step is to identify and investigate system reliability. System reliability models could then be incorporated in a tool such as SMERFS'3. This tool with better models would add great value in assessing GSFC projects.

  13. A Modified Goodness-of-Fit Test for the Lognormal Distribution with Unknown Scale and Location Parameters.

    DTIC Science & Technology

    1986-12-01

    Force and other branches of the military are placing an increased emphasis on system reliability and maintainability. In studying current systems...used in the research of proposed systems, by predicting MTTF and MTTR of the new parts and thus predicting the reliability of those parts. The statistics...effectiveness of new systems. Aitchison's book on the lognormal distribution, printed and used by Cambridge University, highlighted the distributions

  14. Stress Rupture Life Reliability Measures for Composite Overwrapped Pressure Vessels

    NASA Technical Reports Server (NTRS)

    Murthy, Pappu L. N.; Thesken, John C.; Phoenix, S. Leigh; Grimes-Ledesma, Lorie

    2007-01-01

    Composite Overwrapped Pressure Vessels (COPVs) are often used for storing pressurant gases onboard spacecraft. Kevlar (DuPont), glass, carbon and other more recent fibers have all been used as overwraps. Due to the fact that overwraps are subjected to sustained loads for an extended period during a mission, stress rupture failure is a major concern. It is therefore important to ascertain the reliability of these vessels by analysis, since the testing of each flight design cannot be completed on a practical time scale. The present paper examines specifically a Weibull statistics based stress rupture model and considers the various uncertainties associated with the model parameters. The paper also examines several reliability estimate measures that would be of use for the purpose of recertification and for qualifying flight worthiness of these vessels. Specifically, deterministic values for a point estimate, mean estimate and 90/95 percent confidence estimates of the reliability are all examined for a typical flight quality vessel under constant stress. The mean and the 90/95 percent confidence estimates are computed using Monte-Carlo simulation techniques by assuming distribution statistics of model parameters based also on simulation and on the available data, especially the sample sizes represented in the data. The data for the stress rupture model are obtained from the Lawrence Livermore National Laboratories (LLNL) stress rupture testing program, carried out for the past 35 years. Deterministic as well as probabilistic sensitivities are examined.
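
    A sketch of how a Weibull-based stress rupture reliability and its Monte Carlo uncertainty measures might be computed. The parameter distributions below are invented stand-ins for the scatter one would estimate from test data, not the LLNL program's values:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def weibull_reliability(t, eta, beta):
        """Weibull stress-rupture reliability at time t (constant stress)."""
        return np.exp(-(t / eta) ** beta)

    # Illustrative parameter uncertainty for shape and characteristic life.
    beta = rng.normal(1.5, 0.2, 100_000)               # shape parameter samples
    eta = rng.lognormal(np.log(4.0e5), 0.25, 100_000)  # characteristic life, hours

    mission = 10.0 * 8766.0                            # 10-year mission, hours
    R = weibull_reliability(mission, eta, beta)        # reliability per parameter draw

    point = weibull_reliability(mission, 4.0e5, 1.5)   # point estimate at nominal values
    print(f"point = {point:.4f}, mean = {R.mean():.4f}, "
          f"95% lower bound = {np.percentile(R, 5):.4f}")
    ```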

  15. Methods for assessing reliability and validity for a measurement tool: a case study and critique using the WHO haemoglobin colour scale.

    PubMed

    White, Sarah A; van den Broek, Nynke R

    2004-05-30

    Before introducing a new measurement tool it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was conducted to evaluate the usefulness of a specific measurement tool (the WHO Colour Scale) is then used to illustrate the application of these methods. The WHO Colour Scale was developed under the auspices of the WHO to provide a simple, portable and reliable method of detecting anaemia. This Colour Scale is a discrete interval scale, whereas the actual haemoglobin values it is used to estimate are on a continuous interval scale and can be measured accurately using electrical laboratory equipment. The methods we consider are: linear regression; correlation coefficients; paired t-tests; plotting differences against mean values and deriving limits of agreement; kappa and weighted kappa statistics; sensitivity and specificity; an intraclass correlation coefficient; and the repeatability coefficient. We note that although the definition and properties of each of these methods are well established, inappropriate methods continue to be used in the medical literature for assessing reliability and validity, as evidenced in the context of the evaluation of the WHO Colour Scale. Copyright 2004 John Wiley & Sons, Ltd.
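
    As one concrete example from the list of methods above, a minimal limits-of-agreement (Bland-Altman) sketch; the haemoglobin readings are invented:

    ```python
    import numpy as np

    def limits_of_agreement(new_method, reference):
        """Bland-Altman 95% limits of agreement: mean difference +/- 1.96 SD."""
        d = np.asarray(new_method, float) - np.asarray(reference, float)
        return (d.mean(),
                d.mean() - 1.96 * d.std(ddof=1),
                d.mean() + 1.96 * d.std(ddof=1))

    # Illustrative haemoglobin readings, g/dL: colour-scale estimate vs laboratory value.
    scale = [8, 10, 12, 10, 14, 8, 12, 10]
    lab   = [8.4, 9.1, 11.6, 10.9, 13.2, 7.5, 12.8, 10.2]
    bias, lower, upper = limits_of_agreement(scale, lab)
    print(f"bias = {bias:.2f}, 95% limits of agreement = ({lower:.2f}, {upper:.2f})")
    ```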

  16. Quantitative evaluation of pairs and RS steganalysis

    NASA Astrophysics Data System (ADS)

    Ker, Andrew D.

    2004-06-01

    We give initial results from a new project which performs statistically accurate evaluation of the reliability of image steganalysis algorithms. The focus here is on the Pairs and RS methods, for detection of simple LSB steganography in grayscale bitmaps, due to Fridrich et al. Using libraries totalling around 30,000 images we have measured the performance of these methods and suggest changes which lead to significant improvements. Particular results from the project presented here include notes on the distribution of the RS statistic, the relative merits of different "masks" used in the RS algorithm, the effect on reliability when previously compressed cover images are used, and the effect of repeating steganalysis on the transposed image. We also discuss improvements to the Pairs algorithm, restricting it to spatially close pairs of pixels, which leads to a substantial performance improvement, even to the extent of surpassing the RS statistic which was previously thought superior for grayscale images. We also describe some of the questions for a general methodology of evaluation of steganalysis, and potential pitfalls caused by the differences between uncompressed, compressed, and resampled cover images.

  17. How Valid are Estimates of Occupational Illness?

    ERIC Educational Resources Information Center

    Hilaski, Harvey J.; Wang, Chao Ling

    1982-01-01

    Examines some of the methods of estimating occupational diseases and suggests that a consensus on the adequacy and reliability of estimates by the Bureau of Labor Statistics and others is not likely. (SK)

  18. 38 CFR 1.15 - Standards for program evaluation.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... program operates. (3) Validity. The degree of statistical validity should be assessed within the research... decisions. (4) Reliability. Use of the same research design by others should yield the same findings. (g...

  19. Reliability of void detection in structural ceramics using scanning laser acoustic microscopy

    NASA Technical Reports Server (NTRS)

    Roth, D. J.; Klima, S. J.; Kiser, J. D.; Baaklini, G. Y.

    1985-01-01

    The reliability of scanning laser acoustic microscopy (SLAM) for detecting surface voids in structural ceramic test specimens was statistically evaluated. Specimens of sintered silicon nitride and sintered silicon carbide, seeded with surface voids, were examined by SLAM at an ultrasonic frequency of 100 MHz in the as fired condition and after surface polishing. It was observed that polishing substantially increased void detectability. Voids as small as 100 micrometers in diameter were detected in polished specimens with 0.90 probability at a 0.95 confidence level. In addition, inspection times were reduced up to a factor of 10 after polishing. The applicability of the SLAM technique for detection of naturally occurring flaws of similar dimensions to the seeded voids is discussed. A FORTRAN program listing is given for calculating and plotting flaw detection statistics.
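
    Detection probabilities at a stated confidence level are conventionally derived from binomial statistics. A sketch using the Clopper-Pearson lower bound; the 29-of-29 example is the classic case that demonstrates 0.90 POD at 0.95 confidence, not this paper's actual data:

    ```python
    from scipy.stats import beta

    def pod_lower_bound(detections, trials, confidence=0.95):
        """One-sided lower confidence bound on probability of detection
        (Clopper-Pearson, via the beta distribution)."""
        if detections == trials:
            return (1.0 - confidence) ** (1.0 / trials)
        return beta.ppf(1.0 - confidence, detections, trials - detections + 1)

    # 29 detections in 29 trials demonstrates POD >= 0.90 at 95% confidence.
    print(f"{pod_lower_bound(29, 29):.3f}")
    ```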

  20. Analyzing Responses of Chemical Sensor Arrays

    NASA Technical Reports Server (NTRS)

    Zhou, Hanying

    2007-01-01

    NASA is developing a third-generation electronic nose (ENose) capable of continuous monitoring of the International Space Station's cabin atmosphere for specific, harmful airborne contaminants. Previous generations of the ENose have been described in prior NASA Tech Briefs issues. Sensor selection is critical in both (prefabrication) sensor material selection and (post-fabrication) data analysis of the ENose, which detects several analytes that are difficult to detect, or that are at very low concentration ranges. Existing sensor selection approaches usually include limited statistical measures, where selectivity is more important but reliability and sensitivity are not of concern. When reliability and sensitivity can be major limiting factors in detecting target compounds reliably, the existing approach is not able to provide meaningful selection that will actually improve data analysis results. The approach and software reported here consider more statistical measures (factors) than existing approaches for a similar purpose. The result is a more balanced and robust sensor selection from a less than ideal sensor array. The software offers quick, flexible, optimal sensor selection and weighting for a variety of purposes without a time-consuming, iterative search by performing sensor calibrations to a known linear or nonlinear model, evaluating the individual sensor's statistics, scoring the individual sensor's overall performance, finding the best sensor array size to maximize class separation, finding optimal weights for the remaining sensor array, estimating limits of detection for the target compounds, evaluating fingerprint distance between group pairs, and finding the best event-detecting sensors.

  1. Mathematical Aspects of Reliability-Centered Maintenance

    DTIC Science & Technology

    1977-01-01

    exponential distribution, whose parameter (hazard rate) can be realistically estimated. This distribution is also frequently...statistical methods to the study of physical reality was beset with philosophical problems arising from the irrefutable observation that there is but one...STATISTICS, 2nd ed. New York: John Wiley & Sons; 1954. 5. Kolmogorov, A. Interpolation und Extrapolation von stationären zufälligen Folgen. BULL. DE

  2. Modified Bayesian Kriging for Noisy Response Problems for Reliability Analysis

    DTIC Science & Technology

    2015-01-01

    52242, USA nicholas-gaul@uiowa.edu Mary Kathryn Cowles, Department of Statistics & Actuarial Science, College of Liberal Arts and Sciences, The...Forrester, A. I. J., & Keane, A. J. (2009). Recent advances in surrogate-based optimization. Progress in Aerospace Sciences, 45(1-3), 50-79. doi...Wiley. [27] Sacks, J., Welch, W. J., Toby J. Mitchell, & Wynn, H. P. (1989). Design and analysis of computer experiments. Statistical Science, 4

  3. Assessment Alternatives for a High Skill MOS

    DTIC Science & Technology

    1975-12-01

    tests. CR measurement advocates frequently claim that variance-dependent statistics are inapplicable in CR testing because CR test scores have...rather than statistically. The Spearman-Brown reliability coefficient was .70. In 1964, Shriver, Fink and Trexler (76) modified the M-33...

  4. Development and validation of the Vellore Occupational Therapy Evaluation Scale to assess functioning in people with mental illness.

    PubMed

    Samuel, Reema; Russell, Paul Ss; Paraseth, Tapan Kumar; Ernest, Sharmila; Jacob, K S

    2016-08-26

    Available occupational therapy assessment scales focus on specific areas of functioning. There is a need for comprehensive evaluation of diverse aspects of functioning in people with mental illness. To develop a comprehensive assessment scale to evaluate diverse aspects of functioning among people with mental illness and to assess its validity and reliability. Available instruments, which evaluate diverse aspects of functioning in people with mental illness, were retrieved. Relevant items, which evaluate specific functions, were selected by a committee of mental health experts and combined to form a comprehensive instrument. Face and content validity and feasibility were assessed and the new instrument was piloted among 60 patients with mental illness. The final version of the instrument was employed in 151 consecutive clients, between 18 and 60 years of age, who were also assessed using the Global Assessment of Functioning (GAF), Occupational Therapy Task Observation Scale (OTTOS), Social Functioning Questionnaire (SFQ), Rosenberg Self Esteem Scale (RSES) and Pai and Kapur Family Burden Interview Schedule (FBIS) by two therapists. The inter-rater reliability and test-retest reliability of the new instrument (Vellore Occupational Therapy Evaluation Scale (VOTES)) were also evaluated. The new scale had good internal consistency (Cronbach's alpha = .817), inter-rater reliability .928 (.877-.958) and test-retest reliability .928 (.868-.961). The correlation between the general behaviour domain (Pearson's Correlation Coefficient [PCC] = -.763, p = .000), task behaviour (PCC = -.829, p = .000), social skills (PCC = -.351, p = .000), intrapersonal skills (PCC = -.208, p = .010), instrumental activities of daily living (IADL) (PCC = -.329, p = .038) and leisure activities (PCC = -.433, p = .005) scores of the VOTES with the corresponding domains in the scales used for comparison was statistically significant. The correlation between the total score of the VOTES and the total scores of the OTTOS, SFQ and RSES was also statistically significant, suggesting convergent validity. The correlation between the total score of the VOTES and the total score of the FBIS was not statistically significant, implying good divergent validity. The VOTES seems to be a promising tool to assess overall functioning of people with mental illness. © The Author(s) 2016.

  5. Inter-rater reliability of data elements from a prototype of the Paul Coverdell National Acute Stroke Registry

    PubMed Central

    Reeves, Mathew J; Mullard, Andrew J; Wehner, Susan

    2008-01-01

    Background The Paul Coverdell National Acute Stroke Registry (PCNASR) is a U.S. based national registry designed to monitor and improve the quality of acute stroke care delivered by hospitals. The registry monitors care through specific performance measures, the accuracy of which depends in part on the reliability of the individual data elements used to construct them. This study describes the inter-rater reliability of data elements collected in Michigan's state-based prototype of the PCNASR. Methods Over a 6-month period, 15 hospitals participating in the Michigan PCNASR prototype submitted data on 2566 acute stroke admissions. Trained hospital staff prospectively identified acute stroke admissions, abstracted chart information, and submitted data to the registry. At each hospital 8 randomly selected cases were re-abstracted by an experienced research nurse. Inter-rater reliability was estimated by the kappa statistic for nominal variables, and intraclass correlation coefficient (ICC) for ordinal and continuous variables. Factors that can negatively impact the kappa statistic (i.e., trait prevalence and rater bias) were also evaluated. Results A total of 104 charts were available for re-abstraction. Excellent reliability (kappa or ICC > 0.75) was observed for many registry variables including age, gender, black race, hemorrhagic stroke, discharge medications, and modified Rankin Score. Agreement was at least moderate (i.e., 0.75 > kappa ≥ 0.40) for ischemic stroke, TIA, white race, non-ambulance arrival, hospital transfer and direct admit. However, several variables had poor reliability (kappa < 0.40) including stroke onset time, stroke team consultation, time of initial brain imaging, and discharge destination. There were marked systematic differences between hospital abstractors and the audit abstractor (i.e., rater bias) for many of the data elements recorded in the emergency department. Conclusion The excellent reliability of many of the data elements supports the use of the PCNASR to monitor and improve care. However, the poor reliability for several variables, particularly time-related events in the emergency department, indicates the need for concerted efforts to improve the quality of data collection. Specific recommendations include improvements to data definitions, abstractor training, and the development of ED-based real-time data collection systems. PMID:18547421
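
    To illustrate the abstract's point that trait prevalence can depress kappa even when raw agreement is high, a sketch computing Cohen's kappa and the prevalence- and bias-adjusted kappa (PABAK) for an invented 2x2 table:

    ```python
    def kappa_2x2(a, b, c, d):
        """Cohen's kappa for a 2x2 agreement table:
        a = both yes, b = rater1 yes/rater2 no, c = rater1 no/rater2 yes, d = both no."""
        n = a + b + c + d
        p_obs = (a + d) / n
        p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
        return (p_obs - p_exp) / (1 - p_exp)

    def pabak_2x2(a, b, c, d):
        """Prevalence- and bias-adjusted kappa: PABAK = 2 * p_obs - 1."""
        return 2 * (a + d) / (a + b + c + d) - 1

    # With a rare trait, raw kappa is low even though raters agree on 92% of cases.
    a, b, c, d = 2, 4, 4, 90
    print(f"kappa = {kappa_2x2(a, b, c, d):.2f}, PABAK = {pabak_2x2(a, b, c, d):.2f}")
    ```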

  6. Compound estimation procedures in reliability

    NASA Technical Reports Server (NTRS)

    Barnes, Ron

    1990-01-01

    At NASA, components and subsystems of components in the Space Shuttle and Space Station generally go through a number of redesign stages. While data on failures for various design stages are sometimes available, the classical procedures for evaluating reliability only utilize the failure data on the present design stage of the component or subsystem. Often, few or no failures have been recorded on the present design stage. Previously, Bayesian estimators for the reliability of a single component, conditioned on the failure data for the present design, were developed. These new estimators permit NASA to evaluate the reliability even when few or no failures have been recorded. Point estimates for the latter evaluation were not possible with the classical procedures. Since different design stages of a component (or subsystem) generally have a good deal in common, the development of new statistical procedures for evaluating the reliability, which consider the entire failure record for all design stages, has great intuitive appeal. A typical subsystem consists of a number of different components and each component has evolved through a number of redesign stages. The present investigations considered compound estimation procedures and related models. Such models permit the statistical consideration of all design stages of each component and thus incorporate all the available failure data to obtain estimates for the reliability of the present version of the component (or subsystem). A number of models were considered to estimate the reliability of a component conditioned on its total failure history from two design stages. It was determined that reliability estimators for the present design stage, conditioned on the complete failure history for two design stages, have lower risk than the corresponding estimators conditioned only on the most recent design failure data. Several models were explored and preliminary models involving the bivariate Poisson distribution and the Consael Process (a bivariate Poisson process) were developed. Possible shortcomings of the models are noted. An example is given to illustrate the procedures. These investigations are ongoing with the aim of developing estimators that extend to components (and subsystems) with three or more design stages.
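
    The compound, multi-stage estimators are beyond a short sketch, but the underlying Bayesian idea, obtaining a usable reliability estimate even with zero recorded failures, can be illustrated with a single-stage Beta-Binomial model. The Jeffreys prior and trial counts are assumptions for illustration:

    ```python
    from scipy import stats

    # Posterior for component reliability with a Beta prior and binomial test data.
    # With zero failures in n trials the classical point estimate is degenerate (1.0),
    # but the posterior mean and lower bound remain informative.
    prior_a, prior_b = 0.5, 0.5          # Jeffreys prior (assumption)
    n_trials, n_failures = 20, 0

    post = stats.beta(prior_a + n_trials - n_failures, prior_b + n_failures)
    print(f"posterior mean reliability = {post.mean():.3f}, "
          f"95% lower bound = {post.ppf(0.05):.3f}")
    ```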

  7. Development and validation of an objective instrument to measure surgical performance at tonsillectomy.

    PubMed

    Roberson, David W; Kentala, Erna; Forbes, Peter

    2005-12-01

    The goals of this project were 1) to develop and validate an objective instrument to measure surgical performance at tonsillectomy, 2) to assess its interobserver and interobservation reliability and construct validity, and 3) to select those items with best reliability and most independent information to design a simplified form suitable for routine use in otolaryngology surgical evaluation. Prospective, observational data collection for an educational quality improvement project. The evaluation instrument was based on previous instruments developed in general surgery with input from attending otolaryngologic surgeons and experts in medical education. It was pilot tested and subjected to iterative improvements. After the instrument was finalized, a total of 55 tonsillectomies were observed and scored during academic year 2002 to 2003: 45 cases by residents at different points during their rotation, 5 by fellows, and 5 by faculty. Results were assessed for interobserver reliability, interobservation reliability, and construct validity. Factor analysis was used to identify items with independent information. Interobserver and interobservation reliability was high. On technical items, faculty substantially outperformed fellows, who in turn outperformed residents (P < .0001 for both comparisons). On the "global" scale (overall assessment), residents improved an average of 1 full point (on a 5 point scale) during a 3 month rotation (P = .01). In the subscale of "patient care," results were less clear cut: fellows outperformed residents, who in turn outperformed faculty, but only the fellows to faculty comparison was statistically significant (P = .04), and residents did not clearly improve over time (P = .36). Factor analysis demonstrated that technical items and patient care items factor separately and thus represent separate skill domains in surgery. It is possible to objectively measure surgical skill at tonsillectomy with high reliability and good construct validity. Factor analysis demonstrated that patient care is a distinct domain in surgical skill. Although the interobserver reliability for some patient care items reached statistical significance, it was not high enough for "high stakes testing" purposes. Using reliability and factor analysis results, we propose a simplified instrument for use in evaluating trainees in otolaryngologic surgery.

  8. Reliability of Leg and Vertical Stiffness During High Speed Treadmill Running.

    PubMed

    Pappas, Panagiotis; Dallas, Giorgos; Paradisis, Giorgos

    2017-04-01

    In research, the accurate and reliable measurement of leg and vertical stiffness could contribute to valid interpretations. The current study aimed at determining the intraparticipant variability (i.e., intraday and interday reliabilities) of leg and vertical stiffness, as well as related parameters, during high speed treadmill running, using the "sine-wave" method. Thirty-one males ran on a treadmill at 6.67 m·s⁻¹, and the contact and flight times were measured. To determine the intraday reliability, three 10-s running bouts with 10-min recovery were performed. In addition, to examine the interday reliability, three 10-s running bouts on 3 separate days with 48-h interbout intervals were performed. The reliability statistics included repeated-measures analysis of variance, average intertrial correlations, intraclass correlation coefficients (ICCs), Cronbach's α reliability coefficient, and the coefficient of variation (CV%). Both intraday and interday reliabilities were high for leg and vertical stiffness (ICC > 0.939 and CV < 4.3%), as well as related variables (ICC > 0.934 and CV < 3.9%). It was thus inferred that the measurements of leg and vertical stiffness, as well as the related parameters obtained using the "sine-wave" method during treadmill running at 6.67 m·s⁻¹, were highly reliable, both within and across days.
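
    The abstract does not spell out the "sine-wave" equations; the sketch below follows the formulation I believe is intended (Morin et al., 2005), which models the vertical ground reaction force as a half sine over the contact phase. All input values are illustrative, not the study's data:

    ```python
    import math

    def sine_wave_stiffness(mass, velocity, leg_length, t_contact, t_flight):
        """Leg and vertical stiffness from contact/flight times via the sine-wave
        model (after Morin et al., 2005). Returns (k_leg, k_vert) in kN/m."""
        g = 9.81
        # Peak force from impulse balance over one step with a sinusoidal profile.
        f_max = mass * g * (math.pi / 2.0) * (t_flight / t_contact + 1.0)
        # Vertical CoM displacement during contact under the sinusoidal force.
        dz = f_max * t_contact**2 / (mass * math.pi**2) - g * t_contact**2 / 8.0
        # Leg compression: geometric shortening from half-contact travel plus dz.
        dl = (leg_length
              - math.sqrt(leg_length**2 - (velocity * t_contact / 2.0) ** 2)
              + dz)
        return f_max / dl / 1000.0, f_max / dz / 1000.0

    # Illustrative values for running at 6.67 m/s.
    k_leg, k_vert = sine_wave_stiffness(mass=75.0, velocity=6.67, leg_length=0.95,
                                        t_contact=0.16, t_flight=0.12)
    print(f"k_leg = {k_leg:.1f} kN/m, k_vert = {k_vert:.1f} kN/m")
    ```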

  9. Assessing Attitudes towards Statistics among Medical Students: Psychometric Properties of the Serbian Version of the Survey of Attitudes Towards Statistics (SATS)

    PubMed Central

    Stanisavljevic, Dejana; Trajkovic, Goran; Marinkovic, Jelena; Bukumiric, Zoran; Cirkovic, Andja; Milic, Natasa

    2014-01-01

    Background Medical statistics has become important and relevant for future doctors, enabling them to practice evidence based medicine. Recent studies report that students’ attitudes towards statistics play an important role in their statistics achievements. The aim of the study was to test the psychometric properties of the Serbian version of the Survey of Attitudes Towards Statistics (SATS) in order to acquire a valid instrument to measure attitudes inside the Serbian educational context. Methods The validation study was performed on a cohort of 417 medical students who were enrolled in an obligatory introductory statistics course. The SATS adaptation was based on an internationally accepted methodology for translation and cultural adaptation. Psychometric properties of the Serbian version of the SATS were analyzed through the examination of factorial structure and internal consistency. Results Most medical students held positive attitudes towards statistics. The average total SATS score was above neutral (4.3±0.8), and varied from 1.9 to 6.2. Confirmatory factor analysis validated the six-factor structure of the questionnaire (Affect, Cognitive Competence, Value, Difficulty, Interest and Effort). Values for fit indices TLI (0.940) and CFI (0.961) were above the cut-off of ≥0.90. The RMSEA value of 0.064 (0.051–0.078) was below the suggested value of ≤0.08. Cronbach’s alpha of the entire scale was 0.90, indicating scale reliability. In a multivariate regression model, self-rating of ability in mathematics and current grade point average were significantly associated with the total SATS score after adjusting for age and gender. Conclusion The present study provided evidence for the appropriate metric properties of the Serbian version of the SATS. Confirmatory factor analysis validated the six-factor structure of the scale. The SATS might be a reliable and valid instrument for identifying medical students’ attitudes towards statistics in the Serbian educational context. PMID:25405489

  10. Assessing attitudes towards statistics among medical students: psychometric properties of the Serbian version of the Survey of Attitudes Towards Statistics (SATS).

    PubMed

    Stanisavljevic, Dejana; Trajkovic, Goran; Marinkovic, Jelena; Bukumiric, Zoran; Cirkovic, Andja; Milic, Natasa

    2014-01-01

    Medical statistics has become important and relevant for future doctors, enabling them to practice evidence based medicine. Recent studies report that students' attitudes towards statistics play an important role in their statistics achievements. The aim of the study was to test the psychometric properties of the Serbian version of the Survey of Attitudes Towards Statistics (SATS) in order to acquire a valid instrument to measure attitudes inside the Serbian educational context. The validation study was performed on a cohort of 417 medical students who were enrolled in an obligatory introductory statistics course. The SATS adaptation was based on an internationally accepted methodology for translation and cultural adaptation. Psychometric properties of the Serbian version of the SATS were analyzed through the examination of factorial structure and internal consistency. Most medical students held positive attitudes towards statistics. The average total SATS score was above neutral (4.3±0.8), and varied from 1.9 to 6.2. Confirmatory factor analysis validated the six-factor structure of the questionnaire (Affect, Cognitive Competence, Value, Difficulty, Interest and Effort). Values for fit indices TLI (0.940) and CFI (0.961) were above the cut-off of ≥0.90. The RMSEA value of 0.064 (0.051-0.078) was below the suggested value of ≤0.08. Cronbach's alpha of the entire scale was 0.90, indicating scale reliability. In a multivariate regression model, self-rating of ability in mathematics and current grade point average were significantly associated with the total SATS score after adjusting for age and gender. The present study provided evidence for the appropriate metric properties of the Serbian version of the SATS. Confirmatory factor analysis validated the six-factor structure of the scale. The SATS might be a reliable and valid instrument for identifying medical students' attitudes towards statistics in the Serbian educational context.

  11. The Americleft Speech Project: A Training and Reliability Study.

    PubMed

    Chapman, Kathy L; Baylis, Adriane; Trost-Cardamone, Judith; Cordero, Kelly Nett; Dixon, Angela; Dobbelsteyn, Cindy; Thurmes, Anna; Wilson, Kristina; Harding-Bell, Anne; Sweeney, Triona; Stoddard, Gregory; Sell, Debbie

    2016-01-01

    To describe the results of two reliability studies and to assess the effect of training on interrater reliability scores. The first study (1) examined interrater and intrarater reliability scores (weighted and unweighted kappas) and (2) compared interrater reliability scores before and after training on the use of the Cleft Audit Protocol for Speech-Augmented (CAPS-A) with British English-speaking children. The second study examined interrater and intrarater reliability on a modified version of the CAPS-A (CAPS-A Americleft Modification) with American and Canadian English-speaking children. Finally, comparisons were made between the interrater and intrarater reliability scores obtained for Study 1 and Study 2. The participants were speech-language pathologists from the Americleft Speech Project. In Study 1, interrater reliability scores improved for 6 of the 13 parameters following training on the CAPS-A protocol. Comparison of the reliability results for the two studies indicated lower scores for Study 2 compared with Study 1. However, this appeared to be an artifact of the kappa statistic that occurred due to insufficient variability in the reliability samples for Study 2. When percent agreement scores were also calculated, the ratings appeared similar across Study 1 and Study 2. The findings of this study suggested that improvements in interrater reliability could be obtained following a program of systematic training. However, improvements were not uniform across all parameters. Acceptable levels of reliability were achieved for those parameters most important for evaluation of velopharyngeal function.
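
    A minimal sketch, not the Americleft study code, of the statistics involved: with scikit-learn one can compute unweighted and weighted kappa alongside raw percent agreement, and a homogeneous sample shows how kappa collapses when rating variability is low, the artifact the authors describe. All data below are simulated.

    ```python
    # Simulated rater data; illustrates kappa vs. percent agreement.
    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    rng = np.random.default_rng(0)

    # Hypothetical 5-point ratings from two raters with high variability
    a = rng.integers(0, 5, 200)
    b = np.where(rng.random(200) < 0.8, a, rng.integers(0, 5, 200))

    # Low-variability sample: nearly all ratings fall in one category
    c = np.zeros(200, dtype=int)
    d = np.where(rng.random(200) < 0.95, c, 1)

    for x, y, label in [(a, b, "variable sample"), (c, d, "homogeneous sample")]:
        agreement = np.mean(x == y)
        kappa = cohen_kappa_score(x, y)                      # unweighted
        wkappa = cohen_kappa_score(x, y, weights="linear")   # weighted
        print(f"{label}: agreement={agreement:.2f}, kappa={kappa:.2f}, "
              f"weighted kappa={wkappa:.2f}")
    ```

    In the homogeneous sample, percent agreement stays near 0.95 while kappa drops to about zero, mirroring the Study 2 artifact the abstract reports.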

  12. The Americleft Speech Project: A Training and Reliability Study

    PubMed Central

    Chapman, Kathy L.; Baylis, Adriane; Trost-Cardamone, Judith; Cordero, Kelly Nett; Dixon, Angela; Dobbelsteyn, Cindy; Thurmes, Anna; Wilson, Kristina; Harding-Bell, Anne; Sweeney, Triona; Stoddard, Gregory; Sell, Debbie

    2017-01-01

    Objective To describe the results of two reliability studies and to assess the effect of training on interrater reliability scores. Design The first study (1) examined interrater and intrarater reliability scores (weighted and unweighted kappas) and (2) compared interrater reliability scores before and after training on the use of the Cleft Audit Protocol for Speech–Augmented (CAPS-A) with British English-speaking children. The second study examined interrater and intrarater reliability on a modified version of the CAPS-A (CAPS-A Americleft Modification) with American and Canadian English-speaking children. Finally, comparisons were made between the interrater and intrarater reliability scores obtained for Study 1 and Study 2. Participants The participants were speech-language pathologists from the Americleft Speech Project. Results In Study 1, interrater reliability scores improved for 6 of the 13 parameters following training on the CAPS-A protocol. Comparison of the reliability results for the two studies indicated lower scores for Study 2 compared with Study 1. However, this appeared to be an artifact of the kappa statistic that occurred due to insufficient variability in the reliability samples for Study 2. When percent agreement scores were also calculated, the ratings appeared similar across Study 1 and Study 2. Conclusion The findings of this study suggested that improvements in interrater reliability could be obtained following a program of systematic training. However, improvements were not uniform across all parameters. Acceptable levels of reliability were achieved for those parameters most important for evaluation of velopharyngeal function. PMID:25531738

  13. Students' attitudes towards learning statistics

    NASA Astrophysics Data System (ADS)

    Ghulami, Hassan Rahnaward; Hamid, Mohd Rashid Ab; Zakaria, Roslinazairimah

    2015-05-01

    A positive attitude towards learning is vital in order to master the core content of the subject matter under study. This is no exception when learning statistics, especially at the university level. Therefore, this study investigates students' attitudes towards learning statistics. Six variables or constructs were identified: affect, cognitive competence, value, difficulty, interest, and effort. The instrument used for the study was a questionnaire adopted and adapted from the reliable Survey of Attitudes Towards Statistics (SATS©). The study was conducted among engineering undergraduate students at a university on the East Coast of Malaysia. The respondents consisted of students taking the applied statistics course in different faculties. The results are analysed descriptively and contribute to a descriptive understanding of students' attitudes towards the teaching and learning of statistics.

  14. Hazing DEOCS 4.1 Construct Validity Summary

    DTIC Science & Technology

    2017-08-01

    Hazing DEOCS 4.1 Construct Validity Summary, Defense Equal Opportunity Management Institute, Directorate of... the analysis. Tables 4-6 provide additional information regarding the descriptive statistics and reliability of the Hazing items. Table 7 provides...

  15. Reliability Estimation of Aero-engine Based on Mixed Weibull Distribution Model

    NASA Astrophysics Data System (ADS)

    Yuan, Zhongda; Deng, Junxiang; Wang, Dawei

    2018-02-01

    The aero-engine is a complex mechanical-electronic system, and in the reliability analysis of such systems the Weibull distribution model plays an irreplaceable role. To date, only the two-parameter and three-parameter Weibull distribution models have been widely used. Owing to the diversity of engine failure modes, a single Weibull distribution model carries a large error. By contrast, a mixed Weibull distribution model can take a variety of engine failure modes into account, making it a good statistical analysis model. In addition to the concept of a dynamic weight coefficient, a three-parameter correlation-coefficient optimization method is applied to enhance the Weibull distribution model and make the reliability estimate more accurate, greatly improving the precision of the mixed-distribution reliability model. All of this favors wider use of the Weibull distribution model in engineering applications.
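
    As an illustration of the model class (with assumed parameter values, not the paper's), a two-component mixed Weibull reliability function weights the survival functions of two failure modes:

    ```python
    # Mixed Weibull reliability sketch; parameters are illustrative assumptions.
    import numpy as np

    def weibull_reliability(t, beta, eta):
        """Two-parameter Weibull survival function R(t) = exp(-(t/eta)^beta)."""
        return np.exp(-(np.asarray(t) / eta) ** beta)

    def mixed_weibull_reliability(t, w, beta1, eta1, beta2, eta2):
        """Weighted mixture of two Weibull failure modes, 0 <= w <= 1."""
        return w * weibull_reliability(t, beta1, eta1) + \
               (1 - w) * weibull_reliability(t, beta2, eta2)

    t = np.array([100.0, 500.0, 1000.0, 2000.0])  # operating hours (illustrative)
    # Mode 1: early-life failures (beta < 1); mode 2: wear-out (beta > 1)
    R = mixed_weibull_reliability(t, w=0.3, beta1=0.8, eta1=400.0,
                                  beta2=3.0, eta2=1500.0)
    print(np.round(R, 4))
    ```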

  16. Reliability Centered Maintenance - Methodologies

    NASA Technical Reports Server (NTRS)

    Kammerer, Catherine C.

    2009-01-01

    Journal article about Reliability Centered Maintenance (RCM) methodologies used by United Space Alliance, LLC (USA) in support of the Space Shuttle Program at Kennedy Space Center. The USA Reliability Centered Maintenance program differs from traditional RCM programs because various methodologies are utilized to take advantage of their respective strengths for each application. Based on operational experience, USA has customized the traditional RCM methodology into a streamlined lean logic path and has implemented the use of statistical tools to drive the process. USA RCM has integrated many of the L6S tools into both RCM methodologies. The tools utilized in the Measure, Analyze, and Improve phases of a Lean Six Sigma project lend themselves to application in the RCM process. All USA RCM methodologies meet the requirements defined in SAE JA 1011, Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes. The proposed article explores these methodologies.

  17. [Methods of statistical analysis in differential diagnostics of the degree of brain glioma anaplasia during preoperative stage].

    PubMed

    Glavatskiĭ, A Ia; Guzhovskaia, N V; Lysenko, S N; Kulik, A V

    2005-12-01

    The authors proposed a possible preoperative diagnostics of the degree of supratentorial brain glioma anaplasia using statistical analysis methods. It relies on a complex examination of 934 patients with degree I-IV anaplasia who had been treated at the Institute of Neurosurgery from 1990 to 2004. The use of statistical analysis methods for differential diagnostics of the degree of brain glioma anaplasia may optimize the diagnostic algorithm, increase the reliability of the obtained data, and in some cases avoid unwarranted surgical interventions. Clinically important signs for the use of statistical analysis methods directed at preoperative diagnostics of brain glioma anaplasia have been defined.

  18. Fatigue criterion to system design, life and reliability

    NASA Technical Reports Server (NTRS)

    Zaretsky, E. V.

    1985-01-01

    A generalized methodology for structural life prediction, design, and reliability based upon a fatigue criterion is advanced. The life prediction methodology is based in part on the work of W. Weibull and of G. Lundberg and A. Palmgren. The approach incorporates the computed lives of the elemental stress volumes of a complex machine element to predict system life. The results of coupon fatigue testing can be incorporated into the analysis, allowing life and component or structural renewal rates to be predicted with reasonable statistical certainty.
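
    The Lundberg-Palmgren-style combination of elemental lives into a system life can be sketched as follows; the common Weibull slope and the elemental lives are illustrative assumptions, not values from the paper.

    ```python
    # Strict-series Weibull system-life combination: for elements sharing a
    # common Weibull slope e, the system life L satisfies L**-e = sum(L_i**-e).
    def system_life(element_lives, e):
        """Combine elemental L10 lives (common Weibull slope e) into system L10."""
        return sum(L ** -e for L in element_lives) ** (-1.0 / e)

    # Three stress volumes with assumed individual L10 lives (hours), slope e = 1.5
    print(round(system_life([8000.0, 12000.0, 20000.0], e=1.5), 1))
    ```

    Note that the combined life is shorter than the shortest elemental life, as expected for a series system.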

  19. Comparing Methods for Assessing Reliability Uncertainty Based on Pass/Fail Data Collected Over Time

    DOE PAGES

    Abes, Jeff I.; Hamada, Michael S.; Hills, Charles R.

    2017-12-20

    In this paper, we compare statistical methods for analyzing pass/fail data collected over time; some methods are traditional and one (the RADAR or Rationale for Assessing Degradation Arriving at Random) was recently developed. These methods are used to provide uncertainty bounds on reliability. We make observations about the methods' assumptions and properties. Finally, we illustrate the differences between two traditional methods, logistic regression and Weibull failure time analysis, and the RADAR method using a numerical example.
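
    For intuition, here is a hedged sketch of one of the traditional methods, logistic regression on pass/fail outcomes against unit age, fitted to synthetic data; the paper's data and the RADAR method are not reproduced.

    ```python
    # Logistic regression on synthetic pass/fail-vs-age data (illustrative only).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)

    age = rng.uniform(0, 20, 300)                     # years since manufacture
    p_fail = 1 / (1 + np.exp(-(-4.0 + 0.25 * age)))  # assumed true failure curve
    fail = (rng.random(300) < p_fail).astype(int)     # 1 = fail, 0 = pass

    model = LogisticRegression().fit(age.reshape(-1, 1), fail)

    # Estimated reliability (pass probability) at selected ages
    for a in (5.0, 10.0, 15.0):
        p = model.predict_proba([[a]])[0, 1]
        print(f"age {a:4.1f}: estimated reliability = {1 - p:.3f}")
    ```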

  20. Comparing Methods for Assessing Reliability Uncertainty Based on Pass/Fail Data Collected Over Time

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Abes, Jeff I.; Hamada, Michael S.; Hills, Charles R.

    In this paper, we compare statistical methods for analyzing pass/fail data collected over time; some methods are traditional and one (the RADAR or Rationale for Assessing Degradation Arriving at Random) was recently developed. These methods are used to provide uncertainty bounds on reliability. We make observations about the methods' assumptions and properties. Finally, we illustrate the differences between two traditional methods, logistic regression and Weibull failure time analysis, and the RADAR method using a numerical example.

  1. Generalised Category Attack—Improving Histogram-Based Attack on JPEG LSB Embedding

    NASA Astrophysics Data System (ADS)

    Lee, Kwangsoo; Westfeld, Andreas; Lee, Sangjin

    We present a generalised and improved version of the category attack on LSB steganography in JPEG images with a straddled embedding path. It detects low embedding rates more reliably and is also less disturbed by double-compressed images. The proposed methods are evaluated on several thousand images, and the results are compared to both recent blind and specific attacks for JPEG embedding. The proposed attack permits more reliable detection, although it is based on first-order statistics only. Its simple structure makes it very fast.
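
    The category attack itself is not reproduced here; instead, a minimal sketch of the classic chi-square test on histogram pairs of values illustrates the kind of first-order statistic such attacks build on. The coefficient histograms below are simulated.

    ```python
    # Chi-square pairs-of-values sketch: LSB embedding tends to equalise the
    # counts of histogram bins 2i and 2i+1 (simulated histograms, not real JPEGs).
    import numpy as np
    from scipy.stats import chi2

    def chi_square_lsb_pvalue(histogram):
        """Return probability that LSB embedding equalised adjacent bin pairs."""
        h = np.asarray(histogram, dtype=float)
        even, odd = h[0::2], h[1::2]
        expected = (even + odd) / 2.0
        mask = expected > 0                     # skip empty pairs
        stat = np.sum((even[mask] - expected[mask]) ** 2 / expected[mask])
        df = mask.sum() - 1
        return 1.0 - chi2.cdf(stat, df)         # near 1.0 suggests embedding

    rng = np.random.default_rng(2)
    cover = rng.integers(0, 2000, 128)          # hypothetical coefficient counts
    stego = np.repeat((cover[0::2] + cover[1::2]) / 2, 2)  # fully equalised pairs
    print(chi_square_lsb_pvalue(cover), chi_square_lsb_pvalue(stego))
    ```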

  2. Validity and reliability of a method for assessment of cervical vertebral maturation.

    PubMed

    Zhao, Xiao-Guang; Lin, Jiuxiang; Jiang, Jiu-Hui; Wang, Qingzhu; Ng, Sut Hong

    2012-03-01

    To evaluate the validity and reliability of the cervical vertebral maturation (CVM) method with a longitudinal sample. Eighty-six cephalograms from 18 subjects (5 males and 13 females) were selected from the longitudinal database. Total mandibular length was measured on each film; its rate of increase served as the gold standard in examining the validity of the CVM method. Eleven orthodontists, after receiving intensive training in the CVM method, evaluated all films twice. Kendall's W and the weighted kappa statistic were employed. Kendall's W values were higher than 0.8 at both times, indicating strong interobserver reproducibility, but interobserver agreement was documented twice at less than 50%. A wide range of intraobserver agreement was noted (40.7%-79.1%), and substantial intraobserver reproducibility was indicated by kappa values (0.53-0.86). With regard to validity, moderate agreement was found between the gold standard and observer staging at the initial time (kappa values 0.44-0.61). However, agreement seemed to be unacceptable for clinical use, especially in cervical stage 3 (26.8%). Even though the validity and reliability of the CVM method proved statistically acceptable, we suggest that many other growth indicators be taken into consideration in evaluating adolescent skeletal maturation.
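
    A minimal sketch of Kendall's W, the concordance statistic used in the study, applied to hypothetical stagings (no tie correction):

    ```python
    # Kendall's coefficient of concordance W for m observers ranking n subjects.
    import numpy as np
    from scipy.stats import rankdata

    def kendalls_w(ratings):
        """ratings: (m observers, n subjects) array of scores or stages."""
        ranks = np.apply_along_axis(rankdata, 1, np.asarray(ratings, float))
        m, n = ranks.shape
        rank_sums = ranks.sum(axis=0)
        s = np.sum((rank_sums - rank_sums.mean()) ** 2)
        return 12.0 * s / (m ** 2 * (n ** 3 - n))

    # Hypothetical CVM-style stagings by 3 observers of 6 cephalograms
    ratings = [[1, 2, 3, 4, 5, 6],
               [1, 3, 2, 4, 5, 6],
               [2, 1, 3, 4, 6, 5]]
    print(round(kendalls_w(ratings), 3))   # close to 1 => strong concordance
    ```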

  3. Test-retest reliability of auditory brainstem responses to chirp stimuli in newborns.

    PubMed

    Cobb, Kensi M; Stuart, Andrew

    2014-11-01

    The purpose of this study was to examine the test-retest reliability of auditory brainstem responses (ABRs) to air- and bone-conducted chirp stimuli in newborns as a function of intensity. A repeated measures quasi-experimental design was employed. Thirty healthy newborns participated. ABRs were evoked using 60, 45, and 30 dB nHL air-conducted CE-Chirps and 45, 30, and 15 dB nHL bone-conducted CE-Chirps at a rate of 57.7/s. Measures were repeated by a second tester. Statistically significant correlations (p <.0001) and predictive linear relations (p <.0001) were found between testers for wave V latencies and amplitudes to air- and bone-conducted CE-Chirps. There were also no statistically significant differences between testers with wave V latencies and amplitudes to air- and bone-conducted CE-Chirps (p >.05). As expected, significant differences in wave V latencies and amplitudes were seen as a function of stimulus intensity for air- and bone-conducted CE-Chirps (p <.0001). These results suggest that ABRs to air- and bone-conducted CE-Chirps can be reliably repeated in newborns with different testers. The CE-Chirp may be valuable for both screening and diagnostic audiologic assessments of newborns.

  4. Reliability of the hospital nutrition environment scan for cafeterias, vending machines, and gift shops.

    PubMed

    Winston, Courtney P; Sallis, James F; Swartz, Michael D; Hoelscher, Deanna M; Peskin, Melissa F

    2013-08-01

    According to ecological models, the physical environment plays a major role in determining individual health behaviors. As such, researchers have started targeting the consumer nutrition environment of large-scale foodservice operations when implementing obesity-prevention programs. In 2010, the American Hospital Association released a call-to-action encouraging health care facilities to join in this movement and improve their facilities' consumer nutrition environments. The Hospital Nutrition Environment Scan (HNES) for Cafeterias, Vending Machines, and Gift Shops was developed in 2011, and the present study evaluated the inter-rater reliability of this instrument. Two trained raters visited 39 hospitals in southern California and completed the HNES. Percent agreement, kappa statistics, and intraclass correlation coefficients were calculated. Percent agreement between raters ranged from 74.4% to 100% and kappa statistics ranged from 0.458 to 1.0. The intraclass correlation coefficient for the overall nutrition composite scores was 0.961. Given these results, the HNES demonstrated acceptable reliability metrics and can now be disseminated to assess the current state of hospital consumer nutrition environments.
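
    For illustration, ICC(2,1) for absolute agreement between two raters can be computed directly from two-way ANOVA mean squares; the composite scores below are hypothetical, not the study's.

    ```python
    # Two-way random-effects ICC(2,1), absolute agreement (Shrout & Fleiss).
    import numpy as np

    def icc2_1(x):
        """x: (n subjects, k raters). Returns ICC(2,1)."""
        x = np.asarray(x, float)
        n, k = x.shape
        grand = x.mean()
        msr = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)   # rows
        msc = n * np.sum((x.mean(axis=0) - grand) ** 2) / (k - 1)   # columns
        sse = np.sum((x - x.mean(axis=1, keepdims=True)
                        - x.mean(axis=0, keepdims=True) + grand) ** 2)
        mse = sse / ((n - 1) * (k - 1))
        return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

    # Hypothetical composite scores from two raters over ten sites
    scores = np.array([[12, 13], [15, 15], [9, 10], [20, 19], [17, 18],
                       [11, 11], [14, 16], [8, 8], [19, 20], [13, 12]])
    print(round(icc2_1(scores), 3))
    ```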

  5. A Reliability Model for Ni-BaTiO3-Based (BME) Ceramic Capacitors

    NASA Technical Reports Server (NTRS)

    Liu, Donhang

    2014-01-01

    The evaluation of multilayer ceramic capacitors (MLCCs) with base-metal electrodes (BMEs) for potential NASA space project applications requires an in-depth understanding of their reliability. The reliability of an MLCC is defined as the ability of the dielectric material to retain its insulating properties under stated environmental and operational conditions for a specified period of time t. In this presentation, a general mathematical expression of a reliability model for a BME MLCC is developed and discussed. The reliability model consists of three parts: (1) a statistical distribution that describes the individual variation of properties in a test group of samples (Weibull, lognormal, normal, etc.); (2) an acceleration function that describes how a capacitor's reliability responds to external stresses such as applied voltage and temperature (all units in the test group should follow the same acceleration function if they share the same failure mode, independent of individual units); and (3) the effect and contribution of the structural and constructional characteristics of a multilayer capacitor device, such as the number of dielectric layers N, dielectric thickness d, average grain size r, and capacitor chip size S. In general, a two-parameter Weibull statistical distribution model is used in the description of a BME capacitor's reliability as a function of time. The acceleration function that relates a capacitor's reliability to external stresses depends on the failure mode. Two failure modes have been identified in BME MLCCs: catastrophic and slow degradation. A catastrophic failure is characterized by a time-accelerating increase in leakage current that is mainly due to existing processing defects (voids, cracks, delamination, etc.), i.e., extrinsic defects. A slow degradation failure is characterized by a near-linear increase in leakage current against the stress time; this is caused by the electromigration of oxygen vacancies (intrinsic defects). The two identified failure modes follow different acceleration functions: catastrophic failures follow the traditional power-law relationship with the applied voltage, while slow degradation failures fit well to an exponential-law relationship with the applied electrical field. Finally, the impact of capacitor structure on the reliability of BME capacitors is discussed with respect to the number of dielectric layers in an MLCC unit, the number of BaTiO3 grains per dielectric layer, and the chip size of the capacitor device.
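
    As a hedged illustration of the acceleration functions described, a voltage power law combined with an Arrhenius temperature term in the spirit of the Prokopowicz-Vaskas relation commonly used for MLCCs; stress levels and parameters below are assumed, not the paper's:

    ```python
    # Voltage power law x Arrhenius acceleration factor (illustrative values).
    import math

    K_BOLTZMANN_EV = 8.617e-5  # eV/K

    def acceleration_factor(v_use, v_test, t_use_k, t_test_k, n, ea_ev):
        """Ratio of time-to-failure at use conditions to test conditions."""
        voltage_term = (v_test / v_use) ** n
        thermal_term = math.exp(ea_ev / K_BOLTZMANN_EV
                                * (1.0 / t_use_k - 1.0 / t_test_k))
        return voltage_term * thermal_term

    # Assumed accelerated test at 2x rated voltage and 125 C vs use at 50 C
    af = acceleration_factor(v_use=6.3, v_test=12.6, t_use_k=323.0,
                             t_test_k=398.0, n=5.0, ea_ev=1.0)
    print(f"acceleration factor ~ {af:.0f}x")
    ```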

  6. DATMAN: A reliability data analysis program using Bayesian updating

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Becker, M.; Feltus, M.A.

    1996-12-31

    Preventive maintenance (PM) techniques focus on the prevention of failures, in particular, system components that are important to plant functions. Reliability-centered maintenance (RCM) improves on the PM techniques by introducing a set of guidelines by which to evaluate the system functions. It also minimizes intrusive maintenance, labor, and equipment downtime without sacrificing system performance when its function is essential for plant safety. Both the PM and RCM approaches require that system reliability data be updated as more component failures and operation time are acquired. Systems reliability and the likelihood of component failures can be calculated by Bayesian statistical methods, which can update these data. The DATMAN computer code has been developed at Penn State to simplify the Bayesian analysis by performing the tedious calculations needed for RCM reliability analysis. DATMAN reads data for updating, fits a distribution that best fits the data, and calculates component reliability. DATMAN provides a user-friendly interface menu that allows the user to choose from several common prior and posterior distributions, insert new failure data, and visually select the distribution that matches the data most accurately.
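
    DATMAN's code is not reproduced here, but the Bayesian update at the heart of such tools can be sketched with a conjugate Beta prior on a per-demand failure probability; the prior and the failure counts below are assumptions.

    ```python
    # Conjugate Beta-Binomial update of a component failure probability.
    from scipy.stats import beta

    # Assumed prior belief about failure probability per demand
    a_prior, b_prior = 1.0, 19.0          # prior mean 0.05, fairly diffuse

    # Assumed new field data: 2 failures observed in 60 demands
    failures, demands = 2, 60
    a_post = a_prior + failures
    b_post = b_prior + (demands - failures)

    posterior = beta(a_post, b_post)
    print(f"posterior mean failure prob: {posterior.mean():.4f}")
    print(f"90% credible interval: ({posterior.ppf(0.05):.4f}, "
          f"{posterior.ppf(0.95):.4f})")
    print(f"reliability estimate: {1 - posterior.mean():.4f}")
    ```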

  7. On the Simulation-Based Reliability of Complex Emergency Logistics Networks in Post-Accident Rescues.

    PubMed

    Wang, Wei; Huang, Li; Liang, Xuedong

    2018-01-06

    This paper investigates the reliability of complex emergency logistics networks, as reliability is crucial to reducing environmental and public health losses in post-accident emergency rescues. Such networks' statistical characteristics are analyzed first. After the connected reliability and evaluation indices for complex emergency logistics networks are effectively defined, simulation analyses of network reliability are conducted under two different attack modes using a particular emergency logistics network as an example. The simulation analyses obtain the varying trends in emergency supply times and the ratio of effective nodes and validate the effects of network characteristics and different types of attacks on network reliability. The results demonstrate that this emergency logistics network is both a small-world and a scale-free network. When facing random attacks, the emergency logistics network changes steadily, whereas it is very fragile when facing selective attacks. Therefore, special attention should be paid to the protection of supply nodes and nodes with high connectivity. The simulation method provides a new tool for studying emergency logistics networks and a reference for similar studies.
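
    A minimal sketch of the attack-mode comparison on a generic scale-free stand-in (using networkx); the paper's actual emergency logistics network is not reproduced.

    ```python
    # Random vs. selective (highest-degree-first) node removal on a generic
    # scale-free graph; tracks the surviving giant-component share.
    import random
    import networkx as nx

    def attack(G, order, fraction=0.2):
        """Remove a fraction of nodes in the given order; return giant share."""
        H = G.copy()
        for node in order[:int(fraction * G.number_of_nodes())]:
            H.remove_node(node)
        giant = max(nx.connected_components(H), key=len)
        return len(giant) / G.number_of_nodes()

    G = nx.barabasi_albert_graph(500, 2, seed=42)   # scale-free stand-in
    nodes = list(G.nodes())

    random.seed(42)
    random_order = random.sample(nodes, len(nodes))
    selective_order = sorted(nodes, key=lambda n: G.degree(n), reverse=True)

    print("random attack:    ", round(attack(G, random_order), 3))
    print("selective attack: ", round(attack(G, selective_order), 3))
    ```

    The giant component typically degrades only mildly under random removal but collapses when high-degree hubs are targeted, matching the fragility the abstract reports.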

  8. On the Simulation-Based Reliability of Complex Emergency Logistics Networks in Post-Accident Rescues

    PubMed Central

    Wang, Wei; Huang, Li; Liang, Xuedong

    2018-01-01

    This paper investigates the reliability of complex emergency logistics networks, as reliability is crucial to reducing environmental and public health losses in post-accident emergency rescues. Such networks’ statistical characteristics are analyzed first. After the connected reliability and evaluation indices for complex emergency logistics networks are effectively defined, simulation analyses of network reliability are conducted under two different attack modes using a particular emergency logistics network as an example. The simulation analyses obtain the varying trends in emergency supply times and the ratio of effective nodes and validate the effects of network characteristics and different types of attacks on network reliability. The results demonstrate that this emergency logistics network is both a small-world and a scale-free network. When facing random attacks, the emergency logistics network changes steadily, whereas it is very fragile when facing selective attacks. Therefore, special attention should be paid to the protection of supply nodes and nodes with high connectivity. The simulation method provides a new tool for studying emergency logistics networks and a reference for similar studies. PMID:29316614

  9. Reliability of reports of childhood trauma in bipolar disorder: A test-retest study over 18 months.

    PubMed

    Shannon, Ciaran; Hanna, Donncha; Tumelty, Leo; Waldron, Daniel; Maguire, Chrissie; Mowlds, William; Meenagh, Ciaran; Mulholland, Ciaran

    2016-01-01

    This study aimed to explore the reliability of self-reported trauma histories in a population with a diagnosis of bipolar disorder using the Childhood Trauma Questionnaire. Previous studies in other populations suggest high reliability of trauma histories over time, and it was postulated that a similar high reliability would be demonstrated in this population. A total of 39 patients with a confirmed diagnosis (Diagnostic and Statistical Manual of Mental Disorders, 4th Edition, criteria) were followed up and readministered the Childhood Trauma Questionnaire after 18 months. Cohen's kappa scores and intraclass correlations suggested reasonable test-retest reliability over the 18-month time period of the study for all types of childhood abuse, namely, emotional, physical, and sexual abuse and physical and emotional neglect. Intraclass correlations ranged from r = .50 (sexual abuse) to r = .96 (physical abuse). Cohen's kappas ranged from .44 (sexual abuse) to .76 (physical abuse). Retrospective reports of childhood trauma can be seen as reliable and are in keeping with results found with other mental health populations.

  10. Can Counter-Gang Models be Applied to Counter ISIS’s Internet Recruitment Campaign

    DTIC Science & Technology

    2016-06-10

    limitation that exists is the lack of reliable statistics from social media companies in regards to the quantity of ISIS-affiliated sites, which exist on... statistics, they have approximately 320 million monthly active users with thirty-five-plus languages supported and 77 percent of accounts located... Justice and Delinquency Prevention program. For deterrence-based models, the primary point of research is focused deterrence models with emphasis placed...

  11. Statistical distribution of time to crack initiation and initial crack size using service data

    NASA Technical Reports Server (NTRS)

    Heller, R. A.; Yang, J. N.

    1977-01-01

    Crack growth inspection data gathered during the service life of the C-130 Hercules airplane were used in conjunction with a crack propagation rule to estimate the distribution of crack initiation times and of initial crack sizes. A Bayesian statistical approach was used to calculate the fraction of undetected initiation times as a function of the inspection time and the reliability of the inspection procedure used.

  12. [Research & development on computer expert system for forensic bones estimation].

    PubMed

    Zhao, Jun-ji; Zhang, Jan-zheng; Liu, Nin-guo

    2005-08-01

    To build an expert system for forensic bone estimation. The system was built using an object-oriented method, employing statistical data from forensic anthropology, combining frame-based knowledge representation of the statistical data with production rules, and using fuzzy matching and Dempster-Shafer (DS) evidence theory. Software for the forensic estimation of sex, age, and height with an open knowledge base was designed. The system is reliable and effective, and it should be a good assistant for forensic technicians.
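
    As an illustration of the DS evidence theory step (with toy mass assignments, not the system's actual knowledge base), Dempster's rule of combination fuses two hypothetical skeletal indicators:

    ```python
    # Dempster's rule of combination over frozenset focal elements.
    from itertools import product

    def dempster_combine(m1, m2):
        """Combine two mass functions; normalise away conflicting mass."""
        combined, conflict = {}, 0.0
        for (a, w1), (b, w2) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
        return {k: v / (1.0 - conflict) for k, v in combined.items()}

    M, F = frozenset({"male"}), frozenset({"female"})
    either = M | F
    # Toy evidence from two hypothetical skeletal indicators
    m_pelvis = {M: 0.7, F: 0.1, either: 0.2}
    m_skull = {M: 0.6, F: 0.2, either: 0.2}
    print(dempster_combine(m_pelvis, m_skull))
    ```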

  13. Peer Review of EPA's Draft BMDS Document: Exponential ...

    EPA Pesticide Factsheets

    BMDS is one of the Agency's premier tools for risk assessment; the validity and reliability of its statistical models are therefore of paramount importance. This page provides links to peer reviews of the BMDS applications and their models as they were developed and eventually released, documenting the rigorous review process taken to provide the best science tools available for statistical modeling.

  14. Wood Products Analysis

    NASA Technical Reports Server (NTRS)

    1990-01-01

    Structural Reliability Consultants' computer program creates graphic plots showing the statistical parameters of glue laminated timbers, or 'glulam.' The company president, Dr. Joseph Murphy, read in NASA Tech Briefs about work related to analysis of Space Shuttle surface tile strength performed for Johnson Space Center by Rockwell International Corporation. Analysis led to a theory of 'consistent tolerance bounds' for statistical distributions, applicable in industrial testing where statistical analysis can influence product development and use. Dr. Murphy then obtained the Tech Support Package that covers the subject in greater detail. The TSP became the basis for Dr. Murphy's computer program PC-DATA, which he is marketing commercially.

  15. Routine hospital data - is it good enough for trials? An example using England's Hospital Episode Statistics in the SHIFT trial of Family Therapy vs. Treatment as Usual in adolescents following self-harm.

    PubMed

    Wright-Hughes, Alexandra; Graham, Elizabeth; Cottrell, David; Farrin, Amanda

    2018-04-01

    Use of routine data sources within clinical research is increasing and is endorsed by the National Institute for Health Research to increase trial efficiencies; however there is limited evidence for its use in clinical trials, especially in relation to self-harm. One source of routine data, Hospital Episode Statistics, is collated and distributed by NHS Digital and contains details of admissions, outpatient, and Accident and Emergency attendances provided periodically by English National Health Service hospitals. We explored the reliability and accuracy of Hospital Episode Statistics, compared to data collected directly from hospital records, to assess whether it would provide complete, accurate, and reliable means of acquiring hospital attendances for self-harm - the primary outcome for the SHIFT (Self-Harm Intervention: Family Therapy) trial evaluating Family Therapy for adolescents following self-harm. Participant identifiers were linked to Hospital Episode Statistics Accident and Emergency, and Admissions data, and episodes combined to describe participants' complete hospital attendance. Attendance data were initially compared to data previously gathered by trial researchers from pre-identified hospitals. Final comparison was conducted of subsequent attendances collected through Hospital Episode Statistics and researcher follow-up. Consideration was given to linkage rates; number and proportion of attendances retrieved; reliability of Accident and Emergency, and Admissions data; percentage of self-harm episodes recorded and coded appropriately; and percentage of required data items retrieved. Participants were first linked to Hospital Episode Statistics with an acceptable match rate of 95%, identifying a total of 341 complete hospital attendances, compared to 139 reported by the researchers at the time. More than double the proportion of Hospital Episode Statistics Accident and Emergency episodes could not be classified in relation to self-harm (75%) compared to 34.9% of admitted episodes, and of overall attendances, 18% were classified as self-harm related and 20% not related, while ambiguity or insufficient information meant 62% were unclassified. Of 39 self-harm-related attendances reported by the researchers, Hospital Episode Statistics identified 24 (62%) as self-harm related while 15 (38%) were unclassified. Based on final data received, 1490 complete hospital attendances were identified and comparison to researcher follow-up found Hospital Episode Statistics underestimated the number of self-harm attendances by 37.2% (95% confidence interval 32.6%-41.9%). Advantages of routine data collection via NHS Digital included the acquisition of more comprehensive and timely trial outcome data, identifying more than double the number of hospital attendances than researchers. Disadvantages included ambiguity in the classification of self-harm relatedness. Our resulting primary outcome data collection strategy used routine data to identify hospital attendances supplemented by targeted researcher data collection for attendances requiring further self-harm classification.

  16. The development of a quality appraisal tool for studies of diagnostic reliability (QAREL).

    PubMed

    Lucas, Nicholas P; Macaskill, Petra; Irwig, Les; Bogduk, Nikolai

    2010-08-01

    In systematic reviews of the reliability of diagnostic tests, no quality assessment tool has been used consistently. The aim of this study was to develop a specific quality appraisal tool for studies of diagnostic reliability. Key principles for the quality of studies of diagnostic reliability were identified with reference to epidemiologic principles, existing quality appraisal checklists, and the Standards for Reporting of Diagnostic Accuracy (STARD) and Quality Assessment of Diagnostic Accuracy Studies (QUADAS) resources. Specific items that encompassed each of the principles were developed. Experts in diagnostic research provided feedback on the items that were to form the appraisal tool. This process was iterative and continued until consensus among experts was reached. The Quality Appraisal of Reliability Studies (QAREL) checklist includes 11 items that explore seven principles. Items cover the spectrum of subjects, spectrum of examiners, examiner blinding, order effects of examination, suitability of the time interval among repeated measurements, appropriate test application and interpretation, and appropriate statistical analysis. QAREL has been developed as a specific quality appraisal tool for studies of diagnostic reliability. The reliability of this tool in different contexts needs to be evaluated.

  17. Reliability evaluation methodology for NASA applications

    NASA Technical Reports Server (NTRS)

    Taneja, Vidya S.

    1992-01-01

    Liquid rocket engine technology has been characterized by the development of complex systems containing a large number of subsystems, components, and parts. The trend toward even larger and more complex systems is continuing. Liquid rocket engineers have been focusing mainly on performance-driven designs to increase the payload delivery of a launch vehicle for a given mission. In other words, although the failure of a single inexpensive part or component may cause the failure of the system, reliability in general has not been considered one of the system parameters, like cost or performance. Until now, quantification of reliability has not been a consideration during system design and development in the liquid rocket industry. Engineers and managers have long been aware of the fact that the reliability of a system increases during development, but no serious attempts have been made to quantify reliability. As a result, a method to quantify reliability during design and development is needed. This includes the application of probabilistic models that utilize both engineering analysis and test data. Classical methods require the use of operating data for reliability demonstration. In contrast, the method described in this paper is based on similarity, analysis, and testing combined with Bayesian statistical analysis.

  18. Spanish-language services assessment for children and adolescents (SACA): reliability of parent and adolescent reports.

    PubMed

    Bean, Donna L; Rotheram-Borus, Mary Jane; Leibowitz, Arleen; Horwitz, Sarah M; Weidmer, Beverly

    2003-02-01

    To assess test-retest reliability of the service utilization screening section of the Services Assessment for Children and Adolescents (SACA) interview among Spanish-speaking parents and adolescents, correspondence between parent and adolescent reports, and the correlation between reliability and participants' demographic and service use characteristics. The English SACA was translated and administered from September 1999 through January 2000 in Los Angeles County, California, on two separate occasions to eligible parents with a child (4-17 years old) who was a client of a local public mental health authority. Adolescents of these parents (12-17 years old) were also interviewed. Reliability was measured by the kappa statistic. Adult and adolescent reports about lifetime and previous year service setting use exhibited good reliability, but concordance of parents and adolescents did not. Children's service utilization appears to be correlated with reliability of parent reports, and child gender appears to be correlated with reliability of adolescent reports. The SACA appears to be a useful tool for screening Spanish-speaking families about child and adolescent mental health service use. These findings must be considered preliminary until replicated in a larger sample of culturally diverse Spanish-speaking families.

  19. We need more replication research - A case for test-retest reliability.

    PubMed

    Leppink, Jimmie; Pérez-Fuster, Patricia

    2017-06-01

    Following debates in psychology on the importance of replication research, we have also started to see pleas for a more prominent role for replication research in medical education. To enable replication research, it is of paramount importance to carefully study the reliability of the instruments we use. Cronbach's alpha has been the most widely used estimator of reliability in the field of medical education, notably as some kind of quality label for test or questionnaire scores based on multiple items or for the reliability of assessment across exam stations. However, as this narrative review outlines, Cronbach's alpha or alternative reliability statistics may complement but not replace psychometric methods such as factor analysis. Moreover, multiple-item measurements should be preferred over single-item measurements, and when using single-item measurements, coefficients such as Cronbach's alpha should not be interpreted as indicators of the reliability of a single item when that item is administered after fundamentally different activities, such as learning tasks that differ in content. Finally, if we want to follow up on recent pleas for more replication research, we have to start studying the test-retest reliability of the instruments we use.
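
    For reference, Cronbach's alpha follows directly from the item variances, alpha = k/(k-1) * (1 - sum(var_i)/var_total); a minimal sketch on simulated item scores:

    ```python
    # Cronbach's alpha from an (n respondents, k items) score matrix.
    import numpy as np

    def cronbach_alpha(items):
        """Internal-consistency coefficient for multi-item scores."""
        items = np.asarray(items, float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

    rng = np.random.default_rng(3)
    latent = rng.normal(size=200)                            # one underlying trait
    items = latent[:, None] + rng.normal(0, 0.8, (200, 5))   # 5 noisy items
    print(round(cronbach_alpha(items), 3))
    ```

    The simulation also hints at the review's point: a high alpha here reflects five items loading on one simulated trait, something alpha alone cannot verify without factor analysis.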

  20. The reliability of the Glasgow Coma Scale: a systematic review.

    PubMed

    Reith, Florence C M; Van den Brande, Ruben; Synnot, Anneliese; Gruen, Russell; Maas, Andrew I R

    2016-01-01

    The Glasgow Coma Scale (GCS) provides a structured method for assessment of the level of consciousness. Its derived sum score is applied in research and adopted in intensive care unit scoring systems. Controversy exists on the reliability of the GCS. The aim of this systematic review was to summarize evidence on the reliability of the GCS. A literature search was undertaken in MEDLINE, EMBASE and CINAHL. Observational studies that assessed the reliability of the GCS, expressed by a statistical measure, were included. Methodological quality was evaluated with the consensus-based standards for the selection of health measurement instruments checklist and its influence on results considered. Reliability estimates were synthesized narratively. We identified 52 relevant studies that showed significant heterogeneity in the type of reliability estimates used, patients studied, setting and characteristics of observers. Methodological quality was good (n = 7), fair (n = 18) or poor (n = 27). In good quality studies, kappa values were ≥0.6 in 85%, and all intraclass correlation coefficients indicated excellent reliability. Poor quality studies showed lower reliability estimates. Reliability for the GCS components was higher than for the sum score. Factors that may influence reliability include education and training, the level of consciousness and type of stimuli used. Only 13% of studies were of good quality and inconsistency in reported reliability estimates was found. Although the reliability was adequate in good quality studies, further improvement is desirable. From a methodological perspective, the quality of reliability studies needs to be improved. From a clinical perspective, a renewed focus on training/education and standardization of assessment is required.

  1. Reliability and Validity of the Turkish Version of the Job Performance Scale Instrument.

    PubMed

    Harmanci Seren, Arzu Kader; Tuna, Rujnan; Eskin Bacaksiz, Feride

    2018-02-01

    Objective measurement of the job performance of nursing staff using valid and reliable instruments is important in the evaluation of healthcare quality. A current, valid, and reliable instrument that specifically measures the performance of nurses is required for this purpose. The aim of this study was to determine the validity and reliability of the Turkish version of the Job Performance Instrument. This study used a methodological design and a sample of 240 nurses working at different units in four hospitals in Istanbul, Turkey. A descriptive data form, the Job Performance Scale, and the Employee Performance Scale were used to collect data. Data were analyzed using IBM SPSS Statistics Version 21.0 and LISREL Version 8.51. On the basis of the data analysis, the instrument was revised. Some items were deleted, and subscales were combined. The Turkish version of the Job Performance Instrument was determined to be valid and reliable to measure the performance of nurses. The instrument is suitable for evaluating current nursing roles.

  2. Unreliable evoked responses in autism

    PubMed Central

    Dinstein, Ilan; Heeger, David J.; Lorenzi, Lauren; Minshew, Nancy J.; Malach, Rafael; Behrmann, Marlene

    2012-01-01

    Summary Autism has been described as a disorder of general neural processing, but the particular processing characteristics that might be abnormal in autism have mostly remained obscure. Here, we present evidence of one such characteristic: poor evoked response reliability. We compared cortical response amplitude and reliability (consistency across trials) in visual, auditory, and somatosensory cortices of high-functioning individuals with autism and controls. Mean response amplitudes were statistically indistinguishable across groups, yet trial-by-trial response reliability was significantly weaker in autism, yielding smaller signal-to-noise ratios in all sensory systems. Response reliability differences were evident only in evoked cortical responses and not in ongoing resting-state activity. These findings reveal that abnormally unreliable cortical responses, even to elementary non-social sensory stimuli, may represent a fundamental physiological alteration of neural processing in autism. The results motivate a critical expansion of autism research to determine whether (and how) basic neural processing properties such as reliability, plasticity, and adaptation/habituation are altered in autism. PMID:22998867

  3. [The reliability of a questionnaire regarding Colombian children's physical activity].

    PubMed

    Herazo-Beltrán, Aliz Y; Domínguez-Anaya, Regina

    2012-10-01

    To report the test-retest reliability and internal consistency of the Physical Activity Questionnaire for school children (PAQ-C). This was a descriptive study of 100 schoolchildren aged 9 to 11 years attending a school in Cartagena, Colombia. The sample was randomly selected. The PAQ-C was given twice, one week apart, after the informed consent forms had been signed by the children's parents and school officials. Cronbach's alpha coefficient was used for assessing internal consistency and an intra-class correlation coefficient for test-retest reliability; SPSS (version 17.0) was used for statistical analysis. The questionnaire's internal consistency was 0.73 for the first measurement and 0.78 for the second; the intra-class correlation coefficient was 0.60. There were differences between boys and girls in both measurements. The PAQ-C had acceptable internal consistency and test-retest reliability, thereby making it useful for measuring children's self-reported physical activity and a valuable tool for population studies in Colombia.

  4. Constructing a Reward-Related Quality of Life Statistic in Daily Life—a Proof of Concept Study Using Positive Affect

    PubMed Central

    Verhagen, Simone J. W.; Simons, Claudia J. P.; van Zelst, Catherine; Delespaul, Philippe A. E. G.

    2017-01-01

    Background: Mental healthcare needs person-tailored interventions. Experience Sampling Method (ESM) can provide daily life monitoring of personal experiences. This study aims to operationalize and test a measure of momentary reward-related Quality of Life (rQoL). Intuitively, quality of life improves by spending more time on rewarding experiences. ESM clinical interventions can use this information to coach patients to find a realistic, optimal balance of positive experiences (maximize reward) in daily life. rQoL combines the frequency of engaging in a relevant context (a ‘behavior setting’) with concurrent (positive) affect. High rQoL occurs when the most frequent behavior settings are combined with positive affect or infrequent behavior settings co-occur with low positive affect. Methods: Resampling procedures (Monte Carlo experiments) were applied to assess the reliability of rQoL using various behavior setting definitions under different sampling circumstances, for real or virtual subjects with low-, average- and high contextual variability. Furthermore, resampling was used to assess whether rQoL is a distinct concept from positive affect. Virtual ESM beep datasets were extracted from 1,058 valid ESM observations for virtual and real subjects. Results: Behavior settings defined by Who-What contextual information were most informative. Simulations of at least 100 ESM observations are needed for reliable assessment. Virtual ESM beep datasets of a real subject can be defined by Who-What-Where behavior setting combinations. Large sample sizes are necessary for reliable rQoL assessments, except for subjects with low contextual variability. rQoL is distinct from positive affect. Conclusion: rQoL is a feasible concept. Monte Carlo experiments should be used to assess the reliable implementation of an ESM statistic. Future research in ESM should assess the behavior of summary statistics under different sampling situations. This exploration is especially relevant in clinical implementation, where often only small datasets are available. PMID:29163294
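
    The authors' operationalization is not reproduced here, but the resampling idea can be sketched: draw virtual beep datasets of increasing size and watch the sampling variability of a summary statistic shrink. The 'rQoL proxy' below is a toy stand-in, not the published measure.

    ```python
    # Monte Carlo resampling of simulated ESM beeps to gauge how stable a
    # summary statistic becomes as the virtual dataset grows.
    import numpy as np

    rng = np.random.default_rng(4)

    # Simulated beeps: behavior-setting id and concurrent positive affect (1-7)
    settings = rng.integers(0, 6, 1000)
    affect = np.clip(rng.normal(4.5 + 0.2 * settings, 1.0), 1, 7)

    def rqol_proxy(idx):
        """Toy stand-in: frequency-weighted mean affect across settings."""
        s, a = settings[idx], affect[idx]
        freqs = np.bincount(s, minlength=6) / len(s)
        means = np.array([a[s == j].mean() if (s == j).any() else 0.0
                          for j in range(6)])
        return float(np.sum(freqs * means))

    for n in (25, 100, 400):   # virtual beep dataset sizes
        estimates = [rqol_proxy(rng.integers(0, 1000, n)) for _ in range(500)]
        print(f"n={n:4d}: sd across resamples = {np.std(estimates):.3f}")
    ```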

  5. Standardized quality-assessment system to evaluate pressure ulcer care in the nursing home.

    PubMed

    Bates-Jensen, Barbara M; Cadogan, Mary; Jorge, Jennifer; Schnelle, John F

    2003-09-01

    To demonstrate reliability and feasibility of a standardized protocol to assess and score quality indicators relevant to pressure ulcer (PU) care processes in nursing homes (NHs). Descriptive. Eight NHs. One hundred ninety-one NH residents for whom the PU Resident Assessment Protocol of the Minimum Data Set was initiated. Nine quality indicators (two related to screening and prevention of PU, two focused on assessment, and five addressing management) were scored using medical record data, direct human observation, and wireless thigh monitor observation data. Feasibility and reliability of medical record, observation, and thigh monitor protocols were determined. The percentage of participants who passed each of the indicators, indicating care consistent with practice guidelines, ranged from 0% to 98% across all indicators. In general, participants in NHs passed fewer indicators and had more problems with medical record accuracy before a PU was detected (screening/prevention indicators) than they did once an ulcer was documented (assessment and management indicators). Reliability of the medical record protocol showed kappa statistics ranging from 0.689 to 1.00 and percentage agreement from 80% to 100%. Direct observation protocols yielded kappa statistics of 0.979 and 0.928. Thigh monitor protocols showed kappa statistics ranging from 0.609 to 0.842. Training was variable, with the observation protocol requiring 1 to 2 hours, medical records requiring joint review of 20 charts with average time to complete the review of 20 minutes, and the thigh monitor data requiring 1 week for training in data preparation and interpretation. The standardized quality assessment system generated scores for nine PU quality indicators with good reliability and provided explicit scoring rules that permit reproducible conclusions about PU care. The focus of the indicators on care processes that are under the control of NH staff made the protocol useful for external survey and internal quality improvement purposes, and the thigh monitor observational technology provided a method for monitoring repositioning care processes that were otherwise difficult to monitor and manage.

  6. Constructing a Reward-Related Quality of Life Statistic in Daily Life-a Proof of Concept Study Using Positive Affect.

    PubMed

    Verhagen, Simone J W; Simons, Claudia J P; van Zelst, Catherine; Delespaul, Philippe A E G

    2017-01-01

    Background: Mental healthcare needs person-tailored interventions. Experience Sampling Method (ESM) can provide daily life monitoring of personal experiences. This study aims to operationalize and test a measure of momentary reward-related Quality of Life (rQoL). Intuitively, quality of life improves by spending more time on rewarding experiences. ESM clinical interventions can use this information to coach patients to find a realistic, optimal balance of positive experiences (maximize reward) in daily life. rQoL combines the frequency of engaging in a relevant context (a 'behavior setting') with concurrent (positive) affect. High rQoL occurs when the most frequent behavior settings are combined with positive affect or infrequent behavior settings co-occur with low positive affect. Methods: Resampling procedures (Monte Carlo experiments) were applied to assess the reliability of rQoL using various behavior setting definitions under different sampling circumstances, for real or virtual subjects with low-, average- and high contextual variability. Furthermore, resampling was used to assess whether rQoL is a distinct concept from positive affect. Virtual ESM beep datasets were extracted from 1,058 valid ESM observations for virtual and real subjects. Results: Behavior settings defined by Who-What contextual information were most informative. Simulations of at least 100 ESM observations are needed for reliable assessment. Virtual ESM beep datasets of a real subject can be defined by Who-What-Where behavior setting combinations. Large sample sizes are necessary for reliable rQoL assessments, except for subjects with low contextual variability. rQoL is distinct from positive affect. Conclusion: rQoL is a feasible concept. Monte Carlo experiments should be used to assess the reliable implementation of an ESM statistic. Future research in ESM should assess the behavior of summary statistics under different sampling situations. This exploration is especially relevant in clinical implementation, where often only small datasets are available.

  7. [Study of the reliability in one dimensional size measurement with digital slit lamp microscope].

    PubMed

    Wang, Tao; Qi, Chaoxiu; Li, Qigen; Dong, Lijie; Yang, Jiezheng

    2010-11-01

    To study the reliability of the digital slit lamp microscope as a tool for quantitative analysis of one-dimensional size measurement. Three single-blinded observers acquired and repeatedly measured images of 4.00 mm and 10.00 mm targets on a vernier caliper, simulating the human pupil and corneal diameter, under a China-made digital slit lamp microscope at objective magnifications of 4x, 10x, 16x, 25x, and 40x for the 4.00 mm target and 4x, 10x, and 16x for the 10.00 mm target, respectively. The correctness and precision of the measurements were compared. For the 4.00 mm images measured by the three investigators, the average values fell between 3.98 and 4.06; for the 10.00 mm images, the average values fell within 10.00-10.04. For the 4.00 mm images, except for A4, B25, C16, and C25, significant differences were noted between the measured values and the true value. For the 10.00 mm images, except for A10, statistically significant differences were found between the measured values and the true value. When results for the same size measured at different magnifications by the same investigator were compared, all measurements except investigator A's measurements of the 10.00 mm dimension differed significantly across magnifications. When measurements of the same size at the same magnification were compared among investigators, the measurements of 4.00 mm at 4x magnification showed no significant difference; the remaining results were statistically significant. The coefficients of variation of all measurement results were less than 5%, and as magnification increased, the coefficient of variation decreased. Measurement of one-dimensional size with the digital slit lamp microscope has good reliability, but a reliability analysis should be performed before it is used for quantitative analysis, to reduce systematic errors.

  8. Estimation of lifetime distributions on 1550-nm DFB laser diodes using Monte-Carlo statistic computations

    NASA Astrophysics Data System (ADS)

    Deshayes, Yannick; Verdier, Frederic; Bechou, Laurent; Tregon, Bernard; Danto, Yves; Laffitte, Dominique; Goudard, Jean Luc

    2004-09-01

    High performance and high reliability are two of the most important goals driving the penetration of optical transmission into telecommunication systems ranging from 880 nm to 1550 nm. Lifetime prediction, defined as the time at which a parameter reaches its maximum acceptable shift, remains the main result of reliability estimation for a technology. For optoelectronic emissive components, selection tests and life testing are specifically used for reliability evaluation according to Telcordia GR-468-CORE requirements. This approach is based on the extrapolation of degradation laws, grounded in the physics of failure and in electrical or optical parameters, allowing both a strong reduction in test time and long-term reliability prediction. Unfortunately, in the case of a mature technology, there is growing complexity in calculating average lifetimes and failure rates (FITs) from ageing tests, in particular due to extremely low failure rates. For present laser diode technologies, times to failure tend to exceed 10^6 hours under typical conditions (Popt = 10 mW and T = 80°C). Such ageing tests must be performed on more than 100 components aged for 10,000 hours, mixing different temperature and drive current conditions and leading to acceleration factors above 300-400. These conditions are costly and time consuming and cannot give a complete distribution of times to failure. A new approach consists in using statistical computations to extrapolate the lifetime distribution and failure rates under operating conditions from the physical parameters of experimental degradation laws. In this paper, Distributed Feedback single-mode laser diodes (DFB-LDs) used in 1550 nm telecommunication networks at a 2.5 Gbit/s transfer rate are studied. Electrical and optical parameters were measured before and after ageing tests performed at constant current, according to Telcordia GR-468 requirements. Cumulative failure rates and lifetime distributions are computed using statistical calculations and equations for drift mechanisms versus time fitted from experimental measurements.
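
    A hedged sketch of the statistical computation described; the drift law, its parameter spread, and the 20% failure criterion are assumptions for illustration, not the authors' fitted values:

    ```python
    # Monte Carlo lifetime extrapolation from an assumed degradation law:
    # sample per-device drift rates, compute time to cross the failure limit.
    import numpy as np

    rng = np.random.default_rng(5)
    N = 100_000

    # Assumed linear drift of bias current, I(t) = I0 * (1 + a * t); failure
    # declared when drift reaches 20%; rate a lognormal across devices.
    a = rng.lognormal(mean=np.log(2e-7), sigma=0.5, size=N)  # per hour
    ttf = 0.20 / a                                           # hours to 20% drift

    for q in (0.01, 0.50, 0.99):
        print(f"{q:>4.0%} quantile of TTF: {np.quantile(ttf, q):.3e} h")
    print(f"median TTF ~ {np.median(ttf) / 1e6:.2f} million hours")
    ```

    With the assumed median drift rate, the median time to failure lands near 10^6 hours, the order of magnitude the abstract cites as impractical to demonstrate by direct ageing alone.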

  9. An Examination of the True Reliability of Lower Limb Stiffness Measures During Overground Hopping.

    PubMed

    Diggin, David; Anderson, Ross; Harrison, Andrew J

    2016-06-01

    Evidence suggests reports describing the reliability of leg-spring (kleg) and joint stiffness (kjoint) measures are contaminated by artifacts originating from digital filtering procedures. In addition, the intraday reliability of kleg and kjoint requires investigation. This study examined the effects of experimental procedures on the inter- and intraday reliability of kleg and kjoint. Thirty-two participants completed 2 trials of single-legged hopping at 1.5, 2.2, and 3.0 Hz at the same time of day across 3 days. On the final test day a fourth experimental bout took place 6 hours before or after participants' typical testing time. Kinematic and kinetic data were collected throughout. Stiffness was calculated using models of kleg and kjoint. Classifications of measurement agreement were established using thresholds for absolute and relative reliability statistics. Results illustrated that kleg and kankle exhibited strong agreement. In contrast, kknee and khip demonstrated weak-to-moderate consistency. Results suggest limits in kjoint reliability persist despite employment of appropriate filtering procedures. Furthermore, diurnal fluctuations in lower-limb muscle-tendon stiffness exhibit little effect on intraday reliability. The present findings support the existence of kleg as an attractor state during hopping, achieved through fluctuations in kjoint variables. Limits to kjoint reliability appear to represent biological function rather than measurement artifact.

  10. TCOPPE school environmental audit tool: assessing safety and walkability of school environments.

    PubMed

    Lee, Chanam; Kim, Hyung Jin; Dowdy, Diane M; Hoelscher, Deanna M; Ory, Marcia G

    2013-09-01

    Several environmental audit instruments have been developed for assessing streets, parks and trails, but none for schools. This paper introduces a school audit tool that includes 3 subcomponents: 1) street audit, 2) school site audit, and 3) map audit. It presents the conceptual basis and the development process of this instrument, and the methods and results of the reliability assessments. Reliability tests were conducted by 2 trained auditors on 12 study schools (high-low income and urban-suburban-rural settings). Kappa statistics (categorical, factual items) and ICC (Likert-scale, perceptual items) were used for a) interrater, b) test-retest, and c) peak vs. off-peak hour reliability tests. For the interrater reliability test, the average Kappa was 0.839 and the ICC was 0.602. For the test-retest reliability, the average Kappa was 0.903 and the ICC was 0.774. The peak vs. off-peak reliability was 0.801. Rural schools showed the most consistent results in the peak vs. off-peak and test-retest assessments. For interrater tests, urban schools showed the highest ICC, and rural schools showed the highest Kappa. Most items achieved moderate to high levels of reliability in all study schools. With proper training, this audit can be used to assess school environments reliably for research, outreach, and policy-support purposes.

  11. Block observations of neighbourhood physical disorder are associated with neighbourhood crime, firearm injuries and deaths, and teen births.

    PubMed

    Wei, Evelyn; Hipwell, Alison; Pardini, Dustin; Beyers, Jennifer M; Loeber, Rolf

    2005-10-01

    To provide reliability information for a brief observational measure of physical disorder and determine its relation with neighbourhood level crime and health variables after controlling for census based measures of concentrated poverty and minority concentration. Psychometric analysis of block observation data comprising a brief measure of neighbourhood physical disorder, and cross sectional analysis of neighbourhood physical disorder, neighbourhood crime and birth statistics, and neighbourhood level poverty and minority concentration. Pittsburgh, Pennsylvania, US (2000 population=334 563). Pittsburgh neighbourhoods (n=82) and their residents (as reflected in neighbourhood level statistics). The physical disorder index showed adequate reliability and validity and was associated significantly with rates of crime, firearm injuries and homicides, and teen births, while controlling for concentrated poverty and minority population. This brief measure of neighbourhood physical disorder may help increase our understanding of how community level factors reflect health and crime outcomes.

  12. Earthquake forecasting during the complex Amatrice-Norcia seismic sequence

    PubMed Central

    Marzocchi, Warner; Taroni, Matteo; Falcone, Giuseppe

    2017-01-01

    Earthquake forecasting is the ultimate challenge for seismologists, because it condenses the scientific knowledge about the earthquake occurrence process, and it is an essential component of any sound risk mitigation planning. It is commonly assumed that, in the short term, trustworthy earthquake forecasts are possible only for typical aftershock sequences, where the largest shock is followed by many smaller earthquakes that decay with time according to the Omori power law. We show that the current Italian operational earthquake forecasting system issued statistically reliable and skillful space-time-magnitude forecasts of the largest earthquakes during the complex 2016–2017 Amatrice-Norcia sequence, which is characterized by several bursts of seismicity and a significant deviation from the Omori law. This capability to deliver statistically reliable forecasts is an essential component of any program to assist public decision-makers and citizens in the challenging risk management of complex seismic sequences. PMID:28924610

  13. Earthquake forecasting during the complex Amatrice-Norcia seismic sequence.

    PubMed

    Marzocchi, Warner; Taroni, Matteo; Falcone, Giuseppe

    2017-09-01

    Earthquake forecasting is the ultimate challenge for seismologists, because it condenses the scientific knowledge about the earthquake occurrence process, and it is an essential component of any sound risk mitigation planning. It is commonly assumed that, in the short term, trustworthy earthquake forecasts are possible only for typical aftershock sequences, where the largest shock is followed by many smaller earthquakes that decay with time according to the Omori power law. We show that the current Italian operational earthquake forecasting system issued statistically reliable and skillful space-time-magnitude forecasts of the largest earthquakes during the complex 2016-2017 Amatrice-Norcia sequence, which is characterized by several bursts of seismicity and a significant deviation from the Omori law. This capability to deliver statistically reliable forecasts is an essential component of any program to assist public decision-makers and citizens in the challenging risk management of complex seismic sequences.

  14. [Child nutritional status in contexts of urban poverty: a reliable indicator of family health?].

    PubMed

    Huergo, Juliana; Casabona, Eugenia Lourdes

    2016-03-01

    This work questions the premise that the nutritional status of children under six years of age is a reliable indicator of family health. To do so, a research strategy based in case studies was carried out, following a qualitative design (participant observation and semistructured interviews using intentional sampling) and framed within the interpretivist paradigm. The anthropometric measurements of 20 children under six years of age attending the local Child Care Center in Villa La Tela, Córdoba were evaluated. Nutritional status was understood as an object that includes socially determined biological processes, and was therefore posited analytically as a cross between statistical data and its social determination. As a statistic, child nutritional status is merely descriptive; to assist in the understanding of its social determination, it must be placed in dialectical relationship with the spheres of sociability proposed to analyze the reproduction of health problems.

  15. Development and Evaluation of a Questionnaire for Measuring Suboptimal Health Status in Urban Chinese

    PubMed Central

    Yan, Yu-Xiang; Liu, You-Qin; Li, Man; Hu, Pei-Feng; Guo, Ai-Min; Yang, Xing-Hua; Qiu, Jing-Jun; Yang, Shan-Shan; Shen, Jian; Zhang, Li-Ping; Wang, Wei

    2009-01-01

    Background: Suboptimal health status (SHS) is characterized by ambiguous health complaints, general weakness, and lack of vitality, and has become a new public health challenge in China. It is believed to be a subclinical, reversible stage of chronic disease. Studies of intervention and prognosis for SHS are expected to become increasingly important. Consequently, a reliable and valid instrument to assess SHS is essential. We developed and evaluated a questionnaire for measuring SHS in urban Chinese. Methods: Focus group discussions and a literature review provided the basis for the development of the questionnaire. Questionnaire validity and reliability were evaluated in a small pilot study and in a larger cross-sectional study of 3000 individuals. Analyses included tests for reliability and internal consistency, exploratory and confirmatory factor analysis, and tests for discriminative ability and convergent validity. Results: The final questionnaire included 25 items on SHS (SHSQ-25), and encompassed 5 subscales: fatigue, the cardiovascular system, the digestive tract, the immune system, and mental status. Overall, 2799 of 3000 participants completed the questionnaire (93.3%). Test-retest reliability coefficients of individual items ranged from 0.89 to 0.98. Item-subscale correlations ranged from 0.51 to 0.72, and Cronbach’s α was 0.70 or higher for all subscales. Factor analysis established 5 distinct domains, as conceptualized in our model. One-way ANOVA showed statistically significant differences in scale scores between 3 occupation groups; these included total scores and subscores (P < 0.01). The correlation between the SHS scores and experienced stress was statistically significant (r = 0.57, P < 0.001). Conclusions: The SHSQ-25 is a reliable and valid instrument for measuring sub-health status in urban Chinese. PMID:19749497

  16. Validity and reliability of an application review process using dedicated reviewers in one stage of a multi-stage admissions model.

    PubMed

    Zeeman, Jacqueline M; McLaughlin, Jacqueline E; Cox, Wendy C

    2017-11-01

    With increased emphasis placed on non-academic skills in the workplace, a need exists to identify an admissions process that evaluates these skills. This study assessed the validity and reliability of an application review process involving three dedicated application reviewers in a multi-stage admissions model. A multi-stage admissions model was utilized during the 2014-2015 admissions cycle. After advancing through the academic review, each application was independently reviewed by two dedicated application reviewers utilizing a six-construct rubric (written communication, extracurricular and community service activities, leadership experience, pharmacy career appreciation, research experience, and resiliency). Rubric scores were extrapolated to a three-tier ranking to select candidates for on-site interviews. Kappa statistics were used to assess interrater reliability. A three-facet Many-Facet Rasch Model (MFRM) determined reviewer severity, candidate suitability, and rubric construct difficulty. The kappa statistic for candidates' tier rank score (n = 388 candidates) was 0.692 with a perfect agreement frequency of 84.3%. There was substantial interrater reliability between reviewers for the tier ranking (kappa: 0.654-0.710). Highest construct agreement occurred in written communication (kappa: 0.924-0.984). A three-facet MFRM analysis explained 36.9% of variance in the ratings, with 0.06% reflecting application reviewer scoring patterns (i.e., severity or leniency), 22.8% reflecting candidate suitability, and 14.1% reflecting construct difficulty. Utilization of dedicated application reviewers and a defined tiered rubric provided a valid and reliable method to effectively evaluate candidates during the application review process. These analyses provide insight into opportunities for improving the application review process among schools and colleges of pharmacy. Copyright © 2017 Elsevier Inc. All rights reserved.
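
    For illustration, the interrater kappa computation described above reduces to a few lines (hypothetical tier ranks, not the study's data):

```python
from sklearn.metrics import cohen_kappa_score

reviewer_a = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2]  # hypothetical three-tier ranks
reviewer_b = [1, 2, 3, 3, 1, 2, 3, 2, 1, 2]

kappa = cohen_kappa_score(reviewer_a, reviewer_b)
agreement = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / len(reviewer_a)
print(f"kappa = {kappa:.3f}, perfect agreement frequency = {agreement:.1%}")
```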

  17. Reliability and Repeatability of Cone Density Measurements in Patients With Stargardt Disease and RPGR-Associated Retinopathy.

    PubMed

    Tanna, Preena; Kasilian, Melissa; Strauss, Rupert; Tee, James; Kalitzeos, Angelos; Tarima, Sergey; Visotcky, Alexis; Dubra, Alfredo; Carroll, Joseph; Michaelides, Michel

    2017-07-01

    To assess reliability and repeatability of cone density measurements by using confocal and (nonconfocal) split-detector adaptive optics scanning light ophthalmoscopy (AOSLO) imaging. It will be determined whether cone density values are significantly different between modalities in Stargardt disease (STGD) and retinitis pigmentosa GTPase regulator (RPGR)-associated retinopathy. Twelve patients with STGD (aged 9-52 years) and eight with RPGR-associated retinopathy (aged 11-31 years) were imaged using both confocal and split-detector AOSLO simultaneously. Four graders manually identified cone locations in each image that were used to calculate local densities. Each imaging modality was evaluated independently. The data set consisted of 1584 assessments of 99 STGD images (each image in two modalities and four graders who graded each image twice) and 928 RPGR assessments of 58 images (each image in two modalities and four graders who graded each image twice). For STGD assessments the reliability for confocal and split-detector AOSLO was 67.9% and 95.9%, respectively, and the repeatability was 71.2% and 97.3%, respectively. The differences in the measured cone density values between modalities were statistically significant for one grader. For RPGR assessments the reliability for confocal and split-detector AOSLO was 22.1% and 88.5%, respectively, and repeatability was 63.2% and 94.5%, respectively. The differences in cone density between modalities were statistically significant for all graders. Split-detector AOSLO greatly improved the reliability and repeatability of cone density measurements in both disorders and will be valuable for natural history studies and clinical trials using AOSLO. However, it appears that these indices may be disease dependent, implying the need for similar investigations in other conditions.

  18. Reliability and Repeatability of Cone Density Measurements in Patients With Stargardt Disease and RPGR-Associated Retinopathy

    PubMed Central

    Tanna, Preena; Kasilian, Melissa; Strauss, Rupert; Tee, James; Kalitzeos, Angelos; Tarima, Sergey; Visotcky, Alexis; Dubra, Alfredo; Carroll, Joseph; Michaelides, Michel

    2017-01-01

    Purpose: To assess reliability and repeatability of cone density measurements by using confocal and (nonconfocal) split-detector adaptive optics scanning light ophthalmoscopy (AOSLO) imaging. It will be determined whether cone density values are significantly different between modalities in Stargardt disease (STGD) and retinitis pigmentosa GTPase regulator (RPGR)–associated retinopathy. Methods: Twelve patients with STGD (aged 9–52 years) and eight with RPGR-associated retinopathy (aged 11–31 years) were imaged using both confocal and split-detector AOSLO simultaneously. Four graders manually identified cone locations in each image that were used to calculate local densities. Each imaging modality was evaluated independently. The data set consisted of 1584 assessments of 99 STGD images (each image in two modalities and four graders who graded each image twice) and 928 RPGR assessments of 58 images (each image in two modalities and four graders who graded each image twice). Results: For STGD assessments the reliability for confocal and split-detector AOSLO was 67.9% and 95.9%, respectively, and the repeatability was 71.2% and 97.3%, respectively. The differences in the measured cone density values between modalities were statistically significant for one grader. For RPGR assessments the reliability for confocal and split-detector AOSLO was 22.1% and 88.5%, respectively, and repeatability was 63.2% and 94.5%, respectively. The differences in cone density between modalities were statistically significant for all graders. Conclusions: Split-detector AOSLO greatly improved the reliability and repeatability of cone density measurements in both disorders and will be valuable for natural history studies and clinical trials using AOSLO. However, it appears that these indices may be disease dependent, implying the need for similar investigations in other conditions. PMID:28738413

  19. Reliability and validity of measures used in assessing dental anxiety in 5- to 15-year-old Croatian children.

    PubMed

    Majstorovic, M; Veerkamp, J S; Skrinjaric, I

    2003-12-01

    The aim of the study was to evaluate reliability and validity of different questionnaires and predict related causes, as concomitant factors in assessing different aspects of children's dental anxiety. Children were interviewed on dental anxiety, dispositional risk factors and satisfaction with the dentist after dental treatment had been accomplished. Parents were interviewed on dental anxiety as well. The study population included 165 children (91 boys) aged 5 to 15 years, referred to a university dental clinic by general dental practitioners because of a history of fear and uncooperative behaviour during previous dental visits. Children were treated by two dentists, both experienced in treating fearful children. Statistical analysis was performed in Statistics for Windows, Release 5.5 and Release 7.5. Pearson's correlation coefficients were calculated for validity and Cronbach alpha for reliability of the measures. The Spearman-Brown prophecy formula was used for correction of the alpha scores. The children's total average CFSS-DS score was 27.02, with no significant difference with respect to gender. The highest Cronbach alpha scores regarding reliability were obtained for the S-DAI, the CFSS-DS and the PDAS. Pearson's correlations regarding validity presented significant correlations between the CMFQ, the CDAS and the S-DAI, between the OAS, the CDAS and the S-DAI, as well as between the OAS and the DVSS-SV. Previous negative medical experience had a significant influence on children's dental anxiety, supporting Rachman's conditioning theory. Anxious children were more likely to show behaviour problems (aggression) and were more introverted in expressing their judgement regarding the dentist. Both the S-DAI and the CFSS-DS, which were standardized in the Croatian population sample, showed the highest reliability in assessment of children's dental anxiety.

  20. Measuring the food and built environments in urban centres: reliability and validity of the EURO-PREVOB Community Questionnaire.

    PubMed

    Pomerleau, J; Knai, C; Foster, C; Rutter, H; Darmon, N; Derflerova Brazdova, Z; Hadziomeragic, A F; Pekcan, G; Pudule, I; Robertson, A; Brunner, E; Suhrcke, M; Gabrijelcic Blenkus, M; Lhotska, L; Maiani, G; Mistura, L; Lobstein, T; Martin, B W; Elinder, L S; Logstrup, S; Racioppi, F; McKee, M

    2013-03-01

    The authors designed an instrument to measure objectively aspects of the built and food environments in urban areas, the EURO-PREVOB Community Questionnaire, within the EU-funded project 'Tackling the social and economic determinants of nutrition and physical activity for the prevention of obesity across Europe' (EURO-PREVOB). This paper describes its development, reliability, validity, feasibility and relevance to public health and obesity research. The Community Questionnaire is designed to measure key aspects of the food and built environments in urban areas of varying levels of affluence or deprivation, within different countries. The questionnaire assesses (1) the food environment and (2) the built environment. Pilot tests of the EURO-PREVOB Community Questionnaire were conducted in five to 10 purposively sampled urban areas of different socio-economic status in each of Ankara, Brno, Marseille, Riga, and Sarajevo. Inter-rater reliability was compared between two pairs of fieldworkers in each city centre using three methods: inter-observer agreement (IOA), kappa statistics, and intraclass correlation coefficients (ICCs). Data were collected successfully in all five cities. Overall reliability of the EURO-PREVOB Community Questionnaire was excellent (IOA > 0.87, ICCs > 0.91, kappa statistics > 0.7). However, assessment of certain aspects of the quality of the built environment yielded slightly lower IOA coefficients than the quantitative aspects. The EURO-PREVOB Community Questionnaire was found to be a reliable and practical observational tool for measuring differences in community-level data on environmental factors that can impact on dietary intake and physical activity. The next step is to evaluate its predictive power by collecting behavioural and anthropometric data relevant to obesity and its determinants. Copyright © 2013 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.

  1. Lumbar lordosis and sacral slope in lumbar spinal stenosis: standard values and measurement accuracy.

    PubMed

    Bredow, J; Oppermann, J; Scheyerer, M J; Gundlfinger, K; Neiss, W F; Budde, S; Floerkemeier, T; Eysel, P; Beyer, F

    2015-05-01

    Radiological study. To assess standard values, intra- and interobserver reliability and reproducibility of sacral slope (SS) and lumbar lordosis (LL), and the correlation of these parameters, in patients with lumbar spinal stenosis (LSS). Anteroposterior and lateral X-rays of the lumbar spine of 102 patients with LSS were included in this retrospective, radiologic study. Measurements of SS and LL were carried out by five examiners. Intraobserver correlation and correlation between LL and SS were calculated with Pearson's r linear correlation coefficient, and intraclass correlation coefficients (ICC) were calculated for inter- and intraobserver reliability. In addition, patients were examined in subgroups with respect to previous surgery and the current therapy. Lumbar lordosis averaged 45.6° (range 2.5°-74.9°; SD 14.2°); intraobserver correlation was between Pearson r = 0.93 and 0.98. The measurement of SS averaged 35.3° (range 13.8°-66.9°; SD 9.6°); intraobserver correlation was between Pearson r = 0.89 and 0.96. Intraobserver reliability (ICC) ranged from 0.966 to 0.992 for LL measurements and from 0.944 to 0.983 for SS measurements. The interobserver reliability ICC was 0.944 for LL and 0.990 for SS. Correlation between LL and SS averaged r = 0.79. No statistically significant differences were observed between the analyzed subgroups. Manual measurement of LL and SS in patients with LSS on lateral radiographs is easily performed with excellent intra- and interobserver reliability. Correlation between LL and SS is very high. Differences between patients with and without previous decompression were not statistically significant.
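
    As background for readers, one common ICC variant, the one-way random-effects ICC(1), can be computed directly from the ANOVA mean squares; a minimal sketch with hypothetical paired angle readings (not the study's data):

```python
import numpy as np

# Rows: patients; columns: two examiners' lumbar lordosis readings (degrees).
ratings = np.array([
    [44.0, 46.5], [52.0, 51.0], [38.5, 40.0], [61.0, 59.5],
    [47.0, 47.5], [33.0, 35.5], [55.5, 54.0], [42.0, 43.0],
])
n, k = ratings.shape
grand = ratings.mean()
# Between-subjects and within-subjects mean squares from one-way ANOVA.
msb = k * ((ratings.mean(axis=1) - grand) ** 2).sum() / (n - 1)
msw = ((ratings - ratings.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))
icc1 = (msb - msw) / (msb + (k - 1) * msw)
print(f"ICC(1) = {icc1:.3f}")
```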

  2. Test-retest reliability of high angular resolution diffusion imaging acquisition within medial temporal lobe connections assessed via tract based spatial statistics, probabilistic tractography and a novel graph theory metric.

    PubMed

    Kuhn, T; Gullett, J M; Nguyen, P; Boutzoukas, A E; Ford, A; Colon-Perez, L M; Triplett, W; Carney, P R; Mareci, T H; Price, C C; Bauer, R M

    2016-06-01

    This study examined the reliability of high angular resolution diffusion imaging (HARDI) data collected on a single individual across several sessions using the same scanner. HARDI data was acquired for one healthy adult male at the same time of day on ten separate days across a one-month period. Environmental factors (e.g. temperature) were controlled across scanning sessions. Tract Based Spatial Statistics (TBSS) was used to assess session-to-session variability in measures of diffusion, fractional anisotropy (FA) and mean diffusivity (MD). To address reliability within specific structures of the medial temporal lobe (MTL; the focus of an ongoing investigation), probabilistic tractography segmented the Entorhinal cortex (ERc) based on connections with Hippocampus (HC), Perirhinal (PRc) and Parahippocampal (PHc) cortices. Streamline tractography generated edge weight (EW) metrics for the aforementioned ERc connections and, as comparison regions, connections between left and right rostral and caudal anterior cingulate cortex (ACC). Coefficients of variation (CoV) were derived for the surface area and volumes of these ERc connectivity-defined regions (CDR) and for EW across all ten scans, expecting that scan-to-scan reliability would yield low CoVs. TBSS revealed no significant variation in FA or MD across scanning sessions. Probabilistic tractography successfully reproduced histologically-verified adjacent medial temporal lobe circuits. Tractography-derived metrics displayed larger ranges of scan-to-scan variability. Connections involving HC displayed greater variability than metrics of connection between other investigated regions. By confirming the test-retest reliability of HARDI data acquisition, support for the validity of significant results derived from diffusion data can be obtained.

  3. Manual unloading of the lumbar spine: can it identify immediate responders to mechanical traction in a low back pain population? A study of reliability and criterion referenced predictive validity

    PubMed Central

    Swanson, Brian T.; Riley, Sean P.; Cote, Mark P.; Leger, Robin R.; Moss, Isaac L.; Carlos, John

    2016-01-01

    Background: To date, no research has examined the reliability or predictive validity of manual unloading tests of the lumbar spine to identify potential responders to lumbar mechanical traction. Purpose: To determine: (1) the intra and inter-rater reliability of a manual unloading test of the lumbar spine and (2) the criterion referenced predictive validity for the manual unloading test. Methods: Ten volunteers with low back pain (LBP) underwent a manual unloading test to establish reliability. In a separate procedure, 30 consecutive patients with LBP (age 50·86±11·51) were assessed for pain in their most provocative standing position (visual analog scale (VAS) 49·53±25·52 mm). Patients were assessed with a manual unloading test in their most provocative position followed by a single application of intermittent mechanical traction. Post traction, pain in the provocative position was reassessed and utilized as the outcome criterion. Results: The test of unloading demonstrated substantial intra and inter-rater reliability, K = 1·00, P = 0·002 and K = 0·737, P = 0·001, respectively. There were statistically significant within group differences for pain response following traction for patients with a positive manual unloading test (P<0·001), while patients with a negative manual unloading test did not demonstrate a statistically significant change (P>0·05). There were significant between group differences for proportion of responders to traction based on manual unloading response (P = 0·031), and manual unloading response demonstrated a moderate to strong relationship with traction response, Phi = 0·443, P = 0·015. Discussion and conclusion: The manual unloading test appears to be a reliable test and has a moderate to strong correlation with pain relief that exceeds minimal clinically important difference (MCID) following traction, supporting the validity of this test. PMID:27559274

  4. Bayes Analysis and Reliability Implications of Stress-Rupture Testing a Kevlar/Epoxy COPV Using Temperature and Pressure Acceleration

    NASA Technical Reports Server (NTRS)

    Phoenix, S. Leigh; Kezirian, Michael T.; Murthy, Pappu L. N.

    2009-01-01

    Composite Overwrapped Pressure Vessels (COPVs) that have survived a long service time under pressure generally must be recertified before service is extended. Flight certification is dependent on the reliability analysis to quantify the risk of stress rupture failure in existing flight vessels. Full certification of this reliability model would require a statistically significant number of lifetime tests to be performed and is impractical given the cost and limited flight hardware for certification testing purposes. One approach to confirm the reliability model is to perform a stress rupture test on a flight COPV. Currently, testing of such a Kevlar49 (Dupont)/epoxy COPV is nearing completion. The present paper focuses on a Bayesian statistical approach to analyze the possible failure time results of this test and to assess the implications in choosing between possible model parameter values that in the past have had significant uncertainty. The key uncertain parameters in this case are the actual fiber stress ratio at operating pressure, and the Weibull shape parameter for lifetime; the former has been uncertain due to ambiguities in interpreting the original and a duplicate burst test. The latter has been uncertain due to major differences between COPVs in the database and the actual COPVs in service. Any information obtained that clarifies and eliminates uncertainty in these parameters will have a major effect on the predicted reliability of the service COPVs going forward. The key result is that the longer the vessel survives, the more likely the more optimistic stress ratio model is correct. At the time of writing, the resulting effect on predicted future reliability is dramatic, increasing it by about one "nine," that is, reducing the predicted probability of failure by an order of magnitude. However, testing one vessel does not change the uncertainty on the Weibull shape parameter for lifetime since testing several vessels would be necessary.
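
    A hedged sketch of the core Bayesian update described above (the two candidate models, their Weibull parameters, and the survival time are invented placeholders; the actual analysis is considerably more detailed):

```python
from scipy import stats

# Two competing stress-ratio hypotheses imply different Weibull lifetime
# distributions for the test vessel (shape, scale in test years; assumed).
prior = {"optimistic": 0.5, "pessimistic": 0.5}
models = {"optimistic": (1.2, 40.0), "pessimistic": (1.2, 6.0)}

t_survived = 2.0   # vessel still intact after 2 test years (hypothetical)

# Likelihood of a right-censored (surviving) observation is the survival
# function; Bayes' rule then reweights the model probabilities.
likelihood = {name: stats.weibull_min.sf(t_survived, c, scale=s)
              for name, (c, s) in models.items()}
evidence = sum(prior[m] * likelihood[m] for m in prior)
posterior = {m: prior[m] * likelihood[m] / evidence for m in prior}
print(posterior)  # the longer the survival, the more the optimistic model gains
```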

  5. Scoping study on trends in the economic value of electricity reliability to the U.S. economy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eto, Joseph; Koomey, Jonathan; Lehman, Bryan

    During the past three years, working with more than 150 organizations representing public and private stakeholders, EPRI has developed the Electricity Technology Roadmap. The Roadmap identifies several major strategic challenges that must be successfully addressed to ensure a sustainable future in which electricity continues to play an important role in economic growth. Articulation of these anticipated trends and challenges requires a detailed understanding of the role and importance of reliable electricity in different sectors of the economy. This report is intended to contribute to that understanding by analyzing key aspects of trends in the economic value of electricity reliability in the U.S. economy. We first present a review of recent literature on electricity reliability costs. Next, we describe three distinct end-use approaches for tracking trends in reliability needs: (1) an analysis of the electricity-use requirements of office equipment in different commercial sectors; (2) an examination of the use of aggregate statistical indicators of industrial electricity use and economic activity to identify high reliability-requirement customer market segments; and (3) a case study of cleanrooms, which is a cross-cutting market segment known to have high reliability requirements. Finally, we present insurance industry perspectives on electricity reliability as an example of a financial tool for addressing customers' reliability needs.

  6. Inter-session reliability and sex-related differences in hamstrings total reaction time, pre-motor time and motor time during eccentric isokinetic contractions in recreational athletes.

    PubMed

    Ayala, Francisco; De Ste Croix, Mark; Sainz de Baranda, Pilar; Santonja, Fernando

    2014-04-01

    The purposes were twofold: (a) to ascertain the inter-session reliability of hamstrings total reaction time, pre-motor time and motor time; and (b) to examine sex-related differences in the hamstrings reaction times profile. Twenty-four men and 24 women completed the study. Biceps femoris and semitendinosus total reaction time, pre-motor time and motor time measured during eccentric isokinetic contractions were recorded on three different occasions. Inter-session reliability was examined through typical percentage error (CVTE), percentage change in the mean (CM) and intraclass correlations (ICC). For both biceps femoris and semitendinosus, total reaction time, pre-motor time and motor time measures demonstrated moderate inter-session reliability (CVTE<10%; CM<3%; ICC>0.7). The results also indicated that, although not statistically significant, women reported consistently longer hamstrings total reaction time (23.5ms), pre-motor time (12.7ms) and motor time (7.5ms) values than men. Therefore, an observed change larger than 5%, 9% and 8% for total reaction time, pre-motor time and motor time respectively from baseline scores after performing a training program would indicate that a real change was likely. Furthermore, while not statistically significant, sex differences were noted in the hamstrings reaction time profile which may play a role in the greater incidence of ACL injuries in women. Copyright © 2013 Elsevier Ltd. All rights reserved.
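
    For reference, the typical percentage error (CVTE) and percentage change in the mean (CM) for two sessions can be computed as follows (hypothetical reaction times, not the study's data; typical error taken as the SD of the inter-session differences divided by sqrt(2)):

```python
import numpy as np

session1 = np.array([168.0, 175.0, 160.0, 182.0, 171.0, 166.0])  # ms, session 1
session2 = np.array([172.0, 170.0, 158.0, 188.0, 169.0, 170.0])  # ms, session 2

diff = session2 - session1
typical_error = diff.std(ddof=1) / np.sqrt(2)
cvte = 100 * typical_error / np.concatenate([session1, session2]).mean()
cm = 100 * diff.mean() / session1.mean()
print(f"CVTE = {cvte:.1f}%, CM = {cm:.1f}%")
```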

  7. FLiGS Score: A New Method of Outcome Assessment for Lip Carcinoma–Treated Patients

    PubMed Central

    Grassi, Rita; Toia, Francesca; Di Rosa, Luigi; Cordova, Adriana

    2015-01-01

    Background: Lip cancer and its treatment have considerable functional and cosmetic effects with resultant nutritional and physical detriments. As we continue to investigate new treatment regimens, we are simultaneously required to assess postoperative outcomes to design interventions that lessen the adverse impact of this disease process. We wish to introduce the Functional Lip Glasgow Scale (FLiGS) score as a new method of outcome assessment to measure the effect of lip cancer and its treatment on patients’ daily functioning. Methods: Fifty patients affected by lip squamous cell carcinoma were recruited between 2009 and 2013. Patients were asked to fill in the FLiGS questionnaire before surgery and 1 month, 6 months, and 1 year after surgery. The subscores were used to calculate a total FLiGS score of global oral disability. Statistical analysis was performed to test validity and reliability. Results: FLiGS scores improved significantly from preoperative to 12-month postoperative values (P < 0.001). Statistical evidence of validity was provided through the Spearman correlation coefficient (rs), which was >0.30 for all surveys (P < 0.001). FLiGS score reliability was shown through examination of internal consistency and test-retest reliability. Conclusions: The FLiGS score is a simple way of assessing functional impairment related to lip cancer before and after surgery; it is sensitive, valid, reliable, and clinically relevant: it provides useful information to orient the physician in the postoperative management and in the rehabilitation program. PMID:26034652

  8. Cross-cultural adaptation and validation of the Injury-Psychological Readiness to Return to Sport scale to Persian language.

    PubMed

    Naghdi, Soofia; Nakhostin Ansari, Noureddin; Farhadi, Yasaman; Ebadi, Safoora; Entezary, Ebrahim; Glazer, Douglas

    2016-10-01

    The aim of the present study was to develop and provide validation statistics for the Persian Injury-Psychological Readiness to Return to Sport scale (I-PRRS) following a cross-sectional and prospective cohort study design. The I-PRRS was forward/back-translated and culturally adapted into Persian language. The Persian I-PRRS was administered to 100 injured athletes (93 male; age 26.0 ± 5.6 years; time since injury 4.84 ± 6.4 months) and 50 healthy athletes (36 male; mean age 25.7 ± 6.0 years). The Persian I-PRRS was re-administered to 50 injured athletes at 1 week to examine test-retest reliability. There were no floor or ceiling effects confirming the content validity of the Persian I-PRRS. The internal consistency reliability was good. Excellent test-retest reliability and agreement were demonstrated. The statistically significant difference in Persian I-PRRS total scores between the injured athletes and healthy athletes provides evidence of discriminative validity. The Persian I-PRRS total scores were positively correlated with the Farsi Mood Scale (FARMS) total scores, showing construct validity. The principal component analysis indicated a two-factor solution consisting of "Confidence to play" and "Confidence in the injured body part and skill level". The Persian I-PRRS showed excellent reliability and validity and can be used to assess injured athletes' psychological readiness to return to sport among Persian-speaking populations.

  9. Turkish Version of Kolcaba's Immobilization Comfort Questionnaire: A Validity and Reliability Study.

    PubMed

    Tosun, Betül; Aslan, Özlem; Tunay, Servet; Akyüz, Aygül; Özkan, Hüseyin; Bek, Doğan; Açıksöz, Semra

    2015-12-01

    The purpose of this study was to determine the validity and reliability of the Turkish version of the Immobilization Comfort Questionnaire (ICQ). The sample used in this methodological study consisted of 121 patients undergoing lower extremity arthroscopy in a training and research hospital. The validity study of the questionnaire assessed language validity, structural validity and criterion validity. Structural validity was evaluated via exploratory factor analysis. Criterion validity was evaluated by assessing the correlation between the visual analog scale (VAS) scores (i.e., the comfort and pain VAS scores) and the ICQ scores using Spearman's correlation test. The Kaiser-Meyer-Olkin coefficient and Bartlett's test of sphericity were used to determine the suitability of the data for factor analysis. Internal consistency was evaluated to determine reliability. The data were analyzed with SPSS version 15.00 for Windows. Descriptive statistics were presented as frequencies, percentages, means and standard deviations. A p value ≤ .05 was considered statistically significant. A moderate positive correlation was found between the ICQ scores and the VAS comfort scores; a moderate negative correlation was found between the ICQ and the VAS pain measures in the criterion validity analysis. Cronbach α values of .75 and .82 were found for the first and second measurements, respectively. The findings of this study reveal that the ICQ is a valid and reliable tool for assessing the comfort of patients in Turkey who are immobilized because of lower extremity orthopedic problems. Copyright © 2015. Published by Elsevier B.V.

  10. Cultural Adaptation of the Cardiff Acne Disability Index to a Hindi Speaking Population: A Pilot Study.

    PubMed

    Gupta, Aayush; Sharma, Yugal K; Dash, K; Verma, Sampurna

    2015-01-01

    Acne vulgaris is known to impair many aspects of the quality of life (QoL) of its patients. To translate the Cardiff Acne Disability Index (CADI) from English into Hindi and to assess its validity and reliability in Hindi speaking patients with acne from India. The Hindi version of CADI, translated and linguistically validated as per published international guidelines, along with a previously translated Hindi version of the dermatology life quality index (DLQI) and a demographic questionnaire, were administered to acne patients. The internal consistency reliability of the Hindi version of CADI and its concurrent validity were assessed by Cronbach's alpha coefficient and Spearman's correlation coefficient, respectively. Construct validity was examined by factor analysis. Statistical analysis was carried out using the Statistical Package for the Social Sciences (SPSS) version 20 (SPSS Inc., Chicago, IL, USA) for Windows. One hundred Hindi speaking patients with various grades of acne participated in the study. The Hindi version of CADI showed high internal consistency reliability (Cronbach's alpha coefficient = 0.722). The mean item-to-total correlation coefficient ranged from 0.502 to 0.760. Concurrent validity of the scale was supported by a significant correlation with the Hindi DLQI. Factor analysis revealed the presence of two dimensions underlying the factor structure of the scale. The Hindi CADI is equivalent to the original English version and constitutes a reliable and valid tool for clinical assessment of the impact of acne on QoL.

  11. Validity and reliability of three definitions of hip osteoarthritis: cross sectional and longitudinal approach.

    PubMed

    Reijman, M; Hazes, J M W; Pols, H A P; Bernsen, R M D; Koes, B W; Bierma-Zeinstra, S M A

    2004-11-01

    To compare the reliability and validity in a large open population of three frequently used radiological definitions of hip osteoarthritis (OA): Kellgren and Lawrence grade, minimal joint space (MJS), and Croft grade; and to investigate whether the validity of the three definitions of hip OA is sex dependent. Participants from the Rotterdam study (aged ≥ 55 years, n = 3585) were evaluated. The inter-rater reliability was tested in a random set of 148 x rays. The validity was expressed as the ability to identify patients who show clinical symptoms of hip OA (construct validity) and as the ability to predict total hip replacement (THR) at follow up (predictive validity). Inter-rater reliability was similar for the Kellgren and Lawrence grade and MJS (kappa statistics 0.68 and 0.62, respectively) but lower for Croft's grade (kappa statistic, 0.51). The Kellgren and Lawrence grade and MJS showed the strongest associations with clinical symptoms of hip OA. Sex appeared to be an effect modifier for the Kellgren and Lawrence and MJS definitions, women showing a stronger association between grading and symptoms than men. However, the sex dependency was attributed to differences in height between women and men. The Kellgren and Lawrence grade showed the highest predictive value for THR at follow up. Based on these findings, Kellgren and Lawrence still appears to be a useful OA definition for epidemiological studies focusing on the presence of hip OA.

  12. Rate of deaths due to child abuse and neglect in children 0-3 years of age in Germany.

    PubMed

    Banaschak, Sibylle; Janßen, Katharina; Schulte, Babette; Rothschild, Markus A

    2015-09-01

    In recent years, increasing attention has been paid to the issue of (fatal) child abuse and neglect, largely due to the media attention garnered by some headline-grabbing cases. If media statements are to be believed, such cases may be an increasing phenomenon. With these published accounts in mind, publicly available statistics should be analysed with respect to the question of whether reliable statements can be formulated based on these figures. It is hypothesised that certain data, e.g., the Innocenti report published by UNICEF in 2003, may be based on unreliable data sources. For this reason, the generation of such data, and the reliability of the data itself, should also be discussed. Our focus was on publicly available German mortality and police crime statistics (Polizeiliche Kriminalstatistik). These data were classified with respect to child age, data origin, and cause of death (murder, culpable homicide, etc.). In our opinion, the available data could not be considered in formulating reliable scientific statements about fatal child abuse and neglect, given the lack of detail and the flawed nature of the basic data. Increasing the number of autopsies of children 0-3 years of age should be considered as a means to ensure the capture of valid, practical, and reliable data. This could bring about some enlightenment and assist in the development of preemptive strategies to decrease the incidence of (fatal) child abuse and neglect.

  13. Parameter estimation techniques based on optimizing goodness-of-fit statistics for structural reliability

    NASA Technical Reports Server (NTRS)

    Starlinger, Alois; Duffy, Stephen F.; Palko, Joseph L.

    1993-01-01

    New methods are presented that utilize the optimization of goodness-of-fit statistics in order to estimate Weibull parameters from failure data. It is assumed that the underlying population is characterized by a three-parameter Weibull distribution. Goodness-of-fit tests are based on the empirical distribution function (EDF). The EDF is a step function, calculated using failure data, and represents an approximation of the cumulative distribution function for the underlying population. Statistics (such as the Kolmogorov-Smirnov statistic and the Anderson-Darling statistic) measure the discrepancy between the EDF and the cumulative distribution function (CDF). These statistics are minimized with respect to the three Weibull parameters. Due to nonlinearities encountered in the minimization process, Powell's numerical optimization procedure is applied to obtain the optimum value of the EDF. Numerical examples show the applicability of these new estimation methods. The results are compared to the estimates obtained with Cooper's nonlinear regression algorithm.
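
    A minimal sketch of this estimation scheme under stated assumptions (synthetic three-parameter Weibull data; scipy's Powell implementation standing in for the optimizer used in the paper):

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)
data = np.sort(stats.weibull_min.rvs(c=2.0, loc=5.0, scale=10.0,
                                     size=50, random_state=rng))

def ks_statistic(params, x):
    """Kolmogorov-Smirnov discrepancy between the EDF and a candidate CDF."""
    shape, loc, scale = params
    if shape <= 0 or scale <= 0 or loc >= x.min():
        return 1e6  # penalty: the threshold must lie below the smallest failure
    return stats.kstest(x, stats.weibull_min.cdf, args=(shape, loc, scale)).statistic

# Minimize the goodness-of-fit statistic over the three Weibull parameters.
res = optimize.minimize(ks_statistic, x0=[1.5, 0.0, float(data.mean())],
                        args=(data,), method="Powell")
print("estimated (shape, location, scale):", res.x)
```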

  14. Reviewing Reliability and Validity of Information for University Educational Evaluation

    NASA Astrophysics Data System (ADS)

    Otsuka, Yusaku

    To better utilize evaluations in higher education, it is necessary to share the methods of reviewing reliability and validity of examination scores and grades, and to accumulate and share data for confirming results. Before the GPA system is first introduced into a university or college, the reliability of examination scores and grades, especially for essay examinations, must be assured. Validity is a complicated concept, so should be assured in various ways, including using professional audits, theoretical models, and statistical data analysis. Because individual students and teachers are continually improving, using evaluations to appraise their progress is not always compatible with using evaluations in appraising the implementation of accountability in various departments or the university overall. To better utilize evaluations and improve higher education, evaluations should be integrated into the current system by sharing the vision of an academic learning community and promoting interaction between students and teachers based on sufficiently reliable and validated evaluation tools.

  15. Study samples are too small to produce sufficiently precise reliability coefficients.

    PubMed

    Charter, Richard A

    2003-04-01

    In a survey of journal articles, test manuals, and test critique books, the author found that a mean sample size (N) of 260 participants had been used for reliability studies on 742 tests. The distribution was skewed because the median sample size for the total sample was only 90. The median sample sizes for the internal consistency, retest, and interjudge reliabilities were 182, 64, and 36, respectively. The author presented sample size statistics for the various internal consistency methods and types of tests. In general, the author found that the sample sizes that were used in the internal consistency studies were too small to produce sufficiently precise reliability coefficients, which in turn could cause imprecise estimates of examinee true-score confidence intervals. The results also suggest that larger sample sizes have been used in the last decade compared with those that were used in earlier decades.

  16. Fatigue reliability of deck structures subjected to correlated crack growth

    NASA Astrophysics Data System (ADS)

    Feng, G. Q.; Garbatov, Y.; Guedes Soares, C.

    2013-12-01

    The objective of this work is to analyse the fatigue reliability of deck structures subjected to correlated crack growth. The stress intensity factors of the correlated cracks are obtained by finite element analysis, based on which the geometry correction functions are derived. Monte Carlo simulations are applied to predict the statistical descriptors of correlated cracks based on the Paris-Erdogan equation. A probabilistic model of crack growth as a function of time is used to analyse the fatigue reliability of deck structures accounting for the crack propagation correlation. A deck structure is modelled as a series system of stiffened panels, where a stiffened panel is regarded as a parallel system composed of plates and longitudinal stiffeners. It has been proven that the method developed here can be conveniently applied to perform the fatigue reliability assessment of structures subjected to correlated crack growth.
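
    A simplified Monte Carlo sketch of the Paris-Erdogan machinery referred to above, reduced to a single uncorrelated crack with assumed inputs (not the paper's correlated model):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Assumed inputs: crack depth a in metres, delta-K in MPa*sqrt(m),
# C in (m/cycle)/(MPa*sqrt(m))**m; scatter in C models material variability.
C = rng.lognormal(np.log(5e-12), 0.3, n)
m = 3.0
a0, a_crit = 0.5e-3, 25e-3        # initial and critical crack depths (m)
dS, Y = 80.0, 1.0                 # stress range (MPa) and geometry factor

# Closed-form cycle count from a0 to a_crit for constant Y and m != 2:
# N = (a_crit**(1-m/2) - a0**(1-m/2)) / ((1-m/2) * C * (Y*dS*sqrt(pi))**m)
k = C * (Y * dS * np.sqrt(np.pi)) ** m
N_fail = (a_crit ** (1 - m / 2) - a0 ** (1 - m / 2)) / ((1 - m / 2) * k)

N_design = 5e6                    # assumed design life in cycles
print(f"P(fatigue failure within design life) ~ {np.mean(N_fail < N_design):.3f}")
```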

  17. Visual assessment of hemiplegic gait following stroke: pilot study.

    PubMed

    Hughes, K A; Bell, F

    1994-10-01

    A form that will guide clinicians through a reliable and valid visual assessment of hemiplegic gait was designed. Six hemiplegic patients were filmed walking along an instrumented walkway. These films were shown to three physiotherapists who used the form to rate the patients' gait. Each physiotherapist rated the six patients at both stages of recovery, repeating this a further two times. This resulted in 108 completed forms. Within-rater reliability is statistically significant for some raters and some individual form sections. Between-rater reliability is significant for some sections. Detailed analysis has shown that parts of the form have caused reduced reliability. These are mainly sections that ask for severity judgments or are duplicated. Some indication of normal gait should be included on the form. To test validity fully the form should be tested on a group of patients who all have significant changes in each objective gait measurement.

  18. Monitoring larval populations of the Douglas-fir tussock moth and the western spruce budworm on permanent plots: sampling methods and statistical properties of data

    Treesearch

    A.R. Mason; H.G. Paul

    1994-01-01

    Procedures for monitoring larval populations of the Douglas-fir tussock moth and the western spruce budworm are recommended based on many years experience in sampling these species in eastern Oregon and Washington. It is shown that statistically reliable estimates of larval density can be made for a population by sampling host trees in a series of permanent plots in a...

  19. Nonstationary envelope process and first excursion probability

    NASA Technical Reports Server (NTRS)

    Yang, J.

    1972-01-01

    A definition of the envelope of nonstationary random processes is proposed. The establishment of the envelope definition makes it possible to simulate the nonstationary random envelope directly. Envelope statistics, such as the density function, joint density function, moment function, and level crossing rate, which are relevant to analyses of catastrophic failure, fatigue, and crack propagation in structures, are derived. Applications of the envelope statistics to the prediction of structural reliability under random loadings are discussed in detail.
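
    For context, the stationary special case that such a definition generalizes is standard (these are textbook results, not the paper's derivation): for a zero-mean stationary Gaussian process $x(t)$ with variance $\sigma_x^2$ and velocity variance $\sigma_{\dot x}^2$, the envelope has a Rayleigh density and the mean rate of upcrossings of a level $a$ by the process follows Rice's formula:

```latex
p_R(r) = \frac{r}{\sigma_x^2}\exp\!\left(-\frac{r^2}{2\sigma_x^2}\right), \quad r \ge 0,
\qquad
\nu_a^+ = \frac{1}{2\pi}\,\frac{\sigma_{\dot x}}{\sigma_x}\exp\!\left(-\frac{a^2}{2\sigma_x^2}\right).
```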

  20. PV System Component Fault and Failure Compilation and Analysis.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klise, Geoffrey Taylor; Lavrova, Olga; Gooding, Renee Lynne

    This report describes data collection and analysis of solar photovoltaic (PV) equipment events, which consist of faults and failures that occur during the normal operation of a distributed PV system or PV power plant. We present summary statistics from locations where maintenance data is being collected at various intervals, as well as reliability statistics gathered from that data, consisting of fault/failure distributions and repair distributions for a wide range of PV equipment types.

  1. The statistical properties of vortex flows in the solar atmosphere

    NASA Astrophysics Data System (ADS)

    Wedemeyer, Sven; Kato, Yoshiaki; Steiner, Oskar

    2015-08-01

    Rotating magnetic field structures associated with vortex flows on the Sun, also known as “magnetic tornadoes”, may serve as waveguides for MHD waves and transport mass and energy upwards through the atmosphere. Magnetic tornadoes may therefore potentially contribute to the heating of the upper atmospheric layers in quiet Sun regions. Magnetic tornadoes are observed over a large range of spatial and temporal scales in different layers in quiet Sun regions. However, their statistical properties, such as size, lifetime, and rotation speed, are not yet well understood because observations of these small-scale events are technically challenging and limited by the spatial and temporal resolution of current instruments. Better statistics, based on a combination of high-resolution observations and state-of-the-art numerical simulations, is the key to a reliable estimate of the energy input in the lower layers and of the energy deposition in the upper layers. For this purpose, we have developed a fast and reliable tool for the determination and visualization of the flow field in (observed) image sequences. This technique, which combines local correlation tracking (LCT) and line integral convolution (LIC), facilitates the detection and study of dynamic events on small scales, such as propagating waves. Here, we present statistical properties of vortex flows in different layers of the solar atmosphere and try to give realistic estimates of the energy flux which is potentially available for heating of the upper solar atmosphere.

  2. The Use of Cronbach's Alpha When Developing and Reporting Research Instruments in Science Education

    NASA Astrophysics Data System (ADS)

    Taber, Keith S.

    2017-06-01

    Cronbach's alpha is a statistic commonly quoted by authors to demonstrate that tests and scales that have been constructed or adopted for research projects are fit for purpose. Cronbach's alpha is regularly adopted in studies in science education: it was referred to in 69 different papers published in 4 leading science education journals in a single year (2015)—usually as a measure of reliability. This article explores how this statistic is used in reporting science education research and what it represents. Authors often cite alpha values with little commentary to explain why they feel this statistic is relevant and seldom interpret the result for readers beyond citing an arbitrary threshold for an acceptable value. Those authors who do offer readers qualitative descriptors interpreting alpha values adopt a diverse and seemingly arbitrary terminology. More seriously, illustrative examples from the science education literature demonstrate that alpha may be acceptable even when there are recognised problems with the scales concerned. Alpha is also sometimes inappropriately used to claim an instrument is unidimensional. It is argued that a high value of alpha offers limited evidence of the reliability of a research instrument, and that indeed a very high value may actually be undesirable when developing a test of scientific knowledge or understanding. Guidance is offered to authors reporting, and readers evaluating, studies that present Cronbach's alpha statistic as evidence of instrument quality.
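
    For reference, the statistic itself is simple to compute: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A minimal sketch with simulated item scores (not data from any study discussed above):

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores: respondents-by-items matrix of item scores."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(3)
latent = rng.normal(size=(200, 1))                      # shared trait
items = latent + rng.normal(scale=0.8, size=(200, 8))   # 8 noisy items
print(f"alpha = {cronbach_alpha(items):.2f}")
```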

  3. Automated Quantification of the Landing Error Scoring System With a Markerless Motion-Capture System.

    PubMed

    Mauntel, Timothy C; Padua, Darin A; Stanley, Laura E; Frank, Barnett S; DiStefano, Lindsay J; Peck, Karen Y; Cameron, Kenneth L; Marshall, Stephen W

    2017-11-01

    Context: The Landing Error Scoring System (LESS) can be used to identify individuals with an elevated risk of lower extremity injury. The limitation of the LESS is that raters identify movement errors from video replay, which is time-consuming and, therefore, may limit its use by clinicians. A markerless motion-capture system may be capable of automating LESS scoring, thereby removing this obstacle. Objective: To determine the reliability of an automated markerless motion-capture system for scoring the LESS. Design: Cross-sectional study. Setting: United States Military Academy. Participants: A total of 57 healthy, physically active individuals (47 men, 10 women; age = 18.6 ± 0.6 years, height = 174.5 ± 6.7 cm, mass = 75.9 ± 9.2 kg). Procedures: Participants completed 3 jump-landing trials that were recorded by standard video cameras and a depth camera. Their movement quality was evaluated by expert LESS raters (standard video recording) using the LESS rubric and by software that automates LESS scoring (depth-camera data). We recorded an error for a LESS item if it was present on at least 2 of 3 jump-landing trials. We calculated κ statistics, prevalence- and bias-adjusted κ (PABAK) statistics, and percentage agreement for each LESS item. Interrater reliability was evaluated between the 2 expert rater scores and between a consensus expert score and the markerless motion-capture system score. Results: We observed reliability between the 2 expert LESS raters (average κ = 0.45 ± 0.35, average PABAK = 0.67 ± 0.34; percentage agreement = 0.83 ± 0.17). The markerless motion-capture system had similar reliability with consensus expert scores (average κ = 0.48 ± 0.40, average PABAK = 0.71 ± 0.27; percentage agreement = 0.85 ± 0.14). However, reliability was poor for 5 LESS items in both LESS score comparisons. Conclusions: A markerless motion-capture system had the same level of reliability as expert LESS raters, suggesting that an automated system can accurately assess movement. Therefore, clinicians can use the markerless motion-capture system to reliably score the LESS without being limited by the time requirements of manual LESS scoring.
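
    A small sketch of the agreement statistics named above (hypothetical error present/absent codings for a single LESS item; for a binary item, PABAK is simply 2*Po - 1, where Po is observed agreement):

```python
from sklearn.metrics import cohen_kappa_score

rater = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]    # hypothetical expert codings
system = [1, 1, 0, 1, 0, 1, 0, 1, 1, 1]   # hypothetical automated codings

po = sum(a == b for a, b in zip(rater, system)) / len(rater)
kappa = cohen_kappa_score(rater, system)
pabak = 2 * po - 1   # removes the prevalence and bias effects on kappa
print(f"Po = {po:.2f}, kappa = {kappa:.2f}, PABAK = {pabak:.2f}")
```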

  4. Cycle life test. [of secondary spacecraft cells

    NASA Technical Reports Server (NTRS)

    Harkness, J. D.

    1977-01-01

    Statistical information concerning cell performance characteristics and limitations of secondary spacecraft cells is presented. Weaknesses in cell design as well as battery weaknesses encountered in various satellite programs are reported. Emphasis is placed on improving the reliability of space batteries.

  5. Accommodation in Untextured Stimulus Fields.

    DTIC Science & Technology

    1979-05-01

    that accommodation is notably inaccurate with reduced illumination, textural cue removal, or small aperture viewing. These situational ametropias are...dark focus. Although, for any individual, large correlations exist among these ametropias, statistically reliable differences occur among them as well

  6. Failure rates for accelerated acceptance testing of silicon transistors

    NASA Technical Reports Server (NTRS)

    Toye, C. R.

    1968-01-01

    Extrapolation tables for the control of silicon transistor product reliability have been compiled. The tables are based on a version of the Arrhenius statistical relation and are intended to be used for low- and medium-power silicon transistors.

  7. Standards of Research.

    ERIC Educational Resources Information Center

    Crain, Robert L.; Hawley, Willis D.

    1982-01-01

    Criticizes James Coleman's study, "Public and Private Schools," and points out methodological weaknesses in sampling, testing, data reliability, and statistical methods. Questions assumptions which have led to conclusions justifying federal support, especially tuition tax credits, to private schools. Raises the issue of ethical standards…

  8. Counting the new mobile workforce

    DOT National Transportation Integrated Search

    1997-04-01

    This publication has been released by the Bureau of Transportation Statistics (BTS). The report examines existing federal surveys in order to identify those surveys to which work-at-home questions have been or can be added to generate more reliable i...

  9. Johnson Space Center's Risk and Reliability Analysis Group 2008 Annual Report

    NASA Technical Reports Server (NTRS)

    Valentine, Mark; Boyer, Roger; Cross, Bob; Hamlin, Teri; Roelant, Henk; Stewart, Mike; Bigler, Mark; Winter, Scott; Reistle, Bruce; Heydorn, Dick

    2009-01-01

    The Johnson Space Center (JSC) Safety & Mission Assurance (S&MA) Directorate's Risk and Reliability Analysis Group provides both mathematical and engineering analysis expertise in the areas of Probabilistic Risk Assessment (PRA), Reliability and Maintainability (R&M) analysis, and data collection and analysis. The fundamental goal of this group is to provide National Aeronautics and Space Administration (NASA) decisionmakers with the necessary information to make informed decisions when evaluating personnel, flight hardware, and public safety concerns associated with current operating systems as well as with any future systems. The Analysis Group includes a staff of statistical and reliability experts with valuable backgrounds in the statistical, reliability, and engineering fields. This group includes JSC S&MA Analysis Branch personnel as well as S&MA support services contractors, such as Science Applications International Corporation (SAIC) and SoHaR. The Analysis Group's experience base includes nuclear power (both commercial and navy), manufacturing, Department of Defense, chemical, and shipping industries, as well as significant aerospace experience specifically in the Shuttle, International Space Station (ISS), and Constellation Programs. The Analysis Group partners with project and program offices, other NASA centers, NASA contractors, and universities to provide additional resources or information to the group when performing various analysis tasks. The JSC S&MA Analysis Group is recognized as a leader in risk and reliability analysis within the NASA community. Therefore, the Analysis Group is in high demand to help the Space Shuttle Program (SSP) continue to fly safely, assist in designing the next generation spacecraft for the Constellation Program (CxP), and promote advanced analytical techniques. The Analysis Section's tasks include teaching classes and instituting personnel qualification processes to enhance the professional abilities of our analysts as well as performing major probabilistic assessments used to support flight rationale and help establish program requirements. During 2008, the Analysis Group performed more than 70 assessments. Although all these assessments were important, some were instrumental in the decisionmaking processes for the Shuttle and Constellation Programs. Two of the more significant tasks were the Space Transportation System (STS)-122 Low Level Cutoff PRA for the SSP and the Orion Pad Abort One (PA-1) PRA for the CxP. These two activities, along with the numerous other tasks the Analysis Group performed in 2008, are summarized in this report. This report also highlights several ongoing and upcoming efforts to provide crucial statistical and probabilistic assessments, such as the Extravehicular Activity (EVA) PRA for the Hubble Space Telescope service mission and the first fully integrated PRAs for the CxP's Lunar Sortie and ISS missions.

  10. Statistical methods for quantitative mass spectrometry proteomic experiments with labeling.

    PubMed

    Oberg, Ann L; Mahoney, Douglas W

    2012-01-01

    Mass spectrometry utilizing labeling allows multiple specimens to be analyzed simultaneously and, as a result, reduces between-experiment variability. Here we describe the use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases. We demonstrate how to export data in the format that is most efficient for statistical analysis. We demonstrate how to assess the need for normalization, perform normalization, and check whether it worked. We describe how to build a model explaining the observed values and test for differential protein abundance along with descriptive statistics and measures of reliability of the findings. Concepts are illustrated through the use of three case studies utilizing the iTRAQ 4-plex labeling protocol.

  11. Neural network approaches versus statistical methods in classification of multisource remote sensing data

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.

    1990-01-01

    Neural network learning procedures and statistical classification methods are applied and compared empirically in the classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two approaches have distinct advantages and disadvantages in this classification application.

  12. Tooth-size discrepancy: A comparison between manual and digital methods

    PubMed Central

    Correia, Gabriele Dória Cabral; Habib, Fernando Antonio Lima; Vogel, Carlos Jorge

    2014-01-01

    Introduction Technological advances in Dentistry have emerged primarily in the area of diagnostic tools. One example is the 3D scanner, which can transform plaster models into three-dimensional digital models. Objective This study aimed to assess the reliability of tooth size-arch length discrepancy analysis measurements performed on three-dimensional digital models, and compare these measurements with those obtained from plaster models. Material and Methods To this end, plaster models of lower dental arches and their corresponding three-dimensional digital models acquired with a 3Shape R700T scanner were used. All of them had lower permanent dentition. Four different tooth size-arch length discrepancy calculations were performed on each model, two of which by manual methods using calipers and brass wire, and two by digital methods using linear measurements and parabolas. Results Data were statistically assessed using the Friedman test, and no statistically significant differences were found between the methods (P > 0.05); only the linear digital method showed a slight difference, which did not reach statistical significance. Conclusions Based on the results, it is reasonable to assert that any of these resources used by orthodontists to clinically assess tooth size-arch length discrepancy can be considered reliable. PMID:25279529

  13. Reproducibility of ZrO2-based freeze casting for biomaterials.

    PubMed

    Naleway, Steven E; Fickas, Kate C; Maker, Yajur N; Meyers, Marc A; McKittrick, Joanna

    2016-04-01

    The processing technique of freeze casting has been intensely researched for its potential to create porous scaffold and infiltrated composite materials for biomedical implants and structural materials. However, in order for this technique to be employed medically or commercially, it must be able to reliably produce materials in great quantities with similar microstructures and properties. Here we investigate the reproducibility of the freeze casting process by independently fabricating three sets of eight ZrO2-epoxy composite scaffolds with the same processing conditions but varying solid loading (10, 15 and 20 vol.%). Statistical analyses (one-way ANOVA and Tukey's HSD tests) run upon measurements of the microstructural dimensions of these composite scaffold sets show that, while the majority of microstructures are similar, in all cases the composite scaffolds display statistically significant variability. In addition, the composite scaffolds were mechanically compressed and statistically analyzed. As with the microstructures, almost all of the resultant properties displayed significant variability, though most composite scaffolds were similar. These results suggest that additional research to improve control of the freeze casting technique is required before scaffolds and composite scaffolds can reliably be reproduced for commercial or medical applications. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. PCA as a practical indicator of OPLS-DA model reliability.

    PubMed

    Worley, Bradley; Powers, Robert

    Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OPLS, which aggressively force separations between experimental groups. As a result, OPLS-DA is often used as an alternative method when PCA fails to expose group separation, but this practice is highly dangerous. Without rigorous validation, OPLS-DA can easily yield statistically unreliable group separation. A Monte Carlo analysis of PCA group separations and OPLS-DA cross-validation metrics was performed on NMR datasets with statistically significant separations in scores-space. A linearly increasing amount of Gaussian noise was added to each data matrix followed by the construction and validation of PCA and OPLS-DA models. With increasing added noise, the PCA scores-space distance between groups rapidly decreased and the OPLS-DA cross-validation statistics simultaneously deteriorated. A decrease in correlation between the estimated loadings (added noise) and the true (original) loadings was also observed. While the validity of the OPLS-DA model diminished with increasing added noise, the group separation in scores-space remained basically unaffected. Supported by the results of Monte Carlo analyses of PCA group separations and OPLS-DA cross-validation metrics, we provide practical guidelines and cross-validatory recommendations for reliable inference from PCA and OPLS-DA models.
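
    The noise-addition experiment described above is easy to approximate. The Python sketch below is a simplified illustration with synthetic two-group data, not the authors' procedure: it adds increasing Gaussian noise and tracks the PCA scores-space separation between group centroids.

      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)

      # Hypothetical spectra: two groups of 20 samples x 50 variables whose
      # means differ, so PCA scores initially separate the groups.
      group_a = rng.normal(0.0, 1.0, (20, 50))
      group_b = rng.normal(1.5, 1.0, (20, 50))
      X = np.vstack([group_a, group_b])

      for noise_sd in (0.0, 1.0, 2.0, 4.0):
          Xn = X + rng.normal(0.0, noise_sd, X.shape)   # add Gaussian noise
          scores = PCA(n_components=2).fit_transform(Xn)
          # Scores-space distance between group centroids as a separation measure.
          d = np.linalg.norm(scores[:20].mean(axis=0) - scores[20:].mean(axis=0))
          print(f"noise sd {noise_sd:.1f}: centroid distance {d:.2f}")

    Consistent with the abstract, the unsupervised PCA separation shrinks as noise grows, whereas a supervised method such as OPLS-DA can continue to display apparent separation and must therefore be cross-validated.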

  15. The applications of statistical quantification techniques in nanomechanics and nanoelectronics.

    PubMed

    Mai, Wenjie; Deng, Xinwei

    2010-10-08

    Although nanoscience and nanotechnology have been developing for approximately two decades and have achieved numerous breakthroughs, experimental results from nanomaterials still show higher noise levels and poorer repeatability than those from bulk materials, which remains a practical issue and challenges many techniques for the quantification of nanomaterials. This work proposes a physical-statistical modeling approach and a global-fitting statistical method that use all the available discrete data or quasi-continuous curves to quantify a few targeted physical parameters, which can provide more accurate, efficient and reliable parameter estimates and give reasonable physical explanations. In the resonance method for measuring the elastic modulus of ZnO nanowires (Zhou et al 2006 Solid State Commun. 139 222-6), our statistical technique gives E = 128.33 GPa instead of the original E = 108 GPa, and unveils a negative bias adjustment f(0), attributed to systematic bias in measuring the length of the nanowires. In the electronic measurement of the resistivity of a Mo nanowire (Zach et al 2000 Science 290 2120-3), the proposed new method automatically identified the importance of accounting for the Ohmic contact resistance in the model of the Ohmic behavior in nanoelectronics experiments. The 95% confidence interval of the resistivity in the proposed one-step procedure is determined to be 3.57 ± 0.0274 × 10⁻⁵ Ω·cm, which should be a more reliable and precise estimate. The statistical quantification technique should find wide application in obtaining better estimates in the presence of the systematic errors and bias effects that become more significant at the nanoscale.

  16. The role of test-retest reliability in measuring individual and group differences in executive functioning.

    PubMed

    Paap, Kenneth R; Sawi, Oliver

    2016-12-01

    Studies testing for individual or group differences in executive functioning can be compromised by unknown test-retest reliability. Test-retest reliabilities across an interval of about one week were obtained from performance in the antisaccade, flanker, Simon, and color-shape switching tasks. There is a general trade-off between the greater reliability of single mean RT measures and the greater process purity of measures based on contrasts between mean RTs in two conditions. The individual-differences-in-RT model recently developed by Miller and Ulrich was used to evaluate the trade-off. Test-retest reliability was statistically significant for 11 of the 12 measures, but was of moderate size, at best, for the difference scores. The test-retest reliabilities for the Simon and flanker interference scores were lower than those for switching costs. Standard practice evaluates the reliability of executive-functioning measures using split-half methods based on data obtained in a single day. Our test-retest measures of reliability are lower, especially for difference scores. These reliability measures must also take into account possible day effects that classical test theory assumes do not occur. Measures based on single mean RTs tend to have acceptable levels of reliability and convergent validity, but are "impure" measures of specific executive functions. The individual-differences-in-RT model shows that the impurity problem is worse than typically assumed. However, the "purer" measures based on difference scores have low convergent validity that is partly caused by deficiencies in test-retest reliability. Copyright © 2016 Elsevier B.V. All rights reserved.
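
    The weak reliability of difference scores has a classical-test-theory explanation. The sketch below evaluates the standard formula for the reliability of a difference score; the numbers are illustrative only, not taken from the study:

      from math import sqrt

      def difference_score_reliability(r1, r2, r12, sd1, sd2):
          """Classical-test-theory reliability of a difference score D = X1 - X2."""
          cov_term = 2 * r12 * sd1 * sd2
          true_var = r1 * sd1**2 + r2 * sd2**2 - cov_term
          total_var = sd1**2 + sd2**2 - cov_term
          return true_var / total_var

      # Two conditions that are individually reliable (0.85) but highly
      # correlated (0.80), as is typical for congruent/incongruent RTs.
      print(f"r_D = {difference_score_reliability(0.85, 0.85, 0.80, 50, 50):.2f}")

    With these inputs the difference score has a reliability of only 0.25 even though each component measure has a reliability of 0.85, which is the core of the impurity-versus-reliability trade-off discussed above.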

  17. Statistical analysis of global horizontal solar irradiation GHI in Fez city, Morocco

    NASA Astrophysics Data System (ADS)

    Bounoua, Z.; Mechaqrane, A.

    2018-05-01

    An accurate knowledge of the solar energy reaching the ground is necessary for sizing and optimizing the performance of solar installations. This paper describes a statistical analysis of the global horizontal solar irradiation (GHI) at Fez city, Morocco. For better reliability, we first applied a set of check procedures to test the quality of the hourly GHI measurements, eliminating erroneous values, which are generally due to measurement errors or the cosine effect. The statistical analysis shows that the annual mean daily GHI value is approximately 5 kWh/m²/day. Monthly mean daily values and other parameters are also calculated.

  18. Statistical Capability Study of a Helical Grinding Machine Producing Screw Rotors

    NASA Astrophysics Data System (ADS)

    Holmes, C. S.; Headley, M.; Hart, P. W.

    2017-08-01

    Screw compressors depend for their efficiency and reliability on the accuracy of the rotors, and therefore on the machinery used in their production. The machinery has evolved over more than half a century in response to customer demands for production accuracy, efficiency, and flexibility, and is now at a high level on all three criteria. Production equipment and processes must be capable of maintaining accuracy over a production run, and this must be assessed statistically under strictly controlled conditions. This paper gives numerical data from such a study of an innovative machine tool and shows that it is possible to meet the demanding statistical capability requirements.

  19. Statistical analysis of water-quality data containing multiple detection limits: S-language software for regression on order statistics

    USGS Publications Warehouse

    Lee, L.; Helsel, D.

    2005-01-01

    Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.
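
    As a rough illustration of the ROS idea (the published library handles multiple detection limits and more careful plotting positions; the Python sketch below assumes a single detection limit and hypothetical concentrations):

      import numpy as np
      from scipy import stats

      def simple_ros(detected, n_censored, dl):
          """Regression on order statistics with one detection limit.

          Detected values are regressed (in log space) on normal quantiles of
          their plotting positions; censored values are imputed from the fit.
          """
          detected = np.sort(np.asarray(detected, dtype=float))
          assert detected.min() >= dl        # all detected values exceed the limit
          n = detected.size + n_censored
          pp = np.arange(1, n + 1) / (n + 1.0)   # Weibull plotting positions
          z = stats.norm.ppf(pp)                 # normal quantiles
          # Censored observations occupy the lowest ranks (single limit).
          z_cens, z_det = z[:n_censored], z[n_censored:]
          slope, intercept, *_ = stats.linregress(z_det, np.log(detected))
          imputed = np.exp(intercept + slope * z_cens)   # modeled "less-thans"
          full = np.concatenate([imputed, detected])
          return full.mean(), full.std(ddof=1)

      # Hypothetical metal concentrations (ug/L): 4 values censored at DL = 1.0.
      mean, sd = simple_ros([1.2, 1.8, 2.5, 3.9, 6.0, 9.5], n_censored=4, dl=1.0)
      print(f"ROS mean = {mean:.2f}, sd = {sd:.2f}")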

  20. Assessment of change in dynamic psychotherapy.

    PubMed

    Høglend, P; Bøgwald, K P; Amlo, S; Heyerdahl, O; Sørbye, O; Marble, A; Sjaastad, M C; Bentsen, H

    2000-01-01

    Five scales have been developed to assess changes that are consistent with the therapeutic rationales and procedures of dynamic psychotherapy. Seven raters evaluated 50 patients before and 36 patients again after brief dynamic psychotherapy. A factor analysis indicated that the scales represent a dimension that is discriminable from general symptoms. A summary measure, Dynamic Capacity, was rated with acceptable reliability by a single rater. However, average scores of three raters were needed for good reliability of change ratings. The scales seem to be sufficiently fine-grained to capture statistically and clinically significant changes during brief dynamic psychotherapy.

  1. PIV measurements in near-wake turbulent regions

    NASA Astrophysics Data System (ADS)

    Chen, Wei-Cheng; Chang, Keh-Chin

    2018-05-01

    Particle image velocimetry (PIV) is a non-intrusive optical diagnostic in which measurements are made of seeding (foreign) particles rather than of the fluid itself. Reliable PIV measurement of turbulence, however, requires a sufficient number of seeding particles in each interrogation window of the image. A gray-level criterion is developed in this work to check the attainment of a statistically stationary state of the turbulent flow properties. It is suggested that a gray level of no less than 0.66 be used as the threshold for reliable PIV measurements in the present near-wake turbulent regions.

  2. Lift truck safety review

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cadwallader, L.C.

    1997-03-01

    This report presents safety information about powered industrial trucks. The basic lift truck, the counterbalanced sit-down rider truck, is the primary focus of the report. Lift truck engineering is briefly described, then a hazard analysis is performed on the lift truck. Case histories and accident statistics are also given. Rules and regulations about lift trucks, such as the US Occupational Safety and Health Administration laws and the Underwriters Laboratories standards, are discussed. Safety issues with lift trucks are reviewed, and lift truck safety and reliability are discussed. Some quantitative reliability values are given.

  3. A new multidimensional measure of African adolescents' perceptions of teachers' behaviors.

    PubMed

    Mboya, M M

    1994-04-01

    The Perceived Teacher Behavior Inventory was designed to measure three dimensions of students' perceptions of the behaviors of their teachers. This research was conducted to assess the statistical validity and reliability of the instrument administered to 770 students attending two coeducational high schools in Cape Town, South Africa. Factor analysis clearly identified three subscales indicating that the instrument distinguished the students' perceptions of their teachers' behaviors in three areas. Estimates of internal consistency of the subscales were assessed using the squared multiple correlation as the index of reliability.

  4. 25 CFR 39.114 - What characteristics may qualify a student as gifted and talented for purposes of supplemental...

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... intellectual ability. (b) Creativity/Divergent Thinking means scoring in the top 5 percent of performance on a statistically valid and reliable measurement tool of creativity/divergent thinking. (c) Academic Aptitude...

  5. 25 CFR 39.114 - What characteristics may qualify a student as gifted and talented for purposes of supplemental...

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... intellectual ability. (b) Creativity/Divergent Thinking means scoring in the top 5 percent of performance on a statistically valid and reliable measurement tool of creativity/divergent thinking. (c) Academic Aptitude...

  6. 25 CFR 39.114 - What characteristics may qualify a student as gifted and talented for purposes of supplemental...

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... intellectual ability. (b) Creativity/Divergent Thinking means scoring in the top 5 percent of performance on a statistically valid and reliable measurement tool of creativity/divergent thinking. (c) Academic Aptitude...

  7. 25 CFR 39.114 - What characteristics may qualify a student as gifted and talented for purposes of supplemental...

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... intellectual ability. (b) Creativity/Divergent Thinking means scoring in the top 5 percent of performance on a statistically valid and reliable measurement tool of creativity/divergent thinking. (c) Academic Aptitude...

  8. Methodology Series Module 9: Designing Questionnaires and Clinical Record Forms - Part II.

    PubMed

    Setia, Maninder Singh

    2017-01-01

    This article is a continuation of the previous module on designing questionnaires and clinical record form in which we have discussed some basic points about designing the questionnaire and clinical record forms. In this section, we will discuss the reliability and validity of questionnaires. The different types of validity are face validity, content validity, criterion validity, and construct validity. The different types of reliability are test-retest reliability, inter-rater reliability, and intra-rater reliability. Some of these parameters are assessed by subject area experts. However, statistical tests should be used for evaluation of other parameters. Once the questionnaire has been designed, the researcher should pilot test the questionnaire. The items in the questionnaire should be changed based on the feedback from the pilot study participants and the researcher's experience. After the basic structure of the questionnaire has been finalized, the researcher should assess the validity and reliability of the questionnaire or the scale. If an existing standard questionnaire is translated in the local language, the researcher should assess the reliability and validity of the translated questionnaire, and these values should be presented in the manuscript. The decision to use a self- or interviewer-administered, paper- or computer-based questionnaire depends on the nature of the questions, literacy levels of the target population, and resources.

  9. Methodology Series Module 9: Designing Questionnaires and Clinical Record Forms – Part II

    PubMed Central

    Setia, Maninder Singh

    2017-01-01

    This article is a continuation of the previous module on designing questionnaires and clinical record form in which we have discussed some basic points about designing the questionnaire and clinical record forms. In this section, we will discuss the reliability and validity of questionnaires. The different types of validity are face validity, content validity, criterion validity, and construct validity. The different types of reliability are test-retest reliability, inter-rater reliability, and intra-rater reliability. Some of these parameters are assessed by subject area experts. However, statistical tests should be used for evaluation of other parameters. Once the questionnaire has been designed, the researcher should pilot test the questionnaire. The items in the questionnaire should be changed based on the feedback from the pilot study participants and the researcher's experience. After the basic structure of the questionnaire has been finalized, the researcher should assess the validity and reliability of the questionnaire or the scale. If an existing standard questionnaire is translated in the local language, the researcher should assess the reliability and validity of the translated questionnaire, and these values should be presented in the manuscript. The decision to use a self- or interviewer-administered, paper- or computer-based questionnaire depends on the nature of the questions, literacy levels of the target population, and resources. PMID:28584367

  10. Reliability of a computer and Internet survey (Computer User Profile) used by adults with and without traumatic brain injury (TBI).

    PubMed

    Kilov, Andrea M; Togher, Leanne; Power, Emma

    2015-01-01

    To determine the test-retest reliability of the 'Computer User Profile' (CUP) in people with and without TBI. The CUP was administered on two occasions to people with and without TBI. The CUP investigated the nature and frequency of participants' computer and Internet use. Intra-class correlation coefficients and kappa coefficients were calculated to measure the reliability of individual CUP items. Descriptive statistics were used to summarize the content of responses. Sixteen adults with TBI and 40 adults without TBI were included in the study. All participants were reliable in reporting demographic information, frequency of social communication and leisure activities and computer/Internet habits and usage. Adults with TBI were reliable in 77% of their responses to survey items. Adults without TBI were reliable in 88% of their responses to survey items. The CUP was practical and valuable in capturing information about social, leisure, communication and computer/Internet habits of people with and without TBI. Adults without TBI scored more items with satisfactory reliability overall in their surveys. Future studies may include larger samples and could also include an exploration of how people with/without TBI use other digital communication technologies. This may provide further information on determining technology readiness for people with TBI in therapy programmes.

  11. Estimates Of The Orbiter RSI Thermal Protection System Thermal Reliability

    NASA Technical Reports Server (NTRS)

    Kolodziej, P.; Rasky, D. J.

    2002-01-01

    In support of the Space Shuttle Orbiter post-flight inspection, structure temperatures are recorded at selected positions on the windward, leeward, starboard and port surfaces. Statistical analysis of this flight data and a non-dimensional load interference (NDLI) method are used to estimate the thermal reliability at positions where reusable surface insulation (RSI) is installed. In this analysis, structure temperatures that exceed the design limit define the critical failure mode. At thirty-three positions the RSI thermal reliability is greater than 0.999999 for the missions studied. This is not the overall system-level reliability of the thermal protection system installed on an Orbiter. The results from two Orbiters, OV-102 and OV-105, are in good agreement. The original RSI designs on the OV-102 Orbital Maneuvering System pods, which had low reliability, were significantly improved on OV-105. The NDLI method was also used to estimate thermal reliability from an assessment of TPS uncertainties that was completed shortly before the first Orbiter flight. Results from the flight data analysis and the pre-flight assessment agree at several positions near each other. The NDLI method is also effective for optimizing RSI designs to provide uniform thermal reliability on the acreage surface of reusable launch vehicles.

  12. Reliability and validity of current physical examination techniques of the foot and ankle.

    PubMed

    Wrobel, James S; Armstrong, David G

    2008-01-01

    This literature review was undertaken to evaluate the reliability and validity of the orthopedic, neurologic, and vascular examination of the foot and ankle. We searched PubMed-the US National Library of Medicine's database of biomedical citations-and abstracts for relevant publications from 1966 to 2006. We also searched the bibliographies of the retrieved articles. We identified 35 articles to review. For discussion purposes, we used reliability interpretation guidelines proposed by others. For the kappa statistic that calculates reliability for dichotomous (eg, yes or no) measures, reliability was defined as moderate (0.4-0.6), substantial (0.6-0.8), and outstanding (> 0.8). For the intraclass correlation coefficient that calculates reliability for continuous (eg, degrees of motion) measures, reliability was defined as good (> 0.75), moderate (0.5-0.75), and poor (< 0.5). Intraclass correlations, based on the various examinations performed, varied widely. The range was from 0.08 to 0.98, depending on the examination performed. Concurrent and predictive validity ranged from poor to good. Although hundreds of articles exist describing various methods of lower-extremity assessment, few rigorously assess the measurement properties. This information can be used both by the discerning clinician in the art of clinical examination and by the scientist in the measurement properties of reproducibility and validity.

  13. Systematic review of methods for quantifying teamwork in the operating theatre

    PubMed Central

    Marshall, D.; Sykes, M.; McCulloch, P.; Shalhoub, J.; Maruthappu, M.

    2018-01-01

    Background Teamwork in the operating theatre is becoming increasingly recognized as a major factor in clinical outcomes. Many tools have been developed to measure teamwork. Most fall into two categories: self‐assessment by theatre staff and assessment by observers. A critical and comparative analysis of the validity and reliability of these tools is lacking. Methods MEDLINE and Embase databases were searched following PRISMA guidelines. Content validity was assessed using measurements of inter‐rater agreement, predictive validity and multisite reliability, and interobserver reliability using statistical measures of inter‐rater agreement and reliability. Quantitative meta‐analysis was deemed unsuitable. Results Forty‐eight articles were selected for final inclusion; self‐assessment tools were used in 18 and observational tools in 28, and there were two qualitative studies. Self‐assessment of teamwork by profession varied with the profession of the assessor. The most robust self‐assessment tool was the Safety Attitudes Questionnaire (SAQ), although this failed to demonstrate multisite reliability. The most robust observational tool was the Non‐Technical Skills (NOTECHS) system, which demonstrated both test–retest reliability (P > 0.09) and interobserver reliability (Rwg = 0.96). Conclusion Self‐assessment of teamwork by the theatre team was influenced by professional differences. Observational tools, when used by trained observers, circumvented this.

  14. Statistical tools for transgene copy number estimation based on real-time PCR.

    PubMed

    Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal

    2007-11-01

    As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, real-time PCR based transgene copy number estimation tends to be ambiguous and subjective, stemming from the lack of proper statistical analysis and data quality control needed to render a reliable copy number estimate with a prediction value. Despite recent progress in the statistical analysis of real-time PCR, few publications have integrated these advancements into real-time PCR based transgene copy number determination. Three experimental designs and four statistical models with integrated data quality control are presented. In the first method, external calibration curves are established for the transgene based on serially diluted templates. The Ct numbers from a control transgenic event and a putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two-group T-test procedures were combined to model the data from this design. In the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of the transgene was compared with that of the internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, transgene copy number is compared with the reference gene without a standard curve, based directly on fluorescence data. Two different multiple regression models were proposed to analyze the data, based on two different approaches to amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. These statistical methods make real-time PCR-based transgene copy number estimation more reliable and precise; proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied to other real-time PCR-based quantification assays, including transfection efficiency analysis and pathogen quantification.
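
    A minimal Python sketch of the first design, assuming hypothetical Ct values and a log2-scaled calibration curve (the quality-control models described above are omitted):

      import numpy as np
      from scipy import stats

      # Hypothetical calibration curve: Ct vs log2(template amount) for the transgene.
      log2_amount = np.array([0, 1, 2, 3, 4], dtype=float)
      ct_calib = np.array([30.1, 29.0, 28.1, 26.9, 26.0])
      slope, intercept, r, p, se = stats.linregress(log2_amount, ct_calib)

      # Replicate Ct values for a single-copy control event and a putative event.
      ct_control = np.array([28.9, 29.1, 29.0, 28.8])
      ct_event = np.array([27.9, 28.1, 28.0, 27.8])

      # A Ct shift of one calibration-slope unit corresponds to a doubling of copies.
      copy_ratio = 2 ** ((ct_event.mean() - ct_control.mean()) / slope)
      t, p_value = stats.ttest_ind(ct_control, ct_event)
      print(f"estimated copy ratio = {copy_ratio:.2f} (t-test p = {p_value:.4f})")

    The t-test supplies the statistical confirmation the abstract calls for: a copy ratio near 2 is only meaningful if the Ct difference between events is itself statistically significant.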

  15. Reliability and validity of the Outcome Expectations for Exercise Scale-2.

    PubMed

    Resnick, Barbara

    2005-10-01

    Development of a reliable and valid measure of outcome expectations for exercise for older adults will help establish the relationship between outcome expectations and exercise and facilitate the development of interventions to increase physical activity in older adults. The purpose of this study was to test the reliability and validity of the Outcome Expectations for Exercise-2 Scale (OEE-2), a 13-item measure with two subscales: positive OEE (POEE) and negative OEE (NOEE). The OEE-2 scale was given to 161 residents in a continuing-care retirement community. There was some evidence of validity based on confirmatory factor analysis, Rasch-analysis INFIT and OUTFIT statistics, and convergent validity and test criterion relationships. There was some evidence for reliability of the OEE-2 based on alpha coefficients, person- and item-separation reliability indexes, and R² values. Based on analyses, suggested revisions are provided for future use of the OEE-2. Although ongoing reliability and validity testing are needed, the OEE-2 scale can be used to identify older adults with low outcome expectations for exercise, and interventions can then be implemented to strengthen these expectations and improve exercise behavior.

  16. [The assessment of family resources and need for help: Construct validity and reliability of the Systematic Exploration and Process Inventory for health professionals in early childhood intervention services (SEVG)].

    PubMed

    Scharmanski, Sara; Renner, Ilona

    2016-12-01

    Health professionals in early childhood intervention and prevention make an important contribution by helping burdened families with young children cope with everyday life and child raising issues. A prerequisite for success is the health professionals' ability to tailor their services to the specific needs of families. The "Systematic Exploration and Process Inventory for health professionals in early childhood intervention services (SEVG)" can be used to identify each family's individual resources and needs, enabling a valid, reliable and objective assessment of the conditions and the process of counseling service. The present paper presents the statistical analyses that were used to confirm the reliability of the inventory. Based on the results of the reliability analysis and principal component analysis (PCA), the SEVG seems to be a reliable and objective inventory for assessing families' need for support. It also allows for calculation of average values of each scale. The development of valid and reliable assessments is essential to quality assurance and the professionalization of interventions in early childhood service. Copyright © 2016. Published by Elsevier GmbH.

  17. Test-retest reliability of the scale of participation in organized activities among adolescents in the Czech Republic and Slovakia.

    PubMed

    Bosakova, Lucia; Kolarcik, Peter; Bobakova, Daniela; Sulcova, Martina; Van Dijk, Jitse P; Reijneveld, Sijmen A; Geckova, Andrea Madarasova

    2016-04-01

    Participation in organized activities is related to a range of positive outcomes, but the way such participation is measured has not been scrutinized. Test-retest reliability is an important indicator of a scale's reliability, yet it has been assessed rarely and, for "The scale of participation in organized activities", has been lacking completely. This test-retest study is based on the Health Behaviour in School-aged Children study and is consistent with its methodology. We obtained data from 353 Czech (51.9 % boys) and 227 Slovak (52.9 % boys) primary school pupils, grades five and nine, who participated in this study in 2013. We used Cohen's kappa statistic and single measures of the intraclass correlation coefficient to estimate the test-retest reliability of all selected items in the sample, stratified by gender, age and country. We observed mostly large correlations between test and retest across the examined variables (κ ranged from 0.46 to 0.68). Test-retest reliability of the sum score of individual items showed substantial agreement (ICC = 0.64). The scale of participation in organized activities has an acceptable level of agreement, indicating good reliability.

  18. Reliability analysis of component of affination centrifugal 1 machine by using reliability engineering

    NASA Astrophysics Data System (ADS)

    Sembiring, N.; Ginting, E.; Darnello, T.

    2017-12-01

    A company producing refined sugar faces the problem that its production floor has not reached the required level of critical machine availability because the machines frequently break down. This results in sudden losses of production time and production opportunities. The problem can be addressed with the Reliability Engineering method, in which a statistical approach to historical failure data is used to identify the pattern of the underlying distribution. The method provides values for the reliability, failure rate, and availability of a machine over the scheduled maintenance interval. Distribution testing of the time-between-failures (MTTF) data shows that the flexible hose component follows a lognormal distribution, while the teflon cone lifting component follows a Weibull distribution. For the repair time (MTTR) data, the flexible hose component follows an exponential distribution, while the teflon cone lifting component follows a Weibull distribution. On a replacement schedule of every 720 hours, the flexible hose component has a reliability of 0.2451 and an availability of 0.9960; on a replacement schedule of every 1944 hours, the critical teflon cone lifting component has a reliability of 0.4083 and an availability of 0.9927.
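
    The distribution-fitting step can be sketched as follows in Python; the failure times and repair time below are hypothetical, not the paper's data:

      import numpy as np
      from scipy import stats

      # Hypothetical times between failures (hours) for a critical component.
      ttf = np.array([850, 1200, 1650, 2100, 2600, 3100, 3900, 4700], dtype=float)

      # Fit a two-parameter Weibull (location fixed at zero), as in the paper's
      # distribution testing, then evaluate reliability at the replacement interval.
      shape, loc, scale = stats.weibull_min.fit(ttf, floc=0)
      t_replace = 1944.0                                   # hours, from the abstract
      reliability = stats.weibull_min.sf(t_replace, shape, loc=loc, scale=scale)

      # Steady-state availability from mean time to failure and mean time to repair.
      mttf = stats.weibull_min.mean(shape, loc=loc, scale=scale)
      mttr = 18.0                                          # hypothetical repair time
      availability = mttf / (mttf + mttr)
      print(f"R({t_replace:.0f} h) = {reliability:.4f}, availability = {availability:.4f}")

    The same pattern applies to the lognormal and exponential cases reported above, swapping in stats.lognorm or stats.expon for the fit.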

  19. Unlawful Discrimination DEOCS 4.1 Construct Validity Summary

    DTIC Science & Technology

    2017-08-01

    Included is a review of the 4.0 description and items, followed by the proposed modifications to the factor. The current DEOCS (4.0) contains multiple... Officer (E7–E9): 586 (10.8%); Junior Officer (O1–O3): 474 (9%); Senior Officer (O4 and above): 391 (6.1%). Descriptive Statistics and Reliability: This section displays descriptive statistics for the items on the Unlawful Discrimination scale. All items had a range from 1 to 7 (strongly disagree to strongly agree)...

  20. Symposium on the Interface: Computing Science and Statistics (20th). Theme: Computationally Intensive Methods in Statistics Held in Reston, Virginia on April 20-23, 1988

    DTIC Science & Technology

    1988-08-20

    34 William A. Link, Patuxent Wildlife Research Center. "Increasing Reliability of Multiversion Fault-Tolerant Software Design by Modularization," Junryo... Multiversion Fault-Tolerant Software Design by Modularization, Junryo Miyashita, Department of Computer Science, California State University at San Bernardino. Fault... They shall be referred to as "multiversion fault-tolerant software design". One problem of developing multi-versions of a program is the high cost

  1. Incorporating Biological Knowledge into Evaluation of Causal Regulatory Hypothesis

    NASA Technical Reports Server (NTRS)

    Chrisman, Lonnie; Langley, Pat; Bay, Stephen; Pohorille, Andrew; DeVincenzi, D. (Technical Monitor)

    2002-01-01

    Biological data can be scarce and costly to obtain. The small number of samples available typically limits statistical power and makes reliable inference of causal relations extremely difficult. However, we argue that statistical power can be increased substantially by incorporating prior knowledge and data from diverse sources. We present a Bayesian framework that combines information from different sources and we show empirically that this lets one make correct causal inferences with small sample sizes that otherwise would be impossible.
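
    The abstract does not give the model details, but the effect of an informative prior on a small sample can be illustrated with a toy beta-binomial update in Python (all numbers hypothetical):

      from scipy import stats

      # Small new experiment: 4 of 6 samples show the hypothesized regulation.
      successes, trials = 4, 6

      # Flat prior (no outside knowledge) vs. an informative prior summarizing
      # related datasets and literature as roughly 12 prior "pseudo-observations".
      flat_posterior = stats.beta(1 + successes, 1 + trials - successes)
      informed_posterior = stats.beta(9 + successes, 3 + trials - successes)

      for name, post in [("flat", flat_posterior), ("informed", informed_posterior)]:
          lo, hi = post.ppf(0.025), post.ppf(0.975)
          print(f"{name}: mean={post.mean():.2f}, 95% interval=({lo:.2f}, {hi:.2f})")

    The informative prior narrows the posterior interval substantially, which is the sense in which combining sources increases effective statistical power with few samples.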

  2. Improvised Explosive Devices in Iraq, 2003-09: A Case of Operational Surprise and Institutional Response

    DTIC Science & Technology

    2011-04-01

    Finally, the reliability of the Defenselink figures is supported by the fact that they are generally consistent with summary statistics available...occasionally from official sources, such as Congressional Research Service reports, and detailed statistics available from unofficial sources...discretion in the size of its commitment, or in the degree of risk it must accept—as the Coalition leaders, U.S. forces had to do the business in Iraq

  3. Significant lexical relationships

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pedersen, T.; Kayaalp, M.; Bruce, R.

    Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
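
    The record does not name the test, but Fisher's exact test is the canonical exact conditional test for a 2x2 contingency table; a Python sketch for a hypothetical bigram count table:

      from scipy.stats import fisher_exact

      # Hypothetical 2x2 contingency table for the bigram "strong tea" in a corpus:
      # rows: first word is/is not "strong"; columns: second word is/is not "tea".
      table = [[12, 938],       # "strong tea", "strong <other>"
               [48, 99002]]     # "<other> tea", "<other> <other>"

      odds_ratio, p_value = fisher_exact(table, alternative="greater")
      print(f"odds ratio = {odds_ratio:.1f}, exact p = {p_value:.3g}")

    Unlike asymptotic tests such as chi-squared, the exact test makes no large-sample assumption, which matters for the rare events that dominate lexical data.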

  4. Quantitative metal magnetic memory reliability modeling for welded joints

    NASA Astrophysics Data System (ADS)

    Xing, Haiyan; Dang, Yongbin; Wang, Ben; Leng, Jiancheng

    2016-03-01

    Metal magnetic memory (MMM) testing has been widely used to detect welded joints. However, load levels, environmental magnetic fields, and measurement noise make MMM data dispersive and bring difficulty to quantitative evaluation. In order to promote the development of quantitative MMM reliability assessment, a new MMM model is presented for welded joints. Steel Q235 welded specimens are tested along the longitudinal and horizontal lines by a TSC-2M-8 instrument in tensile fatigue experiments. X-ray testing is carried out synchronously to verify the MMM results. It is found that MMM testing can detect hidden cracks earlier than X-ray testing. Moreover, the MMM gradient vector sum K_vs is sensitive to the damage degree, especially at the early and hidden damage stages. Considering the dispersion of MMM data, the statistical law of K_vs is investigated, which shows that K_vs obeys a Gaussian distribution. K_vs is therefore a suitable MMM parameter on which to build a reliability model of welded joints. Finally, a quantitative MMM reliability model is presented, based on improved stress-strength interference theory. It is shown that the reliability degree R gradually decreases as the residual life ratio T decreases, and the maximal error between the predicted reliability degree R_1 and the verified reliability degree R_2 is 9.15%. The method provides a novel tool for reliability testing and evaluation of welded joints in practical engineering.
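
    For readers unfamiliar with stress-strength interference, the basic Gaussian case computes reliability as the probability that strength exceeds stress. The Python sketch below uses hypothetical parameters, not those of the paper's improved model:

      from math import sqrt
      from scipy.stats import norm

      # Hypothetical Gaussian strength and stress (MPa) for a welded joint.
      mu_strength, sd_strength = 420.0, 25.0
      mu_stress, sd_stress = 340.0, 30.0

      # R = P(strength - stress > 0); the difference of independent Gaussians
      # is Gaussian, so R = Phi((mu_S - mu_s) / sqrt(sd_S^2 + sd_s^2)).
      z = (mu_strength - mu_stress) / sqrt(sd_strength**2 + sd_stress**2)
      reliability = norm.cdf(z)
      print(f"reliability = {reliability:.4f}")

    The Gaussian form of K_vs reported above is what makes this kind of closed-form interference calculation tractable.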

  5. Validation of the Lithuanian Version of the Speech Handicap Index.

    PubMed

    Pribuisiene, Ruta; Liutkevicius, Vykintas; Pribuisis, Kipras; Uloza, Virgilijus

    2018-05-01

    The objective was to perform cultural adaptation of the Speech Handicap Index (SHI) questionnaire into the Lithuanian language and to validate the adapted version. Cultural adaptation and validation of the translated Lithuanian version of the SHI (SHI-LT) was performed as described by the Scientific Advisory Committee of the Medical Outcomes Trust. The SHI-LT was completed by 46 patients after total laryngectomy and by 60 healthy subjects of the control group. Validity and reliability of the SHI-LT were evaluated. The SHI-LT showed a statistically significant high internal consistency and test-retest reliability (Cronbach's α = 0.96-0.98). Good validity of the SHI-LT was reflected by a statistically significant (P < 0.001) difference between the mean scores of the patient and control groups (74.7 ± 26.9 and 5.5 ± 6.5, respectively). No age or gender dependence of the SHI-LT was found (P > 0.05). A receiver operating characteristic test indicated that SHI-LT scores exceeding 17.0 points (cutoff value) distinguish patients from healthy controls, with a sensitivity of 97.8% and specificity of 95.0%. The SHI-LT is considered to be a valid and reliable speech assessment tool for Lithuanian-speaking patients after laryngectomy. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  6. Tinnitus and Hearing Survey: A Screening Tool to Differentiate Bothersome Tinnitus From Hearing Difficulties

    PubMed Central

    Griest, Susan; Zaugg, Tara L.; Thielman, Emily; Kaelin, Christine; Galvez, Gino; Carlson, Kathleen F.

    2015-01-01

    Purpose Individuals complaining of tinnitus often attribute hearing problems to the tinnitus. In such cases some (or all) of their reported “tinnitus distress” may in fact be caused by trouble communicating due to hearing problems. We developed the Tinnitus and Hearing Survey (THS) as a tool to rapidly differentiate hearing problems from tinnitus problems. Method For 2 of our research studies, we administered the THS twice (mean of 16.5 days between tests) to 67 participants who did not receive intervention. These data allow for measures of statistical validation of the THS. Results Reliability of the THS was good to excellent regarding internal consistency (α = .86–.94), test–retest reliability (r = .76–.83), and convergent validity between the Tinnitus Handicap Inventory (Newman, Jacobson, & Spitzer, 1996; Newman, Sandridge, & Jacobson, 1998) and the A (Tinnitus) subscale of the THS (r = .78). Factor analysis confirmed that the 2 subscales, A (Tinnitus) and B (Hearing), have strong internal structure, explaining 71.7% of the total variance, and low correlation with each other (r = .46), resulting in a small amount of shared variance (21%). Conclusion These results provide evidence that the THS is statistically validated and reliable for use in assisting patients and clinicians in quickly (and collaboratively) determining whether intervention for tinnitus is appropriate. PMID:25551458

  7. The script concordance test in radiation oncology: validation study of a new tool to assess clinical reasoning

    PubMed Central

    Lambert, Carole; Gagnon, Robert; Nguyen, David; Charlin, Bernard

    2009-01-01

    Background The Script Concordance test (SCT) is a reliable and valid tool to evaluate clinical reasoning in complex situations where experts' opinions may be divided. Scores reflect the degree of concordance between the performance of examinees and that of a reference panel of experienced physicians. The purpose of this study is to demonstrate SCT's usefulness in radiation oncology. Methods A 90-item radiation oncology SCT was administered to 155 participants. Three levels of experience were tested: medical students (n = 70), radiation oncology residents (n = 38) and radiation oncologists (n = 47). Statistical tests were performed to assess reliability and to document validity. Results After item optimization, the test comprised 30 cases and 70 questions. Cronbach alpha was 0.90. Mean scores were 51.62 (± 8.19) for students, 71.20 (± 9.45) for residents and 76.67 (± 6.14) for radiation oncologists. The difference between the three groups was statistically significant when compared by the Kruskal-Wallis test (p < 0.001). Conclusion The SCT is reliable and useful to discriminate among participants according to their level of experience in radiation oncology. It appears as a useful tool to document the progression of reasoning during residency training. PMID:19203358

  8. Reliability testing of a portfolio assessment tool for postgraduate family medicine training in South Africa

    PubMed Central

    Mash, Bob; Derese, Anselme

    2013-01-01

    Background Competency-based education and the validity and reliability of workplace-based assessment of postgraduate trainees have received increasing attention worldwide. Family medicine was recognised as a speciality in South Africa six years ago and a satisfactory portfolio of learning is a prerequisite to sit the national exit exam. A massive scaling up of the number of family physicians is needed in order to meet the health needs of the country. Aim The aim of this study was to develop a reliable, robust and feasible portfolio assessment tool (PAT) for South Africa. Methods Six raters each rated nine portfolios from the Stellenbosch University programme, using the PAT, to test for inter-rater reliability. This rating was repeated three months later to determine test–retest reliability. Following initial analysis and feedback the PAT was modified and the inter-rater reliability again assessed on nine new portfolios. An acceptable intra-class correlation was considered to be > 0.80. Results The total score was found to be reliable, with a coefficient of 0.92. For test–retest reliability, the difference in mean total score was 1.7%, which was not statistically significant. Amongst the subsections, only assessment of the educational meetings and the logbook showed reliability coefficients > 0.80. Conclusion This was the first attempt to develop a reliable, robust and feasible national portfolio assessment tool to assess postgraduate family medicine training in the South African context. The tool was reliable for the total score, but the low reliability of several sections in the PAT helped us to develop 12 recommendations regarding the use of the portfolio, the design of the PAT and the training of raters.

  9. Routine hospital data – is it good enough for trials? An example using England’s Hospital Episode Statistics in the SHIFT trial of Family Therapy vs. Treatment as Usual in adolescents following self-harm

    PubMed Central

    Graham, Elizabeth; Cottrell, David; Farrin, Amanda

    2018-01-01

    Background: Use of routine data sources within clinical research is increasing and is endorsed by the National Institute for Health Research to increase trial efficiencies; however there is limited evidence for its use in clinical trials, especially in relation to self-harm. One source of routine data, Hospital Episode Statistics, is collated and distributed by NHS Digital and contains details of admissions, outpatient, and Accident and Emergency attendances provided periodically by English National Health Service hospitals. We explored the reliability and accuracy of Hospital Episode Statistics, compared to data collected directly from hospital records, to assess whether it would provide complete, accurate, and reliable means of acquiring hospital attendances for self-harm – the primary outcome for the SHIFT (Self-Harm Intervention: Family Therapy) trial evaluating Family Therapy for adolescents following self-harm. Methods: Participant identifiers were linked to Hospital Episode Statistics Accident and Emergency, and Admissions data, and episodes combined to describe participants’ complete hospital attendance. Attendance data were initially compared to data previously gathered by trial researchers from pre-identified hospitals. Final comparison was conducted of subsequent attendances collected through Hospital Episode Statistics and researcher follow-up. Consideration was given to linkage rates; number and proportion of attendances retrieved; reliability of Accident and Emergency, and Admissions data; percentage of self-harm episodes recorded and coded appropriately; and percentage of required data items retrieved. Results: Participants were first linked to Hospital Episode Statistics with an acceptable match rate of 95%, identifying a total of 341 complete hospital attendances, compared to 139 reported by the researchers at the time. More than double the proportion of Hospital Episode Statistics Accident and Emergency episodes could not be classified in relation to self-harm (75%) compared to 34.9% of admitted episodes, and of overall attendances, 18% were classified as self-harm related and 20% not related, while ambiguity or insufficient information meant 62% were unclassified. Of 39 self-harm-related attendances reported by the researchers, Hospital Episode Statistics identified 24 (62%) as self-harm related while 15 (38%) were unclassified. Based on final data received, 1490 complete hospital attendances were identified and comparison to researcher follow-up found Hospital Episode Statistics underestimated the number of self-harm attendances by 37.2% (95% confidence interval 32.6%–41.9%). Conclusion: Advantages of routine data collection via NHS Digital included the acquisition of more comprehensive and timely trial outcome data, identifying more than double the number of hospital attendances than researchers. Disadvantages included ambiguity in the classification of self-harm relatedness. Our resulting primary outcome data collection strategy used routine data to identify hospital attendances supplemented by targeted researcher data collection for attendances requiring further self-harm classification. PMID:29498542

  10. Reliability analysis of the AOSpine thoracolumbar spine injury classification system by a worldwide group of naïve spinal surgeons.

    PubMed

    Kepler, Christopher K; Vaccaro, Alexander R; Koerner, John D; Dvorak, Marcel F; Kandziora, Frank; Rajasekaran, Shanmuganathan; Aarabi, Bizhan; Vialle, Luiz R; Fehlings, Michael G; Schroeder, Gregory D; Reinhold, Maximilian; Schnake, Klaus John; Bellabarba, Carlo; Cumhur Öner, F

    2016-04-01

    The aims of this study were (1) to demonstrate that the AOSpine thoracolumbar spine injury classification system can be reliably applied by an international group of surgeons and (2) to delineate those injury types which are difficult for spine surgeons to classify reliably. A previously described classification system of thoracolumbar injuries, which consists of a morphologic classification of the fracture, a grading system for the neurologic status, and relevant patient-specific modifiers, was applied to 25 cases by 100 spinal surgeons from across the world twice independently, in grading sessions 1 month apart. The results were analyzed for classification reliability using the Kappa coefficient (κ). The overall Kappa coefficient for all cases was 0.56, which represents moderate reliability. Kappa values describing interobserver agreement were 0.80 for type A injuries, 0.68 for type B injuries and 0.72 for type C injuries, all representing substantial reliability. The lowest level of agreement for specific subtypes was for fracture subtype A4 (Kappa = 0.19). Intraobserver analysis demonstrated an overall average Kappa statistic for subtype grading of 0.68, also representing substantial reproducibility. In a worldwide sample of spinal surgeons without previous exposure to the recently described AOSpine Thoracolumbar Spine Injury Classification System, we demonstrated moderate interobserver and substantial intraobserver reliability. These results suggest that most spine surgeons can apply this system to spine trauma patients as or more reliably than previously described systems.
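
    The Kappa coefficient reported above can be computed directly from paired ratings. Below is a minimal Python sketch of Cohen's kappa for two raters; the case labels and A/B/C categories are hypothetical illustrations, not data from the study.

    ```python
    import numpy as np

    def cohen_kappa(rater1, rater2, categories):
        """Cohen's kappa for two raters assigning categorical labels."""
        n = len(rater1)
        idx = {c: i for i, c in enumerate(categories)}
        table = np.zeros((len(categories), len(categories)))
        for a, b in zip(rater1, rater2):
            table[idx[a], idx[b]] += 1
        p_obs = np.trace(table) / n                              # observed agreement
        p_exp = (table.sum(axis=0) @ table.sum(axis=1)) / n**2   # chance agreement
        return (p_obs - p_exp) / (1 - p_exp)

    # Hypothetical example: two surgeons typing 6 fractures as A, B, or C.
    print(cohen_kappa(list("AABBCC"), list("AABBCA"), ["A", "B", "C"]))   # 0.75
    ```

    Values are conventionally read against the Landis and Koch benchmarks used in the abstract (0.41-0.60 moderate, 0.61-0.80 substantial).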

  11. Inter-method reliability of paper surveys and computer assisted telephone interviews in a randomized controlled trial of yoga for low back pain

    PubMed Central

    2014-01-01

    Background Little is known about the reliability of different methods of survey administration in low back pain trials. This analysis was designed to determine the reliability of responses to self-administered paper surveys compared to computer assisted telephone interviews (CATI) for the primary outcomes of pain intensity and back-related function, and secondary outcomes of patient satisfaction, SF-36, and global improvement among participants enrolled in a study of yoga for chronic low back pain. Results Pain intensity, back-related function, and both physical and mental health components of the SF-36 showed excellent reliability at all three time points; ICC scores ranged from 0.82 to 0.98. Pain medication use showed good reliability; kappa statistics ranged from 0.68 to 0.78. Patient satisfaction had moderate to excellent reliability; ICC scores ranged from 0.40 to 0.86. Global improvement showed poor reliability at 6 weeks (ICC = 0.24) and 12 weeks (ICC = 0.10). Conclusion CATI shows excellent reliability for primary outcomes and at least some secondary outcomes when compared to self-administered paper surveys in a low back pain yoga trial. Having two reliable options for data collection may be helpful to increase response rates for core outcomes in back pain trials. Trial registration ClinicalTrials.gov: NCT01761617. Date of trial registration: December 4, 2012. PMID:24716775

  12. Evaluation of tools used to measure calcium and/or dairy consumption in adults.

    PubMed

    Magarey, Anthea; Baulderstone, Lauren; Yaxley, Alison; Markow, Kylie; Miller, Michelle

    2015-05-01

    To identify and critique tools for the assessment of Ca and/or dairy intake in adults, in order to ascertain the most accurate and reliable tools available. A systematic review of the literature was conducted using defined inclusion and exclusion criteria. Articles reporting on originally developed tools or testing the reliability or validity of existing tools that measure Ca and/or dairy intake in adults were included. Author-defined criteria for reporting reliability and validity properties were applied. Setting: studies conducted in Western countries. Subjects: adults. Thirty papers, utilising thirty-six tools assessing intake of dairy, Ca or both, were identified. Reliability testing was conducted on only two dairy and five Ca tools, with results indicating that only one dairy and two Ca tools were reliable. Validity testing was conducted for all but four Ca-only tools. There was heavy reliance in validity testing on lower-order tests such as correlation, and a failure to differentiate between statistically significant and clinically meaningful differences. Results of the validity testing suggest one dairy and five Ca tools are valid. Thus one tool was considered both reliable and valid for the assessment of dairy intake and only two tools proved reliable and valid for the assessment of Ca intake. While several tools are reliable and valid, their application across adult populations is limited by the populations in which they were tested. These results indicate a need for tools that assess Ca and/or dairy intake in adults to be rigorously tested for reliability and validity.

  13. Cognitive and attitudinal predictors related to graphing achievement among pre-service elementary teachers

    NASA Astrophysics Data System (ADS)

    Szyjka, Sebastian P.

    The purpose of this study was to determine the extent to which six cognitive and attitudinal variables predicted pre-service elementary teachers' performance on line graphing. Predictors included Illinois teacher education basic skills sub-component scores in reading comprehension and mathematics, logical thinking performance scores, as well as measures of attitudes toward science, mathematics and graphing. This study also determined the strength of the relationship between each prospective predictor variable and the line graphing performance variable, as well as the extent to which measures of attitude towards science, mathematics and graphing mediated relationships between scores on mathematics, reading, logical thinking and line graphing. Ninety-four pre-service elementary education teachers enrolled in two different elementary science methods courses during the spring 2009 semester at Southern Illinois University Carbondale participated in this study. Each subject completed five different instruments designed to assess science, mathematics and graphing attitudes as well as logical thinking and graphing ability. Sixty subjects provided copies of primary basic skills score reports that listed subset scores for both reading comprehension and mathematics. The remaining scores were supplied by a faculty member who had access to a database from which the scores were drawn. Seven subjects, whose scores could not be found, were eliminated from final data analysis. Confirmatory factor analysis (CFA) was conducted in order to establish validity and reliability of the Questionnaire of Attitude Toward Line Graphs in Science (QALGS) instrument. CFA tested the statistical hypothesis that the five main factor structures within the Questionnaire of Attitude Toward Statistical Graphs (QASG) would be maintained in the revised QALGS. Stepwise Regression Analysis with backward elimination was conducted in order to generate a parsimonious and precise predictive model. This procedure allowed the researcher to explore the relationships among the affective and cognitive variables that were included in the regression analysis. The results for CFA indicated that the revised QALGS measure was sound in its psychometric properties when tested against the QASG. Reliability statistics indicated that the overall reliability for the 32 items in the QALGS was .90. The learning preferences construct had the lowest reliability (.67), while enjoyment (.89), confidence (.86) and usefulness (.77) constructs had moderate to high reliabilities. The first four measurement models fit the data well as indicated by the appropriate descriptive and statistical indices. However, the fifth measurement model did not fit the data well statistically, and only fit well with two descriptive indices. The results addressing the research question indicated that mathematical and logical thinking ability were significant predictors of line graph performance among the remaining group of variables. These predictors accounted for 41% of the total variability on the line graph performance variable. Partial correlation coefficients indicated that mathematics ability accounted for 20.5% of the variance on the line graphing performance variable when removing the effect of logical thinking. The logical thinking variable accounted for 4.7% of the variance on the line graphing performance variable when removing the effect of mathematics ability.
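
    Reliability figures such as the .90 reported for the 32 QALGS items are conventionally Cronbach's alpha values, which can be computed directly from the item-score matrix. A minimal sketch with made-up Likert responses, not the study's data:

    ```python
    import numpy as np

    def cronbach_alpha(scores):
        """Cronbach's alpha; `scores` is an (n_respondents, n_items) array."""
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]
        item_var = scores.var(axis=0, ddof=1).sum()   # sum of item variances
        total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
        return k / (k - 1) * (1 - item_var / total_var)

    # Hypothetical 5-point Likert responses: 4 respondents, 3 items.
    print(cronbach_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]))
    ```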

  14. Calibration of LRFR live load factors using weigh-in-motion data.

    DOT National Transportation Integrated Search

    2006-06-01

    The Load and Resistance Factor Rating (LRFR) code for load rating bridges is based on factors calibrated from structural load and resistance statistics to achieve a more uniform level of reliability for all bridges. The live-load factors in the LR...

  15. Monolithic ceramic analysis using the SCARE program

    NASA Technical Reports Server (NTRS)

    Manderscheid, Jane M.

    1988-01-01

    The Structural Ceramics Analysis and Reliability Evaluation (SCARE) computer program calculates the fast fracture reliability of monolithic ceramic components. The code is a post-processor to the MSC/NASTRAN general purpose finite element program. The SCARE program automatically accepts the MSC/NASTRAN output necessary to compute reliability. This includes element stresses, temperatures, volumes, and areas. The SCARE program computes two-parameter Weibull strength distributions from input fracture data for both volume and surface flaws. The distributions can then be used to calculate the reliability of geometrically complex components subjected to multiaxial stress states. Several fracture criteria and flaw types are available for selection by the user, including out-of-plane crack extension theories. The theoretical basis for the reliability calculations was proposed by Batdorf. These models combine linear elastic fracture mechanics (LEFM) with Weibull statistics to provide a mechanistic failure criterion. Other fracture theories included in SCARE are the normal stress averaging technique and the principle of independent action. The objective of this presentation is to summarize these theories, including their limitations and advantages, and to provide a general description of the SCARE program, along with example problems.
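
    A deliberately simplified, uniaxial sketch of the Weibull weakest-link calculation underlying SCARE is given below; the element stresses, volumes, and Weibull parameters are invented, and the actual program handles multiaxial stress states, surface flaws, and the several fracture criteria described above.

    ```python
    import numpy as np

    def fast_fracture_reliability(stresses, volumes, m, sigma0):
        """Two-parameter Weibull weakest-link reliability for volume flaws
        under uniaxial tension: R = exp(-sum V_e * (sigma_e / sigma0)**m)."""
        s = np.maximum(np.asarray(stresses, float), 0.0)   # ignore compressive stresses
        risk = np.sum(np.asarray(volumes, float) * (s / sigma0) ** m)
        return np.exp(-risk)

    # Hypothetical finite-element data: stresses in MPa, volumes in mm^3,
    # Weibull modulus m and scale parameter sigma0 (units simplified).
    print(fast_fracture_reliability([250, 180, 90], [2.0, 3.5, 5.0], m=10, sigma0=400))
    ```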

  16. Validity, Reliability, and the Questionable Role of Psychometrics in Plastic Surgery

    PubMed Central

    2014-01-01

    Summary: This report examines the meaning of validity and reliability and the role of psychometrics in plastic surgery. Study titles increasingly include the word “valid” to support the authors’ claims. Studies by other investigators may be labeled “not validated.” Validity simply refers to the ability of a device to measure what it intends to measure. Validity is not an intrinsic test property. It is a relative term most credibly assigned by the independent user. Similarly, the word “reliable” is subject to interpretation. In psychometrics, its meaning is synonymous with “reproducible.” The definitions of valid and reliable are analogous to accuracy and precision. Reliability (both the reliability of the data and the consistency of measurements) is a prerequisite for validity. Outcome measures in plastic surgery are intended to be surveys, not tests. The role of psychometric modeling in plastic surgery is unclear, and this discipline introduces difficult jargon that can discourage investigators. Standard statistical tests suffice. The unambiguous term “reproducible” is preferred when discussing data consistency. Study design and methodology are essential considerations when assessing a study’s validity. PMID:25289354

  17. Modeling Sensor Reliability in Fault Diagnosis Based on Evidence Theory

    PubMed Central

    Yuan, Kaijuan; Xiao, Fuyuan; Fei, Liguo; Kang, Bingyi; Deng, Yong

    2016-01-01

    Sensor data fusion plays an important role in fault diagnosis. Dempster–Shafer (D-S) evidence theory is widely used in fault diagnosis, since it is efficient for combining evidence from different sensors. However, in situations where the evidence is highly conflicting, it may yield a counterintuitive result. To address the issue, a new method is proposed in this paper. Not only the static sensor reliability, but also the dynamic sensor reliability is taken into consideration. The evidence distance function and the belief entropy are combined to obtain the dynamic reliability of each sensor report. A weighted averaging method is adopted to modify the conflicting evidence by assigning different weights to evidence according to sensor reliability. The proposed method has better performance in conflict management and fault diagnosis because the information volume of each sensor report is taken into consideration. An application in fault diagnosis based on sensor fusion is illustrated to show the efficiency of the proposed method. The results show that the proposed method improves the accuracy of fault diagnosis from 81.19% to 89.48% compared to the existing methods. PMID:26797611
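
    Dempster's rule of combination, the operation at the core of the method above, can be sketched in a few lines. The fault hypotheses and mass assignments below are hypothetical; the paper's weighted-averaging step, which adjusts masses by sensor reliability before combining, is noted only in a comment.

    ```python
    from itertools import product

    def dempster_combine(m1, m2):
        """Combine two mass functions (dicts: frozenset -> mass) with
        Dempster's rule; also returns the conflict mass K (assumes K < 1).
        In the paper's method, masses would first be reweighted by sensor
        reliability before this combination step."""
        fused, conflict = {}, 0.0
        for (a, ma), (b, mb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                fused[inter] = fused.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
        return {s: v / (1.0 - conflict) for s, v in fused.items()}, conflict

    # Hypothetical sensor reports over fault hypotheses F1 and F2.
    m1 = {frozenset({"F1"}): 0.8, frozenset({"F1", "F2"}): 0.2}
    m2 = {frozenset({"F2"}): 0.6, frozenset({"F1", "F2"}): 0.4}
    print(dempster_combine(m1, m2))
    ```

    The conflict mass K is what makes highly conflicting evidence problematic: as K approaches 1, the normalization amplifies whatever little agreement remains, producing the counterintuitive results the paper addresses.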

  18. Test-retest reliability and practice effects of a rapid screen of mild traumatic brain injury.

    PubMed

    De Monte, Veronica Eileen; Geffen, Gina Malke; Kwapil, Karleigh

    2005-07-01

    Test-retest reliabilities and practice effects of measures from the Rapid Screen of Concussion (RSC), in addition to the Digit Symbol Substitution Test (Digit Symbol), were examined. Twenty-five male participants were tested three times, with testing sessions scheduled a week apart. The test-retest reliability estimates for most measures were reasonably good, ranging from .79 to .97. An exception was the delayed word recall test, which had a reliability estimate of .66 for the first retest and .59 for the second retest. Practice effects were evident from Time 1 to Time 2 on the sentence comprehension and delayed recall subtests of the RSC, Digit Symbol, and a composite score. A practice effect of the same magnitude was also found from Time 2 to Time 3 on Digit Symbol, delayed recall, and the composite score. Statistics on measures for both the first and second retest intervals, with associated practice effects, are presented to enable the calculation of reliable change indices (RCI). The RCI may be used to assess any improvement in cognitive functioning after mild traumatic brain injury.
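
    A reliable change index of the kind these statistics enable is typically computed with the Jacobson-Truax formula, optionally adjusted for the mean practice effect; a minimal sketch (the scores, baseline SD, and reliability below are hypothetical, not the RSC values):

    ```python
    import math

    def reliable_change_index(x1, x2, sd_baseline, r_xx, practice=0.0):
        """Practice-adjusted RCI: change score divided by the standard error
        of the difference, SEdiff = sqrt(2) * SD * sqrt(1 - r_xx)."""
        se_meas = sd_baseline * math.sqrt(1.0 - r_xx)   # standard error of measurement
        se_diff = math.sqrt(2.0) * se_meas
        return (x2 - x1 - practice) / se_diff

    # Hypothetical retest: |RCI| > 1.96 suggests change beyond measurement error.
    print(reliable_change_index(x1=42, x2=51, sd_baseline=6.0, r_xx=0.79, practice=3.0))
    ```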

  19. Effect of bow-type initial imperfection on reliability of minimum-weight, stiffened structural panels

    NASA Technical Reports Server (NTRS)

    Stroud, W. Jefferson; Krishnamurthy, Thiagaraja; Sykes, Nancy P.; Elishakoff, Isaac

    1993-01-01

    Computations were performed to determine the effect of an overall bow-type imperfection on the reliability of structural panels under combined compression and shear loadings. A panel's reliability is the probability that it will perform the intended function - in this case, carry a given load without buckling or exceeding in-plane strain allowables. For a panel loaded in compression, a small initial bow can cause large bending stresses that reduce both the buckling load and the load at which strain allowables are exceeded; hence, the bow reduces the reliability of the panel. In this report, analytical studies on two stiffened panels quantified that effect. The bow is in the shape of a half-sine wave along the length of the panel. The size e of the bow at panel midlength is taken to be the single random variable. Several probability density distributions for e are examined to determine the sensitivity of the reliability to details of the bow statistics. In addition, the effects of quality control are explored with truncated distributions.
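
    The sensitivity of reliability to the assumed bow statistics can be illustrated with a small Monte Carlo experiment. Everything in the sketch below is hypothetical (the truncated normal distribution for e, the linear knockdown of capacity with bow size, and the load level); it is not the report's analysis method, only an illustration of how a reliability estimate responds to the statistics of e.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    def panel_reliability(n=200_000, e_sd=2.0, e_max=5.0,
                          capacity0=1.25, knockdown=0.08, load=1.0):
        """Monte Carlo estimate of P(panel carries `load`) with a random
        bow e (mm) ~ Normal(0, e_sd) truncated at +/- e_max; capacity uses
        a hypothetical linear knockdown per mm of bow."""
        e = rng.normal(0.0, e_sd, size=n)
        e = e[np.abs(e) <= e_max]                        # truncated distribution
        capacity = capacity0 * (1.0 - knockdown * np.abs(e))
        return np.mean(capacity >= load)

    print(panel_reliability())           # baseline bow statistics
    print(panel_reliability(e_sd=3.0))   # a wider bow distribution lowers reliability
    ```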

  20. Defining physicians' readiness to screen and manage intimate partner violence in Greek primary care settings.

    PubMed

    Papadakaki, Maria; Prokopiadou, Dimitra; Petridou, Eleni; Kogevinas, Manolis; Lionis, Christos

    2012-06-01

    The current article aims to translate the PREMIS (Physician Readiness to Manage Intimate Partner Violence) survey into the Greek language and test its validity and reliability in a sample of primary care physicians. The validation study was conducted in 2010 and involved all the general practitioners serving two adjacent prefectures of Greece (n = 80). Maximum-likelihood factor analysis (MLF) was used to extract key survey factors. The instrument was further assessed for the following psychometric properties: (a) scale reliability, (b) item-specific reliability, (c) test-retest reliability, (d) scale construct validity, and (e) internal predictive validity. The MLF analysis of 23 opinion items revealed a seven-factor solution (preparation, constraint, workplace issues, screening, self-efficacy, alcohol/drugs, victim understanding), which was statistically sound (p = .293). Most of the newly derived scales displayed satisfactory internal consistency (α ≥ .60), high item-specific reliability, strong construct, and internal predictive validity (F = 2.82; p = .004), and high repeatability when retested with 20 individuals (intraclass correlation coefficient [ICC] > .70). The tool was found appropriate to facilitate the identification of competence deficits and the evaluation of training initiatives.

  1. Reliability of joint count assessment in rheumatoid arthritis: a systematic literature review.

    PubMed

    Cheung, Peter P; Gossec, Laure; Mak, Anselm; March, Lyn

    2014-06-01

    Joint counts are central to the assessment of rheumatoid arthritis (RA) but reliability is an issue. To evaluate the reliability and agreement of joint counts (intra-observer and inter-observer) by health care professionals (physicians, nurses, and metrologists) and patients in RA, and the impact of training and standardization on joint count reliability through a systematic literature review. Articles reporting joint count reliability or agreement in RA in PubMed, EMBase, and the Cochrane library between 1960 and 2012 were selected. Data were extracted regarding tender joint counts (TJCs) and swollen joint counts (SJCs) derived by physicians, metrologists, or patients for intra-observer and inter-observer reliability. In addition, methods and effects of training or standardization were extracted. Statistics expressing reliability such as intraclass correlation coefficients (ICCs) were extracted. Data analysis was primarily descriptive due to high heterogeneity. Twenty-eight studies on health care professionals (HCP) and 20 studies on patients were included. Intra-observer reliability for TJCs and SJCs was good for HCPs and patients (range of ICC: 0.49-0.98). Inter-observer reliability between HCPs for TJCs was higher than for SJCs (range of ICC: 0.64-0.88 vs. 0.29-0.98). Patient inter-observer reliability with HCPs as comparators was better for TJCs (range of ICC: 0.31-0.91) compared to SJCs (0.16-0.64). Nine studies (7 with HCPs and 2 with patients) evaluated consensus or training, with improvement in reliability of TJCs but conflicting evidence for SJCs. Intra- and inter-observer reliability was high for TJCs for HCPs and patients: among all groups, reliability was better for TJCs than SJCs. Inter-observer reliability of SJCs was poorer for patients than HCPs. Data were inconclusive regarding the potential for training to improve SJC reliability. Overall, the results support further evaluation for patient-reported joint counts as an outcome measure. © 2013 Published by Elsevier Inc.

  2. Experimental control in software reliability certification

    NASA Technical Reports Server (NTRS)

    Trammell, Carmen J.; Poore, Jesse H.

    1994-01-01

    There is growing interest in software 'certification', i.e., confirmation that software has performed satisfactorily under a defined certification protocol. Regulatory agencies, customers, and prospective reusers all want assurance that a defined product standard has been met. In other industries, products are typically certified under protocols in which random samples of the product are drawn, tests characteristic of operational use are applied, analytical or statistical inferences are made, and products meeting a standard are 'certified' as fit for use. A warranty statement is often issued upon satisfactory completion of a certification protocol. This paper outlines specific engineering practices that must be used to preserve the validity of the statistical certification testing protocol. The assumptions associated with a statistical experiment are given, and their implications for statistical testing of software are described.
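
    One standard statistical protocol of this kind is the binomial demonstration test: if n test cases drawn from the operational profile all run failure-free, per-run reliability R is demonstrated at confidence C provided R^n <= 1 - C. A minimal sketch, with illustrative targets rather than figures from the paper:

    ```python
    import math

    def failure_free_runs(reliability, confidence):
        """Failure-free operational tests needed so that surviving all of
        them demonstrates `reliability` at `confidence`: n >= ln(1-C)/ln(R)."""
        return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

    print(failure_free_runs(0.999, 0.95))    # about 2995 failure-free test cases
    print(failure_free_runs(0.9999, 0.95))   # roughly tenfold more for one extra nine
    ```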

  3. A comparison of InVivoStat with other statistical software packages for analysis of data generated from animal experiments.

    PubMed

    Clark, Robin A; Shoaib, Mohammed; Hewitt, Katherine N; Stanford, S Clare; Bate, Simon T

    2012-08-01

    InVivoStat is a free-to-use statistical software package for analysis of data generated from animal experiments. The package is designed specifically for researchers in the behavioural sciences, where exploiting the experimental design is crucial for reliable statistical analyses. This paper compares the analysis of three experiments conducted using InVivoStat with other widely used statistical packages: SPSS (V19), PRISM (V5), UniStat (V5.6) and Statistica (V9). We show that InVivoStat provides results that are similar to those from the other packages and, in some cases, are more advanced. This investigation provides evidence of further validation of InVivoStat and should strengthen users' confidence in this new software package.

  4. Interhemispheric Inhibition Measurement Reliability in Stroke: A Pilot Study

    PubMed Central

    Cassidy, Jessica M.; Chu, Haitao; Chen, Mo; Kimberley, Teresa J.; Carey, James R.

    2016-01-01

    Objective Reliable transcranial magnetic stimulation (TMS) measures for probing corticomotor excitability are important when assessing the physiological effects of non-invasive brain stimulation. The primary objective of this study was to examine test-retest reliability of an interhemispheric inhibition (IHI) index measurement in stroke. Materials and Methods Ten subjects with chronic stroke (≥ 6 months) completed two IHI testing sessions per week for three weeks (six testing sessions total). A single investigator measured IHI in the contra-to-ipsilesional primary motor cortex direction and in the opposite direction using bilateral paired-pulse TMS. The two sessions within each week were separated by 24 hours, with a 1-week washout period between testing weeks. To determine whether the motor-evoked potential (MEP) quantification method affected measurement reliability, IHI indices were computed from both MEP amplitude and area responses. Reliability was assessed with two-way, mixed intraclass correlation coefficients (ICC(3,k)). Standard error of measurement and minimal detectable difference statistics were also determined. Results With the exception of the initial testing week, IHI indices measured in the contra-to-ipsilesional hemisphere direction demonstrated moderate to excellent reliability (ICC = 0.725 to 0.913). Ipsi-to-contralesional IHI indices showed poor or invalid reliability estimates throughout the three-week testing duration (ICC = −1.153 to 0.105). The overlap of ICC 95% confidence intervals suggested that IHI indices using MEP amplitude vs. area measures did not differ with respect to reliability. Conclusions IHI indices demonstrated varying magnitudes of reliability irrespective of MEP quantification method. Several strategies for improving IHI index measurement reliability are discussed. PMID:27333364
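
    The ICC(3,k), standard error of measurement, and minimal detectable difference statistics named above all derive from a two-way ANOVA decomposition of the subjects-by-sessions score matrix. A minimal sketch, using a made-up 4 x 3 matrix in place of the study's IHI indices (the pooled-SD convention for the SEM used here is one common choice):

    ```python
    import numpy as np

    def icc3(data):
        """ICC(3,1) and ICC(3,k) (two-way mixed, consistency) from an
        (n_subjects, k_sessions) matrix, plus SEM and the 95% minimal
        detectable difference."""
        y = np.asarray(data, float)
        n, k = y.shape
        grand = y.mean()
        ss_subj = k * ((y.mean(axis=1) - grand) ** 2).sum()
        ss_sess = n * ((y.mean(axis=0) - grand) ** 2).sum()
        ss_err = ((y - grand) ** 2).sum() - ss_subj - ss_sess
        bms = ss_subj / (n - 1)               # between-subjects mean square
        ems = ss_err / ((n - 1) * (k - 1))    # residual mean square
        icc31 = (bms - ems) / (bms + (k - 1) * ems)
        icc3k = (bms - ems) / bms
        sem = y.std(ddof=1) * np.sqrt(1.0 - icc31)
        return icc31, icc3k, sem, 1.96 * np.sqrt(2.0) * sem

    # Hypothetical IHI indices for 4 subjects across 3 sessions.
    print(icc3([[0.62, 0.58, 0.65], [0.80, 0.77, 0.81],
                [0.45, 0.50, 0.48], [0.70, 0.66, 0.73]]))
    ```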

  5. Reliability of abstracting performance measures: results of the cardiac rehabilitation referral and reliability (CR3) project.

    PubMed

    Thomas, Randal J; Chiu, Jensen S; Goff, David C; King, Marjorie; Lahr, Brian; Lichtman, Steven W; Lui, Karen; Pack, Quinn R; Shahriary, Melanie

    2014-01-01

    Assessment of the reliability of performance measure (PM) abstraction is an important step in PM validation. Reliability has not been previously assessed for abstracting PMs for the referral of patients to cardiac rehabilitation (CR) and secondary prevention (SP) programs. To help validate these PMs, we carried out a multicenter assessment of their reliability. Hospitals and clinical practices from around the United States were invited to participate in the Cardiac Rehabilitation Referral Reliability (CR3) Project. Twenty-nine hospitals and 23 outpatient centers expressed interest in participating. Seven hospitals and 6 outpatient centers met participation criteria and submitted completed data. Site coordinators identified 35 patients whose charts were reviewed by 2 site abstractors twice, 1 week apart. Percent agreement and the Cohen κ statistic were used to describe intra- and interabstractor reliability for patient eligibility for CR/SP, patient exceptions for CR/SP referral, and documented referral to CR/SP. Results were obtained from within-site data, as well as from pooled data of all inpatient and all outpatient sites. We found that intra-abstractor reliability reflected excellent repeatability (≥ 90% agreement; κ ≥ 0.75) for ratings of CR/SP eligibility, exceptions, and referral, both from pooled and site-specific analyses of inpatient and outpatient data. Similarly, the interabstractor agreement from pooled analysis ranged from good to excellent for the 3 items, although with slightly lower measures of reliability. Abstraction of PMs for CR/SP referral has high reliability, supporting the use of these PMs in quality improvement initiatives aimed at increasing CR/SP delivery to patients with cardiovascular disease.

  6. Multisite Reliability of Cognitive BOLD Data

    PubMed Central

    Brown, Gregory G.; Mathalon, Daniel H.; Stern, Hal; Ford, Judith; Mueller, Bryon; Greve, Douglas N.; McCarthy, Gregory; Voyvodic, Jim; Glover, Gary; Diaz, Michele; Yetter, Elizabeth; Burak Ozyurt, I.; Jorgensen, Kasper W.; Wible, Cynthia G.; Turner, Jessica A.; Thompson, Wesley K.; Potkin, Steven G.

    2010-01-01

    Investigators perform multi-site functional magnetic resonance imaging studies to increase statistical power, to enhance generalizability, and to improve the likelihood of sampling relevant subgroups. Yet undesired site variation in imaging methods could offset these potential advantages. We used variance components analysis to investigate sources of variation in the blood oxygen level dependent (BOLD) signal across four 3T magnets in voxelwise and region of interest (ROI) analyses. Eighteen participants traveled to four magnet sites to complete eight runs of a working memory task involving emotional or neutral distraction. Person variance was more than 10 times larger than site variance for five of six ROIs studied. Person-by-site interactions, however, contributed sizable unwanted variance to the total. Averaging over runs increased between-site reliability, with many voxels showing good to excellent between-site reliability when eight runs were averaged and regions of interest showing fair to good reliability. Between-site reliability depended on the specific functional contrast analyzed in addition to the number of runs averaged. Although median effect size was correlated with between-site reliability, dissociations were observed for many voxels. Brain regions where the pooled effect size was large but between-site reliability was poor were associated with reduced individual differences. Brain regions where the pooled effect size was small but between-site reliability was excellent were associated with a balance of participants who displayed consistently positive or consistently negative BOLD responses. Although between-site reliability of BOLD data can be good to excellent, acquiring highly reliable data requires robust activation paradigms, ongoing quality assurance, and careful experimental control. PMID:20932915

  7. [Reliability and reproducibility of the Fitzpatrick phototype scale for skin sensitivity to ultraviolet light].

    PubMed

    Sánchez, Guillermo; Nova, John; Arias, Nilsa; Peña, Bibiana

    2008-12-01

    The Fitzpatrick phototype scale has been used to determine skin sensitivity to ultraviolet light. The reliability of this scale in estimating sensitivity permits risk evaluation of skin cancer based on phototype. Reliability and changes in intra- and inter-observer concordance were determined for the Fitzpatrick phototype scale after the assessment methods for establishing the phototype were standardized. An analytical study of intra- and inter-observer concordance was performed. The Fitzpatrick phototype scale was standardized using focus group methodology. To determine intra- and inter-observer agreement, the weighted kappa statistical method was applied. The standardization effect was measured using the equal kappa contrast hypothesis and the Wald test for dependent measurements. The phototype scale was applied to 155 patients over 15 years of age who were assessed four times by two independent observers. The sample was drawn from patients of the Centro Dermatológico Federico Lleras Acosta. During the pre-standardization phase, the baseline and six-week inter-observer weighted kappa values were 0.31 and 0.40, respectively. The intra-observer kappa values for observers A and B were 0.47 and 0.51, respectively. After the standardization process, the baseline and six-week inter-observer weighted kappa values were 0.77 and 0.82, respectively. Intra-observer kappa coefficients for observers A and B were 0.78 and 0.82. Statistically significant differences were found between coefficients before and after standardization (p<0.001) in all comparisons. Following a standardization exercise, the Fitzpatrick phototype scale yielded reliable, reproducible, and consistent results.

  8. Inter-rater reliability of the German version of the Nurses' Global Assessment of Suicide Risk scale.

    PubMed

    Kozel, Bernd; Grieser, Manuela; Abderhalden, Christoph; Cutcliffe, John R

    2016-10-01

    In comparison to the general population, the suicide rates of psychiatric inpatient populations in Germany and Switzerland are very high. An important preventive contribution to lowering suicide rates in mental health care is to ensure that the suicide risk of psychiatric inpatients is assessed as accurately as possible. While risk-assessment instruments can serve an important function in determining such risk, very few have been translated into German. Therefore, in the present study, we report on the German version of the Nurses' Global Assessment of Suicide Risk (NGASR) scale. After translating the original instrument into German and pretesting the German version, we tested the inter-rater reliability of the instrument. Twelve video case studies were evaluated by 13 raters with the NGASR scale in a 'laboratory' trial. In each case, the observers' agreement was calculated for the single items, the overall scale, the risk levels, and the sum scores. Statistical analysis was conducted with kappa and AC1 statistics for the dichotomous items and the overall scale. High to very high observer agreement (AC1: 0.62-1.00, kappa: 0.00-1.00) was determined for 16 items of the German version of the NGASR scale. We conclude that the German version of the NGASR scale is a reliable instrument for evaluating risk factors for suicide. Reliable application in clinical practice appears to be enhanced by training in the use of the instrument and appropriate implementation instructions. © 2016 Australian College of Mental Health Nurses Inc.
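
    For two raters and a dichotomous rating, Gwet's AC1 replaces kappa's chance-agreement term with one based on the overall prevalence of positive ratings, which keeps the statistic stable when ratings are highly skewed (one reason the kappa range above can reach 0.00 while AC1 stays at 0.62 or higher). A minimal two-rater sketch; the study itself used 13 raters, for which a multi-rater generalization of AC1 exists, and the 0/1 ratings below are hypothetical.

    ```python
    def gwet_ac1(r1, r2):
        """Gwet's AC1 for two raters and dichotomous (0/1) ratings:
        AC1 = (Pa - Pe) / (1 - Pe), with Pe = 2 * pi * (1 - pi)."""
        n = len(r1)
        p_obs = sum(a == b for a, b in zip(r1, r2)) / n   # observed agreement
        pi = (sum(r1) + sum(r2)) / (2 * n)                # mean prevalence of '1'
        p_exp = 2 * pi * (1 - pi)                         # chance agreement
        return (p_obs - p_exp) / (1 - p_exp)

    # Hypothetical item ratings (risk factor present = 1) from two raters.
    print(gwet_ac1([1, 1, 0, 1, 0, 1], [1, 1, 0, 0, 0, 1]))
    ```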

  9. Validity and reliability of CHOICE Health Experience Questionnaire: Thai version.

    PubMed

    Aiyasanon, Nipa; Premasathian, Nalinee; Nimmannit, Akarin; Jetanavanich, Pantip; Sritippayawan, Suchai

    2009-09-01

    Assess the reliability and validity of the Thai translation of the CHOICE Health Experience Questionnaire (CHEQ), an English-language questionnaire developed specifically for end-stage renal disease (ESRD) patients. The CHEQ comprises two parts: the nine general domains of the SF-36 (physical function, role-physical, bodily pain, mental health, role-emotional, social function, vitality, general health, and report transition) and the 16 dialysis-specific domains of the CHEQ (role-physical, mental health, general health, freedom, travel restriction, cognitive function, financial function, restriction of diet and fluids, recreation, work, body image, symptoms, sex, sleep, access, and quality of life). The authors translated the CHEQ questionnaire into Thai and confirmed the accuracy by back translation. The pilot study sample was 10 Thai ESRD patients. The CHEQ (Thai) was then applied to 110 Thai ESRD patients: 23 were on chronic peritoneal dialysis and 87 were on chronic intermittent hemodialysis. Statistical analysis included descriptive statistics, the Mann-Whitney U test, Student's t-test, and Cronbach's alpha. Construct validity was satisfactory, with a significant difference (p < 0.001) between the low and high groups. The reliability coefficient (Cronbach's alpha) for the total scale of the CHEQ (Thai) was 0.98. Cronbach's alphas ranged from 0.58 to 0.92 and were greater than 0.7 for all domains except social function and quality of life (alpha = 0.66 and 0.575, respectively). The CHEQ (Thai) is reliable and valid for the assessment of Thai ESRD patients receiving chronic dialysis. Its properties are similar to those reported for the original version.

  10. Validity and Reliability of the Turkish Version for DSM-5 Level 2 Anger Scale (Child Form for Children Aged 11-17 Years and Parent Form for Children Aged 6-17 Years).

    PubMed

    Yalin Sapmaz, Şermin; Özek Erkuran, Handan; Yalin, Nefize; Önen, Özlem; Öztekin, Siğnem; Kavurma, Canem; Köroğlu, Ertuğrul; Aydemir, Ömer

    2017-12-01

    This study aimed to assess the validity and reliability of the Turkish version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) Level 2 Anger Scale. The scale was prepared by translation and back translation of the DSM-5 Level 2 Anger Scale. Study groups consisted of a clinical sample of cases diagnosed with depressive disorder and treated in a child and adolescent psychiatry unit, and a community sample. The study was continued with 218 children and 160 parents. In the assessment process, the child and parent forms of the DSM-5 Level 2 Anger Scale, the Children's Depression Inventory, and the Strengths and Difficulties Questionnaire-Parent Form were used. In the reliability analyses, the Cronbach alpha internal consistency coefficients were found to be very high for both the child and parent forms. Item-total score correlation coefficients were high and very high for the child and parent forms, respectively, and statistically significant. As for construct validity, a single factor was retained for each form, consistent with the original form of the scale. As for concurrent validity, the child form of the scale showed significant correlation with the Children's Depression Inventory, while the parent form showed significant correlation with the Strengths and Difficulties Questionnaire-Parent Form. It was found that the Turkish version of the DSM-5 Level 2 Anger Scale can be utilized as a valid and reliable tool both in clinical practice and for research purposes.

  11. Do downscaled general circulation models reliably simulate historical climatic conditions?

    USGS Publications Warehouse

    Bock, Andrew R.; Hay, Lauren E.; McCabe, Gregory J.; Markstrom, Steven L.; Atkinson, R. Dwight

    2018-01-01

    The accuracy of statistically downscaled (SD) general circulation model (GCM) simulations of monthly surface climate for historical conditions (1950–2005) was assessed for the conterminous United States (CONUS). The SD monthly precipitation (PPT) and temperature (TAVE) from 95 GCMs from phases 3 and 5 of the Coupled Model Intercomparison Project (CMIP3 and CMIP5) were used as inputs to a monthly water balance model (MWBM). Distributions of MWBM input (PPT and TAVE) and output [runoff (RUN)] variables derived from gridded station data (GSD) and historical SD climate were compared using the Kolmogorov–Smirnov (KS) test. For all three variables considered, the KS test results showed that variables simulated using CMIP5 generally are more reliable than those derived from CMIP3, likely due to improvements in PPT simulations. At most locations across the CONUS, the largest differences between GSD and SD PPT and RUN occurred in the lowest part of the distributions (i.e., low-flow RUN and low-magnitude PPT). Results indicate that for the majority of the CONUS, there are downscaled GCMs that can reliably simulate historical climatic conditions. However, in some geographic locations, none of the SD GCMs replicated historical conditions for two of the three variables (PPT and RUN) based on the KS test, with a significance level of 0.05. In these locations, improved GCM simulations of PPT are needed to more reliably estimate components of the hydrologic cycle. Simple metrics and statistical tests, such as those described here, can provide an initial set of criteria to help simplify GCM selection.
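
    The distributional comparison described above is straightforward to reproduce with a two-sample KS test. In the sketch below, the two gamma-distributed samples are synthetic stand-ins for the gridded-station and downscaled monthly precipitation series (672 values corresponding to the 56-year monthly record):

    ```python
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(7)
    gsd_ppt = rng.gamma(shape=2.0, scale=40.0, size=672)   # synthetic "observed" monthly PPT (mm)
    sd_ppt = rng.gamma(shape=2.2, scale=38.0, size=672)    # synthetic downscaled-GCM PPT (mm)

    stat, p = ks_2samp(gsd_ppt, sd_ppt)
    print(f"KS statistic = {stat:.3f}, p = {p:.3f}")
    # At the 0.05 level used above, p < 0.05 would flag a GCM whose downscaled
    # distribution fails to replicate the observed one.
    ```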

  12. Latent structure and reliability analysis of the measure of body apperception: cross-validation for head and neck cancer patients.

    PubMed

    Jean-Pierre, Pascal; Fundakowski, Christopher; Perez, Enrique; Jean-Pierre, Shadae E; Jean-Pierre, Ashley R; Melillo, Angelica B; Libby, Rachel; Sargi, Zoukaa

    2013-02-01

    Cancer and its treatments are associated with psychological distress that can negatively impact self-perception, psychosocial functioning, and quality of life. Patients with head and neck cancers (HNC) are particularly susceptible to psychological distress. This study involved a cross-validation of the Measure of Body Apperception (MBA) for HNC patients. One hundred and twenty-two English-fluent HNC patients between 20 and 88 years of age completed the MBA on a Likert scale ranging from "1 = disagree" to "4 = agree." We assessed the latent structure and internal consistency reliability of the MBA using Principal Components Analysis (PCA) and Cronbach's coefficient alpha (α), respectively. We determined convergent and divergent validities of the MBA using correlations with the Hospital Anxiety and Depression Scale (HADS), observer disfigurement rating, and patients' clinical and demographic variables. The PCA revealed a coherent set of items that explained 38% of the variance. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.73 and Bartlett's test of sphericity was statistically significant (χ²(28) = 253.64; p < 0.001), confirming the suitability of the data for dimension reduction analysis. The MBA had good internal consistency reliability (α = 0.77) and demonstrated adequate convergent and divergent validities based on statistically significant moderate correlations with the HADS (p < 0.01) and observer rating of disfigurement (p < 0.026) and statistically nonsignificant correlations with patients' clinical and demographic variables: tumor location, age at diagnosis, and birth place (all ps > 0.05). The MBA is a valid and reliable screening measure of body apperception for HNC patients.
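
    Bartlett's test of sphericity, used above to confirm that the data were suitable for dimension reduction, can be computed directly from the correlation matrix. The sketch below uses a random, partly correlated data matrix as an illustrative stand-in; eight variables yield the 28 degrees of freedom reported above.

    ```python
    import numpy as np
    from scipy.stats import chi2

    def bartlett_sphericity(x):
        """Bartlett's test of sphericity on an (n, p) data matrix: tests
        whether the correlation matrix differs from the identity matrix."""
        x = np.asarray(x, float)
        n, p = x.shape
        corr = np.corrcoef(x, rowvar=False)
        stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(corr))
        df = p * (p - 1) // 2
        return stat, df, chi2.sf(stat, df)

    rng = np.random.default_rng(3)
    x = rng.normal(size=(122, 8))    # illustrative stand-in, not the MBA data
    x[:, 1] += 0.6 * x[:, 0]         # induce some correlation among items
    print(bartlett_sphericity(x))
    ```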

  13. Estimating mortality using data from civil registration: a cross-sectional study in India

    PubMed Central

    Rao, Chalapati; Lakshmi, PVM; Prinja, Shankar; Kumar, Rajesh

    2016-01-01

    Abstract Objective To analyse the design and operational status of India’s civil registration and vital statistics system and facilitate the system’s development into an accurate and reliable source of mortality data. Methods We assessed the national civil registration and vital statistics system’s legal framework, administrative structure and design through document review. We did a cross-sectional study for the year 2013 at national level and in Punjab state to assess the quality of the system’s mortality data through analyses of life tables and investigation of the completeness of death registration and the proportion of deaths assigned ill-defined causes. We interviewed registrars, medical officers and coders in Punjab state to assess their knowledge and practice. Findings Although we found the legal framework and system design to be appropriate, data collection was based on complex intersectoral collaborations at state and local level and the collected data were found to be of poor quality. The registration data were inadequate for a robust estimate of mortality at national level. A medically certified cause of death was only recorded for 965 992 (16.8%) of the 5 735 082 deaths registered. Conclusion The data recorded by India’s civil registration and vital statistics system in 2011 were incomplete. If improved, the system could be used to reliably estimate mortality. We recommend improving political support and intersectoral coordination, capacity building, computerization and state-level initiatives to ensure that every death is registered and that reliable causes of death are recorded – at least within an adequate sample of registration units within each state. PMID:26769992

  14. Identification of Nasal Bone Fractures on Conventional Radiography and Facial CT: Comparison of the Diagnostic Accuracy in Different Imaging Modalities and Analysis of Interobserver Reliability.

    PubMed

    Baek, Hye Jin; Kim, Dong Wook; Ryu, Ji Hwa; Lee, Yoo Jin

    2013-09-01

    There has been no study comparing the diagnostic accuracy of an experienced radiologist with that of a trainee in identifying nasal bone fractures. To compare the diagnostic accuracy of conventional radiography and computed tomography (CT) for the identification of nasal bone fractures and to evaluate the interobserver reliability between a staff radiologist and a trainee. A total of 108 patients who underwent conventional radiography and CT after acute nasal trauma were included in this retrospective study. Two readers, a staff radiologist and a second-year resident, independently assessed the results of the imaging studies. Of the 108 patients, the presence of a nasal bone fracture was confirmed in 88 (81.5%) patients. The number of non-depressed fractures was higher than the number of depressed fractures. In nine (10.2%) patients, nasal bone fractures were only identified on conventional radiography, including three depressed and six non-depressed fractures. CT was more accurate than conventional radiography for the identification of nasal bone fractures as determined by both readers (P < 0.05); all diagnostic indices of the experienced radiologist were similar to or higher than those of the trainee, and κ statistics showed moderate agreement between the two diagnostic tools for both readers. There was no statistical difference in interobserver reliability for either imaging modality in the identification of nasal bone fractures. For the identification of nasal bone fractures, CT was significantly superior to conventional radiography. Although the staff radiologist showed better values in the identification of nasal bone fractures and differentiation between depressed and non-depressed fractures than the trainee, there was no statistically significant difference between a radiologist and a trainee in the interpretation of conventional radiography and CT.

  15. A new statistical framework to assess structural alignment quality using information compression

    PubMed Central

    Collier, James H.; Allison, Lloyd; Lesk, Arthur M.; Garcia de la Banda, Maria; Konagurthu, Arun S.

    2014-01-01

    Motivation: Progress in protein biology depends on the reliability of results from a handful of computational techniques, structural alignments being one. Recent reviews have highlighted substantial inconsistencies and differences between alignment results generated by the ever-growing stock of structural alignment programs. The lack of consensus on how the quality of structural alignments must be assessed has been identified as the main cause for the observed differences. Current methods assess structural alignment quality by constructing a scoring function that attempts to balance conflicting criteria, mainly alignment coverage and fidelity of structures under superposition. This traditional approach to measuring alignment quality, the subject of considerable literature, has failed to solve the problem. Further development along the same lines is unlikely to rectify the current deficiencies in the field. Results: This paper proposes a new statistical framework to assess structural alignment quality and significance based on lossless information compression. This is a radical departure from the traditional approach of formulating scoring functions. It links the structural alignment problem to the general class of statistical inductive inference problems, solved using the information-theoretic criterion of minimum message length. Based on this, we developed an efficient and reliable measure of structural alignment quality, I-value. The performance of I-value is demonstrated in comparison with a number of popular scoring functions, on a large collection of competing alignments. Our analysis shows that I-value provides a rigorous and reliable quantification of structural alignment quality, addressing a major gap in the field. Availability: http://lcb.infotech.monash.edu.au/I-value Contact: arun.konagurthu@monash.edu Supplementary information: Online supplementary data are available at http://lcb.infotech.monash.edu.au/I-value/suppl.html PMID:25161241

  16. Three-column classification and Schatzker classification: a three- and two-dimensional computed tomography characterisation and analysis of tibial plateau fractures.

    PubMed

    Patange Subba Rao, Sheethal Prasad; Lewis, James; Haddad, Ziad; Paringe, Vishal; Mohanty, Khitish

    2014-10-01

    The aim of the study was to evaluate the inter-observer reliability and intra-observer reproducibility of the three-column classification and Schatzker classification systems using 2D and 3D CT models. Fifty-two consecutive patients with tibial plateau fractures were evaluated by five orthopaedic surgeons. All patients were classified with the Schatzker and three-column classification systems using x-rays and 2D and 3D CT images. Inter-observer reliability was evaluated in the first round, and intra-observer reliability was determined during a second round 2 weeks later. The average intra-observer reproducibility for the three-column classification was substantial to excellent in all subclassifications, compared with the Schatzker classification. The inter-observer kappa values increased from substantial to excellent in the three-column classification and to moderate in the Schatzker classification. The average values for the three-column classification across all categories are as follows: (I-III) κ2D = 0.718, 95% CI 0.554-0.864, p < 0.0001, and average κ3D = 0.874, 95% CI 0.754-0.890, p < 0.0001. For the Schatzker classification system, the average values across all six categories are as follows: (I-VI) κ2D = 0.536, 95% CI 0.365-0.685, p < 0.0001, and average κ3D = 0.552, 95% CI 0.405-0.700, p < 0.0001. These values are statistically significant. Statistically significant inter-observer values in both rounds were noted with the three-column classification, indicating excellent agreement. The intra-observer reproducibility for the three-column classification improved compared with the Schatzker classification. The three-column classification appears to be an effective way to characterise and classify tibial plateau fractures.

  17. How one might miss early warning signals of critical transitions in time series data: A systematic study of two major currency pairs

    PubMed Central

    Wen, Haoyu; Ciamarra, Massimo Pica; Cheong, Siew Ann

    2018-01-01

    There is growing interest in the use of critical slowing down and critical fluctuations as early warning signals for critical transitions in different complex systems. However, while some studies found them effective, others found the opposite. In this paper, we investigated why this might be so, by testing three commonly used indicators: lag-1 autocorrelation, variance, and low-frequency power spectrum at anticipating critical transitions in the very-high-frequency time series data of the Australian Dollar-Japanese Yen and Swiss Franc-Japanese Yen exchange rates. Besides testing rising trends in these indicators at a strict level of confidence using the Kendall-tau test, we also required statistically significant early warning signals to be concurrent in the three indicators, which must rise to appreciable values. We then found for our data set the optimum parameters for discovering critical transitions, and showed that the set of critical transitions found is generally insensitive to variations in the parameters. Suspecting that negative results in the literature are the results of low data frequencies, we created time series with time intervals over three orders of magnitude from the raw data, and tested them for early warning signals. Early warning signals can be reliably found only if the time interval of the data is shorter than the time scale of critical transitions in our complex system of interest. Finally, we compared the set of time windows with statistically significant early warning signals with the set of time windows followed by large movements, to conclude that the early warning signals indeed provide reliable information on impending critical transitions. This reliability becomes more compelling statistically the more events we test. PMID:29538373

  18. How one might miss early warning signals of critical transitions in time series data: A systematic study of two major currency pairs.

    PubMed

    Wen, Haoyu; Ciamarra, Massimo Pica; Cheong, Siew Ann

    2018-01-01

    There is growing interest in the use of critical slowing down and critical fluctuations as early warning signals for critical transitions in different complex systems. However, while some studies found them effective, others found the opposite. In this paper, we investigated why this might be so, by testing three commonly used indicators: lag-1 autocorrelation, variance, and low-frequency power spectrum at anticipating critical transitions in the very-high-frequency time series data of the Australian Dollar-Japanese Yen and Swiss Franc-Japanese Yen exchange rates. Besides testing rising trends in these indicators at a strict level of confidence using the Kendall-tau test, we also required statistically significant early warning signals to be concurrent in the three indicators, which must rise to appreciable values. We then found for our data set the optimum parameters for discovering critical transitions, and showed that the set of critical transitions found is generally insensitive to variations in the parameters. Suspecting that negative results in the literature are the results of low data frequencies, we created time series with time intervals over three orders of magnitude from the raw data, and tested them for early warning signals. Early warning signals can be reliably found only if the time interval of the data is shorter than the time scale of critical transitions in our complex system of interest. Finally, we compared the set of time windows with statistically significant early warning signals with the set of time windows followed by large movements, to conclude that the early warning signals indeed provide reliable information on impending critical transitions. This reliability becomes more compelling statistically the more events we test.
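
    The two time-domain indicators tested above can be computed over a sliding window, with the rising-trend requirement checked by a Kendall-tau test. A minimal sketch; the Gaussian return series is a synthetic stand-in for the very-high-frequency exchange-rate data.

    ```python
    import numpy as np
    from scipy.stats import kendalltau

    def ews_indicators(x, window):
        """Sliding-window lag-1 autocorrelation and variance of a series."""
        ac1, var = [], []
        for i in range(len(x) - window + 1):
            w = x[i:i + window]
            ac1.append(np.corrcoef(w[:-1], w[1:])[0, 1])
            var.append(np.var(w, ddof=1))
        return np.array(ac1), np.array(var)

    def rising_trend(indicator, alpha=0.001):
        """Kendall-tau test for a statistically significant rising trend."""
        tau, p = kendalltau(np.arange(len(indicator)), indicator)
        return tau > 0 and p < alpha

    rng = np.random.default_rng(11)
    returns = rng.normal(size=5000)               # synthetic stand-in for return data
    ac1, var = ews_indicators(returns, window=500)
    print(rising_trend(ac1), rising_trend(var))   # white noise: no warning expected
    ```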

  19. An Exercise in Exploring Big Data for Producing Reliable Statistical Information.

    PubMed

    Rey-Del-Castillo, Pilar; Cardeñosa, Jesús

    2016-06-01

    The availability of copious data about many human, social, and economic phenomena is considered an opportunity for the production of official statistics. National statistical organizations and other institutions are increasingly involved in new projects for developing what is sometimes seen as a possible change of paradigm in the way statistical figures are produced. Nevertheless, there are hardly any systems in production that use Big Data sources. Concerns about confidentiality, data ownership, representativeness, and other issues make it difficult to obtain results in the short term. Using Call Detail Records from Ivory Coast as an illustration, this article shows some of the issues that must be dealt with when producing statistical indicators from Big Data sources. A graphical method for evaluating one specific aspect of the quality of the computed figures is also proposed, demonstrating that the visual insight it provides improves on results obtained using traditional procedures.

  20. Development and validity of a scale to measure workplace culture of health.

    PubMed

    Kwon, Youngbum; Marzec, Mary L; Edington, Dee W

    2015-05-01

    To describe the development of and test the validity and reliability of the Workplace Culture of Health (COH) scale. Exploratory factor analysis and confirmatory factor analysis were performed on data from a health care organization (N = 627). To verify the factor structure, confirmatory factor analysis was performed on a second data set from a medical equipment manufacturer (N = 226). The COH scale included a structure of five orthogonal factors: senior leadership and policies, programs and rewards, quality assurance, supervisor support, and coworker support. With regard to construct validity (convergent and discriminant) and reliability, two different US companies showed the same factorial structure, satisfactory fit statistics, and suitable internal and external consistency. The COH scale represents a reliable and valid scale for assessing the workplace environment and culture supporting health.

  1. Optimal periodic proof test based on cost-effective and reliability criteria

    NASA Technical Reports Server (NTRS)

    Yang, J.-N.

    1976-01-01

    An exploratory study for the optimization of periodic proof tests for fatigue-critical structures is presented. The optimal proof load level and the optimal number of periodic proof tests are determined by minimizing the total expected (statistical average) cost, while the constraint on the allowable level of structural reliability is satisfied. The total expected cost consists of the expected cost of proof tests, the expected cost of structures destroyed by proof tests, and the expected cost of structural failure in service. It is demonstrated by numerical examples that significant cost saving and reliability improvement for fatigue-critical structures can be achieved by the application of the optimal periodic proof test. The present study is relevant to the establishment of optimal maintenance procedures for fatigue-critical structures.

  2. Fracture mechanics concepts in reliability analysis of monolithic ceramics

    NASA Technical Reports Server (NTRS)

    Manderscheid, Jane M.; Gyekenyesi, John P.

    1987-01-01

    Basic design concepts for high-performance, monolithic ceramic structural components are addressed. The design of brittle ceramics differs from that of ductile metals because of the inability of ceramic materials to redistribute high local stresses caused by inherent flaws. Random flaw size and orientation requires that a probabilistic analysis be performed in order to determine component reliability. The current trend in probabilistic analysis is to combine linear elastic fracture mechanics concepts with the two parameter Weibull distribution function to predict component reliability under multiaxial stress states. Nondestructive evaluation supports this analytical effort by supplying data during verification testing. It can also help to determine statistical parameters which describe the material strength variation, in particular the material threshold strength (the third Weibull parameter), which in the past was often taken as zero for simplicity.

  3. Reliability and validity of an Internet traumatic stress survey with a college student sample.

    PubMed

    Fortson, Beverly L; Scotti, Joseph R; Del Ben, Kevin S; Chen, Yi-Chuen

    2006-10-01

    The reliability and validity of Internet-based questionnaires were assessed in a sample of undergraduates (N = 411) by comparing data collected via the Internet with data collected in a more traditional format. A 2 x 2 x 2 repeated measures factorial design was used, forming four groups: Paper-Paper, Paper-Internet, Internet-Paper, and Internet-Internet. Scores on measures of trauma exposure, depression, and posttraumatic stress symptoms formed the dependent variables. Statistical analyses demonstrated that the psychometric properties of Internet-based questionnaires are similar to those established via formats that are more traditional. Questionnaire format and presentation order did not affect rates of psychological symptoms endorsed by participants. Researchers can feel comfortable that Internet data collection is a viable--and reliable--means for conducting trauma research.

  4. Transmission overhaul estimates for partial and full replacement at repair

    NASA Technical Reports Server (NTRS)

    Savage, M.; Lewicki, D. G.

    1991-01-01

    Timely transmission overhauls raise in-flight service reliability above the calculated design reliabilities of the individual aircraft transmission components. Although necessary for aircraft safety, transmission overhauls contribute significantly to aircraft expense. Predictions of a transmission's maintenance needs at the design stage should enable the development of more cost effective and reliable transmissions in the future. The frequency of overhaul is estimated, along with the number of transmissions or components needed to support the overhaul schedule. Two methods based on the two parameter Weibull statistical distribution for component life are used to estimate the time between transmission overhauls. These methods predict transmission lives for maintenance schedules which repair the transmission with a complete system replacement or repair only failed components of the transmission. An example illustrates the methods.
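
    A minimal sketch of the full-replacement case: treating the transmission as a series system of two-parameter Weibull components, the overhaul interval can be taken as the time at which system reliability falls to a target value. The component parameters here are illustrative, not the paper's.

    ```python
    import numpy as np
    from scipy.optimize import brentq

    # Illustrative two-parameter Weibull lives (shape beta, scale eta in hours)
    components = {"gear": (2.0, 8000.0), "bearing": (1.8, 5000.0), "shaft": (3.0, 12000.0)}

    def system_reliability(t):
        """Series-system survival: product of component Weibull survivals."""
        r = 1.0
        for beta, eta in components.values():
            r *= np.exp(-(t / eta) ** beta)
        return r

    target = 0.95  # required system reliability at overhaul
    interval = brentq(lambda t: system_reliability(t) - target, 1.0, 20000.0)
    print(f"overhaul interval for R = {target}: {interval:.0f} h")
    ```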

  5. Psychometric evaluation of the Persian version of the Templer's Death Anxiety Scale in cancer patients.

    PubMed

    Soleimani, Mohammad Ali; Yaghoobzadeh, Ameneh; Bahrami, Nasim; Sharif, Saeed Pahlevan; Sharif Nia, Hamid

    2016-10-01

    In this study, 398 Iranian cancer patients completed the 15-item Templer's Death Anxiety Scale (TDAS). Tests of internal consistency, principal components analysis, and confirmatory factor analysis were conducted to assess the internal consistency and factorial validity of the Persian TDAS. The construct reliability statistic and average variance extracted were also calculated to measure construct reliability, convergent validity, and discriminant validity. Principal components analysis indicated a 3-component solution, which was generally supported in the confirmatory analysis. However, acceptable cutoffs for construct reliability, convergent validity, and discriminant validity were not fulfilled for the three subscales that were derived from the principal component analysis. This study demonstrated both the advantages and potential limitations of using the TDAS with Persian-speaking cancer patients.

  6. Measuring the severity of topical 5-fluorouracil toxicity.

    PubMed

    Korgavkar, Kaveri; Firoz, Elnaz F; Xiong, Michael; Lew, Robert; Marcolivio, Kimberly; Burnside, Nancy; Dyer, Robert; Weinstock, Martin A

    2014-01-01

    Topical 5% 5-fluorouracil (5-FU) is known to cause toxicity, such as erythema, pain, and crusting/erosions. We sought to develop a scale to measure this toxicity and test the scale for reliability. A scale was developed involving four parameters: erythema severity, percentage of face involved in erythema, crusting/erosions severity, and percentage of face involved in crusting/erosions. Thirteen raters graded 99 sets of photographs from the Veterans Affairs Keratinocyte Carcinoma Chemoprevention (VAKCC) Trial using these parameters. Intraclass correlation overall for 13 raters was 0.82 (95% CI 0.77-0.86). There was no statistically significant trend in reliability by level of training in dermatology. This scale is a reliable method of evaluating the severity of toxicity from topical 5-fluorouracil and can be used by dermatologists and nondermatologists alike.
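
    The intraclass correlation reported above can be reproduced from first principles. The sketch below implements the two-way random-effects, single-rater form ICC(2,1) from a subject-by-rater score matrix; the simulated ratings (99 photograph sets, 13 raters) are synthetic stand-ins, not the VAKCC data.

    ```python
    import numpy as np

    def icc_2_1(x):
        """Two-way random, single-rater ICC(2,1) from an n_subjects x k_raters matrix."""
        x = np.asarray(x, float)
        n, k = x.shape
        mean_r = x.mean(axis=1)   # per-subject means
        mean_c = x.mean(axis=0)   # per-rater means
        grand = x.mean()
        msr = k * np.sum((mean_r - grand) ** 2) / (n - 1)   # between-subjects MS
        msc = n * np.sum((mean_c - grand) ** 2) / (k - 1)   # between-raters MS
        sse = np.sum((x - mean_r[:, None] - mean_c[None, :] + grand) ** 2)
        mse = sse / ((n - 1) * (k - 1))                     # residual MS
        return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

    rng = np.random.default_rng(0)
    truth = rng.normal(2.0, 1.0, size=(99, 1))               # 99 photo sets
    ratings = truth + rng.normal(0.0, 0.5, size=(99, 13))    # 13 noisy raters
    print(f"ICC(2,1) = {icc_2_1(ratings):.2f}")
    ```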

  7. The reliability-quality relationship for quality systems and quality risk management.

    PubMed

    Claycamp, H Gregg; Rahaman, Faiad; Urban, Jason M

    2012-01-01

    Engineering reliability typically refers to the probability that a system, or any of its components, will perform a required function for a stated period of time and under specified operating conditions. As such, reliability is inextricably linked with time-dependent quality concepts, such as maintaining a state of control and predicting the chances of losses from failures for quality risk management. Two popular current good manufacturing practice (cGMP) and quality risk management tools, failure mode and effects analysis (FMEA) and root cause analysis (RCA), are examples of engineering reliability evaluations that link reliability with quality and risk. Current concepts in pharmaceutical quality and quality management systems call for more predictive systems for maintaining quality; yet, the current pharmaceutical manufacturing literature and guidelines are curiously silent on engineering quality. This commentary discusses the meaning of engineering reliability while linking the concept to quality systems and quality risk management, and also discusses the difference between engineering reliability and statistical (assay) reliability. The assurance of quality in a pharmaceutical product is no longer measured only "after the fact" of manufacturing. Rather, concepts of quality systems and quality risk management call for designing quality assurance into all stages of the pharmaceutical product life cycle. Interestingly, most assays for quality are essentially static and inform product quality over the life cycle only by being repeated over time. Engineering process reliability is the fundamental concept that is meant to anticipate quality failures over the life cycle of the product. Reliability is a well-developed theory and practice for other types of manufactured products and manufacturing processes, and is thus well known to be an appropriate index of manufactured product quality.

  8. Reliability of physical examination for diagnosis of myofascial trigger points: a systematic review of the literature.

    PubMed

    Lucas, Nicholas; Macaskill, Petra; Irwig, Les; Moran, Robert; Bogduk, Nikolai

    2009-01-01

    Trigger points are promoted as an important cause of musculoskeletal pain. There is no accepted reference standard for the diagnosis of trigger points, and data on the reliability of physical examination for trigger points are conflicting. To systematically review the literature on the reliability of physical examination for the diagnosis of trigger points. MEDLINE, EMBASE, and other sources were searched for articles reporting the reliability of physical examination for trigger points. Included studies were evaluated for their quality and applicability, and reliability estimates were extracted and reported. Nine studies were eligible for inclusion. None satisfied all quality and applicability criteria. No study specifically reported reliability for the identification of the location of active trigger points in the muscles of symptomatic participants. Reliability estimates varied widely for each diagnostic sign, for each muscle, and across each study. Reliability estimates were generally higher for subjective signs such as tenderness (kappa range, 0.22 to 1.0) and pain reproduction (kappa range, 0.57 to 1.00), and lower for objective signs such as the taut band (kappa range, -0.08 to 0.75) and local twitch response (kappa range, -0.05 to 0.57). No study to date has reported the reliability of trigger point diagnosis according to the currently proposed criteria. On the basis of the limited number of studies available, and significant problems with their design, reporting, statistical integrity, and clinical applicability, physical examination cannot currently be recommended as a reliable test for the diagnosis of trigger points. The reliability of trigger point diagnosis needs to be further investigated with studies of high quality that use current diagnostic criteria in clinically relevant patients.
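
    The kappa statistics quoted above measure chance-corrected agreement between examiners. A minimal sketch for the two-rater, binary-finding case, using hypothetical taut-band findings for 20 muscles:

    ```python
    import numpy as np

    def cohens_kappa(rater_a, rater_b):
        """Chance-corrected agreement between two examiners' binary findings."""
        a = np.asarray(rater_a, bool)
        b = np.asarray(rater_b, bool)
        po = np.mean(a == b)                        # observed agreement
        p_yes = a.mean() * b.mean()                 # chance both say "present"
        p_no = (1 - a.mean()) * (1 - b.mean())      # chance both say "absent"
        pe = p_yes + p_no
        return (po - pe) / (1 - pe)

    # Hypothetical findings (1 = taut band present) by two examiners
    a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
    b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
    print(f"kappa = {cohens_kappa(a, b):.2f}")
    ```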

  9. Estimating Classification Consistency and Accuracy for Cognitive Diagnostic Assessment

    ERIC Educational Resources Information Center

    Cui, Ying; Gierl, Mark J.; Chang, Hua-Hua

    2012-01-01

    This article introduces procedures for the computation and asymptotic statistical inference for classification consistency and accuracy indices specifically designed for cognitive diagnostic assessments. The new classification indices can be used as important indicators of the reliability and validity of classification results produced by…

  10. Increasing Army Supply Chain Performance: Using an Integrated End to End Metrics System

    DTIC Science & Technology

    2017-01-01

    Current metrics listed include scheduled deliveries, delinquent contracts, PQDR/SDRs, forecasting accuracy, reliability, demand management, asset management strategies, and pipeline ... are identified and characterized by statistical analysis. The study proposed a framework and tool for inventory management based on factors such as

  11. Sensing system development for HOV/HOT (high occupancy vehicle) lane monitoring.

    DOT National Transportation Integrated Search

    2011-02-01

    With continued interest in the efficient use of roadways the ability to monitor the use of HOV/HOT lanes is essential for management, planning and operation. A system to reliably monitor these lanes on a continuous basis and provide usage statistics ...

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yoo, Jun Soo

    The bubble departure diameter and bubble release frequency were obtained through analysis of TAMU subcooled flow boiling experimental data. Numerous images of bubbles at departure were analyzed for each experimental condition to obtain reliable statistics of the measured bubble parameters. The results are provided in this report with a brief discussion.

  13. Methods to assess pecan scab

    USDA-ARS?s Scientific Manuscript database

    Pecan scab (Fusicladium effusum [G. Winter]) is the most important disease of pecan in the U.S. Measuring the severity of scab accurately and reliably and providing data amenable to analysis using parametric statistics is important where treatments are being compared to minimize the risk of Type II ...

  14. A new statistical model for subgrid dispersion in large eddy simulations of particle-laden flows

    NASA Astrophysics Data System (ADS)

    Muela, Jordi; Lehmkuhl, Oriol; Pérez-Segarra, Carles David; Oliva, Asensi

    2016-09-01

    Dispersed multiphase turbulent flows are present in many industrial and commercial applications, such as internal combustion engines, turbofans, dispersion of contaminants, and steam turbines. There is therefore a clear interest in the development of models and numerical tools capable of performing detailed and reliable simulations of these kinds of flows. Large Eddy Simulation offers good accuracy and reliable results together with reasonable computational requirements, making it an attractive framework for developing numerical tools for particle-laden turbulent flows. Nonetheless, in dispersed multiphase flows additional difficulties arise in LES, since the effect of the unresolved scales of the continuous phase on the dispersed phase is lost in the filtering procedure. To solve this issue, a model able to reconstruct the subgrid velocity seen by the particles is required. In this work a new model for reconstructing the subgrid-scale effects on the dispersed phase is presented and assessed. The methodology is based on the reconstruction of statistics via probability density functions (PDFs).
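
    A common baseline for the problem described above is to sample the unresolved fluctuation seen by a particle from an assumed isotropic Gaussian subgrid PDF parameterized by the subgrid kinetic energy. The paper's PDF reconstruction is more elaborate, so treat this only as the simplest instance of the idea, with made-up input values.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)

    def particle_seen_velocity(u_filtered, k_sgs):
        """Velocity 'seen' by a particle: resolved LES velocity plus a fluctuation
        drawn from an assumed isotropic Gaussian subgrid PDF."""
        sigma = np.sqrt(2.0 * k_sgs / 3.0)   # rms of each fluctuation component
        return u_filtered + rng.normal(0.0, sigma, size=3)

    u_les = np.array([1.2, 0.1, -0.3])   # filtered velocity at the particle (m/s)
    k_sgs = 0.05                          # subgrid kinetic energy (m^2/s^2)
    print(particle_seen_velocity(u_les, k_sgs))
    ```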

  15. The Dysexecutive Questionnaire advanced: item and test score characteristics, 4-factor solution, and severity classification.

    PubMed

    Bodenburg, Sebastian; Dopslaff, Nina

    2008-01-01

    The Dysexecutive Questionnaire (DEX; Behavioral Assessment of the Dysexecutive Syndrome, 1996) is a standardized instrument for measuring possible behavioral changes resulting from the dysexecutive syndrome. Although initially intended only as a qualitative instrument, the DEX has increasingly been used to address quantitative problems. Until now there have been no more fundamental statistical analyses of the questionnaire's testing quality. The present study is based on an unselected sample of 191 patients with acquired brain injury and reports data on the quality of the items, the reliability, and the factorial structure of the DEX. Item 3 displayed too great an item difficulty, whereas item 11 was not sufficiently discriminating. The DEX's reliability in self-rating is r = 0.85. In addition to presenting the statistical values of the tests, a clinical severity classification of the scores of the 4 factors found and of the questionnaire as a whole is derived on the basis of quartile standards.

  16. Development and applications of the SWAN rating scale for assessment of attention deficit hyperactivity disorder: a literature review.

    PubMed

    Brites, C; Salgado-Azoni, C A; Ferreira, T L; Lima, R F; Ciasca, S M

    2015-11-01

    This study reviewed the use of the Strengths and Weaknesses of Attention-Deficit/Hyperactivity-symptoms and Normal-behaviors (SWAN) rating scale in diagnostic and evolutive approaches to attention deficit hyperactivity disorder (ADHD) and in correlational studies of the disorder. A review of articles published in indexed journals from electronic databases was conducted and 61 articles on the SWAN scale were analyzed. From these, 27 were selected to a) examine use of SWAN in research on attention disorders and b) verify evidence of its usefulness in the areas of genetics, neuropsychology, diagnostics, psychiatric comorbidities, neuroimaging, pharmacotherapy, and to examine its statistical reliability and validity in studies of diverse populations. This review of articles indicated a growing use of the SWAN scale for diagnostic purposes, for therapy, and in research on areas other than ADHD, especially when compared with other reliable scales. Use of the scale in ADHD diagnosis requires further statistical testing to define its psychometric properties.

  17. Mindful attention and awareness: relationships with psychopathology and emotion regulation.

    PubMed

    Gregório, Sónia; Pinto-Gouveia, José

    2013-01-01

    The growing interest in mindfulness within the scientific community has given rise to several self-report measures of this psychological construct. The Mindful Attention and Awareness Scale (MAAS) is a self-report measure of mindfulness at the trait level. This paper explores the psychometric characteristics of the MAAS and validates it for the Portuguese population. The first two studies replicate some of the original authors' statistical procedures, in particular confirmatory factor analyses, in two different samples from the Portuguese general community population. Results from both analyses confirmed the scale's single-factor structure and indicated very good reliability. Moreover, cross-validation statistics showed that this single-factor structure holds for different respondents from the general community population. In the third study the Portuguese version of the MAAS was found to have good convergent and discriminant validity. Overall the findings support the psychometric validity of the Portuguese version of the MAAS and suggest it is a reliable self-report measure of trait mindfulness, a central construct in clinical psychology research and intervention.

  18. [THE COMPARATIVE ANALYSIS OF INFORMATION VALUE OF MAIN CLINICAL CRITERIA USED TO DIAGNOSE OF BACTERIAL VAGINOSIS].

    PubMed

    Tsvetkova, A V; Murtazina, Z A; Markusheva, T V; Mavzutov, A R

    2015-05-01

    Bacterial vaginosis is one of the most frequent reasons women visit a gynecologist. Its diagnosis is predominantly based on the Amsel criteria (1983), whose objectivity is now increasingly disputed. Discharge from the mucous membranes of the posterolateral vaginal fornix was analyzed in 640 women with a clinical diagnosis of bacterial vaginosis. Light microscopy of discharge smears confirmed the diagnosis of bacterial vaginosis in the laboratory in 100 (15.63%) women. Complaints of burning and unpleasant odor, and the Amsel criterion of detecting "clue cells" against a background of pH > 4.5, proved statistically significant for bacterial vaginosis. According to the study data, the presence of discharge alone is not a statistically reliable basis for differentiating bacterial vaginosis from other inflammatory pathological conditions of the female reproductive tract. At the same time, detection of "clue cells" in the smear correlated reliably with bacterial vaginosis.

  19. Capture approximations beyond a statistical quantum mechanical method for atom-diatom reactions

    NASA Astrophysics Data System (ADS)

    Barrios, Lizandra; Rubayo-Soneira, Jesús; González-Lezana, Tomás

    2016-03-01

    Statistical techniques constitute useful approaches to investigate atom-diatom reactions mediated by insertion dynamics which involves complex-forming mechanisms. Different capture schemes based on energy considerations regarding the specific diatom rovibrational states are suggested to evaluate the corresponding probabilities of formation of such collision species between reactants and products in an attempt to test reliable alternatives for computationally demanding processes. These approximations are tested in combination with a statistical quantum mechanical method for the S + H2(v = 0 ,j = 1) → SH + H and Si + O2(v = 0 ,j = 1) → SiO + O reactions, where this dynamical mechanism plays a significant role, in order to probe their validity.

  20. Reliability and validity of the workplace harassment questionnaire for Korean finance and service workers.

    PubMed

    Lee, Myeongjun; Kim, Hyunjung; Shin, Donghee; Lee, Sangyun

    2016-01-01

    Harassment refers to systematic and repeated unethical acts. Research on workplace harassment has been conducted widely, and the NAQ-R has been the most commonly used instrument. That tool, however, has limitations in revealing differences in sub-factors across cultures and in reflecting the unique characteristics of Korean society. The workplace harassment questionnaire for Korean finance and service workers was therefore developed to assess the level of personal harassment at work. This study aims to develop a tool to assess the level of personal harassment at work and to test its validity and reliability, while examining specific characteristics of workplace harassment against finance and service workers in Korea. The framework of the survey was established based on a literature review and focus-group interviews with Korean finance and service workers. To verify reliability, Cronbach's alpha coefficient was calculated; to verify validity, the items and factors of the tool were analyzed. A correlation matrix analysis was examined to verify the tool's convergent and discriminant validity. Structural validity was verified by checking statistical significance in relation to the BDI-K. Cronbach's alpha coefficient of the survey was 0.93, indicating a high level of reliability. To verify the appropriateness of the survey tool, its construct validity was examined through factor analysis, which extracted 3 factors explaining 56.5% of the total variance. The loading values and communalities of the 20 items ranged from 0.48 to 0.85 and from 0.46 to 0.71, respectively. Convergent and discriminant validity were analyzed, and the rate of item discriminant validity was 100%. Finally, for concurrent validity, we examined the relationship between the WHI-KFSW and psychosocial stress through the correlation with the BDI-K; chi-square tests and multiple logistic analysis indicated that the correlation with the BDI-K was statistically significant. Workplace harassment in actual workplaces was investigated through interviews, and the statistical analysis helped systematize the types of harassment observed. The resulting questionnaire comprises 20 items in 3 categories.
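
    Cronbach's alpha, the reliability coefficient reported above (0.93 for the 20-item survey), is straightforward to compute from an item-score matrix; the data below are simulated stand-ins, not the survey responses.

    ```python
    import numpy as np

    def cronbach_alpha(items):
        """Cronbach's alpha from an n_respondents x k_items score matrix."""
        x = np.asarray(items, float)
        k = x.shape[1]
        item_vars = x.var(axis=0, ddof=1).sum()   # sum of item variances
        total_var = x.sum(axis=1).var(ddof=1)     # variance of total scores
        return k / (k - 1) * (1.0 - item_vars / total_var)

    rng = np.random.default_rng(1)
    trait = rng.normal(size=(300, 1))                        # latent harassment level
    scores = trait + rng.normal(scale=0.7, size=(300, 20))   # 20 correlated items
    print(f"alpha = {cronbach_alpha(scores):.2f}")
    ```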

  1. Statistical inference of protein structural alignments using information and compression.

    PubMed

    Collier, James H; Allison, Lloyd; Lesk, Arthur M; Stuckey, Peter J; Garcia de la Banda, Maria; Konagurthu, Arun S

    2017-04-01

    Structural molecular biology depends crucially on computational techniques that compare protein three-dimensional structures and generate structural alignments (the assignment of one-to-one correspondences between subsets of amino acids based on atomic coordinates). Despite its importance, the structural alignment problem has not been formulated, much less solved, in a consistent and reliable way. To overcome these difficulties, we present here a statistical framework for the precise inference of structural alignments, built on the Bayesian and information-theoretic principle of Minimum Message Length (MML). The quality of any alignment is measured by its explanatory power, that is, the amount of lossless compression achieved to explain the protein coordinates using that alignment. We have implemented this approach in MMLigner, the first program able to infer statistically significant structural alignments. We also demonstrate the reliability of MMLigner's alignment results when compared with the state of the art. Importantly, MMLigner can also discover different structural alignments of comparable quality, a challenging problem for oligomers and protein complexes. Source code, binaries, and an interactive web version are available at http://lcb.infotech.monash.edu.au/mmligner .

  2. Statistical evaluation for stability studies under stress storage conditions.

    PubMed

    Gil-Alegre, M E; Bernabeu, J A; Camacho, M A; Torres-Suarez, A I

    2001-11-01

    During the pharmaceutical development of a new drug, it is necessary to select as soon as possible the formulation with the best stability characteristics. The current International Conference on Harmonisation (ICH) regulations regarding stability testing requirements for a Registration Application provide the stress testing conditions with the aim of assessing the effect of severe conditions on the drug product. In practice, the well-known Arrhenius theory is still used to make a rapid stability prediction, to estimate a drug product shelf life during early stages of its pharmaceutical development. In this work, both the planning of a stress stability study to obtain a correct stability prediction from a temperature extrapolation and the suitable data treatment to discern the reliability of the stability results are discussed. The study was focused on the early formulation step of a very stable drug, Mitonafide (antineoplastic agent), formulated in a parenteral solution and in tablets. It was observed, for the solid system, that the results extrapolated using Arrhenius theory may be statistically good, yet far from the real situation, if the stability study is not designed correctly. The statistical data treatment and the stress-stability test proposed in this work are suitable for making a reliable stability prediction of different formulations with the same drug within its pharmaceutical development.
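
    The Arrhenius extrapolation discussed above can be sketched in a few lines: fit ln k against 1/T from the stress conditions, extrapolate the rate constant to 25 °C, and convert it to a shelf-life estimate. The rate constants below are hypothetical, and first-order degradation kinetics are assumed.

    ```python
    import numpy as np

    R = 8.314  # gas constant, J/(mol*K)

    # Hypothetical first-order degradation rate constants from stress conditions
    T = np.array([313.0, 323.0, 333.0])      # storage temperatures (K)
    k = np.array([2.1e-4, 6.0e-4, 1.6e-3])   # rate constants (1/day)

    # Fit ln k = ln A - Ea/(R*T): linear regression of ln k on 1/T
    slope, intercept = np.polyfit(1.0 / T, np.log(k), 1)
    Ea = -slope * R                           # activation energy
    k25 = np.exp(intercept + slope / 298.15)  # rate extrapolated to 25 C

    # Shelf life t90 (time to 10% potency loss) under first-order kinetics
    t90 = np.log(100.0 / 90.0) / k25
    print(f"Ea = {Ea/1000:.1f} kJ/mol, k(25C) = {k25:.2e} 1/day, t90 = {t90:.0f} days")
    ```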

  3. Detection of proximal caries using quantitative light-induced fluorescence-digital and laser fluorescence: a comparative study.

    PubMed

    Yoon, Hyung-In; Yoo, Min-Jeong; Park, Eun-Jin

    2017-12-01

    The purpose of this study was to evaluate the in vitro validity of quantitative light-induced fluorescence-digital (QLF-D) and laser fluorescence (DIAGNOdent) for assessing proximal caries in extracted premolars, using digital radiography as the reference method. A total of 102 extracted premolars with similar lengths and shapes were used. A single operator conducted all the examinations using three different detection methods (bitewing radiography, QLF-D, and DIAGNOdent). The bitewing x-ray scale, QLF-D fluorescence loss (ΔF), and DIAGNOdent peak readings were compared and statistically analyzed. Each method showed excellent reliability. The correlation coefficients between bitewing radiography and QLF-D and DIAGNOdent were -0.644 and 0.448, respectively, while the value between QLF-D and DIAGNOdent was -0.382. The kappa statistics for bitewing radiography and QLF-D showed a higher diagnostic consensus than those for bitewing radiography and DIAGNOdent. QLF-D was moderately to highly accurate (AUC = 0.753 to 0.908), while DIAGNOdent was moderately to less accurate (AUC = 0.622 to 0.784). All detection methods showed statistically significant correlations, the strongest being between bitewing radiography and QLF-D. QLF-D was found to be a valid and reliable alternative to digital bitewing radiography for the in vitro detection of proximal caries.

  4. Resting-state fMRI correlations: From link-wise unreliability to whole brain stability.

    PubMed

    Pannunzi, Mario; Hindriks, Rikkert; Bettinardi, Ruggero G; Wenger, Elisabeth; Lisofsky, Nina; Martensson, Johan; Butler, Oisin; Filevich, Elisa; Becker, Maxi; Lochstet, Martyna; Kühn, Simone; Deco, Gustavo

    2017-08-15

    The functional architecture of spontaneous BOLD fluctuations has been characterized in detail by numerous studies, demonstrating its potential relevance as a biomarker. However, the systematic investigation of its consistency is still in its infancy. Here, we analyze within- and between-subject variability and test-retest reliability of resting-state functional connectivity (FC) in a unique data set comprising multiple fMRI scans (42) from 5 subjects, and 50 single scans from 50 subjects. We adopt a statistical framework that enables us to identify different sources of variability in FC. We show that the low reliability of single links can be significantly improved by using multiple scans per subject. Moreover, in contrast to earlier studies, we show that spatial heterogeneity in FC reliability is not significant. Finally, we demonstrate that despite the low reliability of individual links, the information carried by the whole-brain FC matrix is robust and can be used as a functional fingerprint to identify individual subjects from the population.
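
    The whole-brain "functional fingerprint" result can be illustrated with a toy identification experiment: correlate each subject's upper-triangular FC vector from one scan with all subjects' vectors from a second scan and pick the best match. The data below are simulated, with a stable per-subject fingerprint plus scan noise.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    n_subj, n_rois = 50, 90

    # Simulated whole-brain FC: a stable subject "fingerprint" plus scan noise
    iu = np.triu_indices(n_rois, k=1)            # upper-triangular FC entries
    fingerprints = rng.normal(size=(n_subj, iu[0].size))
    scan1 = fingerprints + 0.8 * rng.normal(size=fingerprints.shape)
    scan2 = fingerprints + 0.8 * rng.normal(size=fingerprints.shape)

    # Identify each subject: best-correlated scan-2 FC vector for each scan-1 vector
    sim = np.corrcoef(scan1, scan2)[:n_subj, n_subj:]   # cross-correlation block
    hits = (sim.argmax(axis=1) == np.arange(n_subj)).mean()
    print(f"identification accuracy: {hits:.0%}")
    ```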

  5. Competing risk models in reliability systems, an exponential distribution model with Bayesian analysis approach

    NASA Astrophysics Data System (ADS)

    Iskandar, I.

    2018-03-01

    The exponential distribution is the most widely used distribution in reliability analysis. It is well suited to representing the lengths of life in many cases and has a simple statistical form. Its characteristic feature is a constant hazard rate. The exponential distribution is the special case of the Weibull distribution with shape parameter equal to one. In this paper our effort is to introduce the basic notions that constitute an exponential competing risks model in reliability analysis using a Bayesian analysis approach, and to present the corresponding analytic methods. The cases are limited to models with independent causes of failure. A non-informative prior distribution is used in our analysis. The model is described through its likelihood function, followed by the posterior function and point and interval estimates of the hazard function and reliability. The net probability of failure if only one specific risk is present, the crude probability of failure due to a specific risk in the presence of other causes, and partial crude probabilities are also included.
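
    Under the stated assumptions (independent exponential causes and a non-informative prior proportional to 1/λ), the posterior for each cause-specific rate is a Gamma distribution with shape equal to the number of failures from that cause and rate equal to the total time on test. A minimal simulation-based sketch, with made-up rates:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    lam_true = {"wear": 1 / 500.0, "overload": 1 / 900.0}   # competing exponential risks

    # Simulate: each unit fails at the minimum of its independent cause times
    times = {c: rng.exponential(1 / l, size=200) for c, l in lam_true.items()}
    t_fail = np.minimum(times["wear"], times["overload"])
    cause = np.where(times["wear"] < times["overload"], "wear", "overload")

    T = t_fail.sum()   # total time on test
    for c in lam_true:
        d = int((cause == c).sum())              # failures from cause c
        post = stats.gamma(a=d, scale=1.0 / T)   # posterior with prior ~ 1/lambda
        lo, hi = post.ppf([0.025, 0.975])
        print(f"{c}: posterior mean {post.mean():.5f} (95% CI {lo:.5f}-{hi:.5f}), "
              f"net R(500 h) = {np.exp(-post.mean() * 500):.3f}")
    ```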

  6. Validity and reliability of wii fit balance board for the assessment of balance of healthy young adults and the elderly.

    PubMed

    Chang, Wen-Dien; Chang, Wan-Yi; Lee, Chia-Lun; Feng, Chi-Yen

    2013-10-01

    [Purpose] Balance is an integral part of human ability. The smart balance master system (SBM) is a balance test instrument with good reliability and validity, but it is expensive. Therefore, we modified a Wii Fit balance board, which is a convenient balance assessment tool, and analyzed its reliability and validity. [Subjects and Methods] We recruited 20 healthy young adults and 20 elderly people, and administered 3 balance tests. The correlation coefficient and intraclass correlation of both instruments were analyzed. [Results] There were no statistically significant differences in the 3 tests between the Wii Fit balance board and the SBM. The Wii Fit balance board had a good intraclass correlation (0.86-0.99) for the elderly people and positive correlations (r = 0.58-0.86) with the SBM. [Conclusions] The Wii Fit balance board is a balance assessment tool with good reliability and high validity for elderly people, and we recommend it as an alternative tool for assessing balance ability.

  7. Improving the Reliability of Technological Subsystems Equipment for Steam Turbine Unit in Operation

    NASA Astrophysics Data System (ADS)

    Brodov, Yu. M.; Murmansky, B. E.; Aronson, R. T.

    2017-11-01

    An integrated approach to improving the reliability of steam turbine units (STU) is presented, together with examples of its implementation for various STU technological subsystems. Based on statistical analysis of damage to individual turbine parts and components, on the development and application of modern repair methods and technologies, and on operational monitoring techniques, the critical components and elements of the equipment are identified and priorities are proposed for improving the reliability of STU equipment in operation. Results are presented from an analysis of malfunctions of various STU technological subsystem equipment operating as part of power units and at cross-linked thermal power plants and resulting in turbine unit shutdown (failure). Proposals are formulated and justified for adjusting the maintenance and repair of turbine components and parts, condenser unit equipment, the regeneration subsystem, and the oil supply system, which increase operational reliability, reduce the cost of STU maintenance and repair, and optimize the timing and scope of repairs.

  8. Enhanced Component Performance Study: Air-Operated Valves 1998-2014

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schroeder, John Alton

    2015-11-01

    This report presents a performance evaluation of air-operated valves (AOVs) at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience failure reports from fiscal year 1998 through 2014 for the component reliability as reported in the Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES). The AOV failure modes considered are failure-to-open/close, failure to operate or control, and spurious operation. The component reliability estimates and the reliability data are trended for the most recent 10-year period, while yearly estimates for reliability are provided for the entire active period. One statistically significant trend was observed in the AOV data: the frequency of demands per reactor year for valves recording the fail-to-open or fail-to-close failure modes, for high-demand valves (those with greater than twenty demands per year), was found to be decreasing. The decrease was about three percent over the ten-year period trended.
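
    A trend like the one reported (demand frequency decreasing a few percent over the trended decade) can be checked with a log-linear regression of the yearly demand rate on calendar year; the counts and reactor-year exposure below are hypothetical, not ICES data.

    ```python
    import numpy as np
    from scipy import stats

    # Hypothetical yearly demand counts and exposure for high-demand AOVs
    years = np.arange(2005, 2015)
    demands = np.array([4950, 4992, 4913, 4905, 4861, 4890, 4820, 4799, 4781, 4755])
    reactor_years = np.full(years.size, 98.0)

    rate = demands / reactor_years
    fit = stats.linregress(years, np.log(rate))   # log-linear (exponential) trend
    change = np.expm1(fit.slope * (years[-1] - years[0]))
    print(f"change over the period: {change:.1%}, p = {fit.pvalue:.4f}")
    ```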

  9. [Development and reliability evaluation of an instrument to measure health-related quality of life in independent elderly].

    PubMed

    Lima, Maria José Barbosa de; Portela, Margareth Crisóstomo

    2010-08-01

    This study presents an instrument, the health-related quality of life (HRQOL) profile for independent elderly, to measure the health-related quality of life of the functionally independent elderly assisted in the outpatient setting, based on the adaptation of four validated scales: Short-Form Health Survey (SF-36), Duke-UNC Health Profile (DUHP), Sickness Impact Profile (SIP), and Nottingham Health Profile (NHP). The study also evaluates the instrument's reliability based on its use by two different observers with a 15-day interval. The instrument includes five dimensions (health perception, symptoms, physical function, psychological function, and social function) and 45 items. Reliability evaluation of the QUASI instrument was based on interviews with 142 elderly outpatients in the city of Rio de Janeiro, Brazil. Prevalence-adjusted kappa statistic was used to assess all 45 items. Correlation was also calculated between overall scores and scores on individual dimensions. In the reliability evaluation, 39 of the 45 items showed prevalence-adjusted kappa greater than 0.60.

  10. The Outcome and Assessment Information Set (OASIS): A Review of Validity and Reliability

    PubMed Central

    O’CONNOR, MELISSA; DAVITT, JOAN K.

    2015-01-01

    The Outcome and Assessment Information Set (OASIS) is the patient-specific, standardized assessment used in Medicare home health care to plan care, determine reimbursement, and measure quality. Since its inception in 1999, there has been debate over the reliability and validity of the OASIS as a research tool and outcome measure. A systematic literature review of English-language articles identified 12 studies published in the last 10 years examining the validity and reliability of the OASIS. Empirical findings indicate the validity and reliability of the OASIS range from low to moderate but vary depending on the item studied. Limitations in the existing research include: nonrepresentative samples; inconsistencies in methods used, items tested, measurement, and statistical procedures; and the changes to the OASIS itself over time. The inconsistencies suggest that these results are tentative at best; additional research is needed to confirm the value of the OASIS for measuring patient outcomes, research, and quality improvement. PMID:23216513

  11. Validity and reliability of the Greek version of the xerostomia questionnaire in head and neck cancer patients.

    PubMed

    Memtsa, Pinelopi Theopisti; Tolia, Maria; Tzitzikas, Ioannis; Bizakis, Ioannis; Pistevou-Gombaki, Kyriaki; Charalambidou, Martha; Iliopoulou, Chrysoula; Kyrgias, George

    2017-03-01

    Xerostomia after radiation therapy for head and neck (H&N) cancer has serious effects on patients' quality of life. The purpose of this study was to validate the Greek version of the self-reported eight-item xerostomia questionnaire (XQ) in patients treated with radiotherapy for H&N cancer. The XQ was translated into Greek and administered to 100 patients. An exploratory factor analysis was performed. Reliability measures were calculated. Several types of validity were evaluated. The observer-rated scoring system was also used. The mean XQ value was 41.92 (SD 22.71). Factor analysis revealed the unidimensional nature of the questionnaire. High reliability measures (ICC, Cronbach's α, Pearson coefficients) were obtained. Patients differed statistically significantly in terms of XQ score, depending on the RTOG/EORTC classification. The Greek version of XQ is valid and reliable. Its score is well related to observer's findings and it can be used to evaluate the impact of radiation therapy on the subjective feeling of xerostomia.

  12. Reliability of high-power QCW arrays

    NASA Astrophysics Data System (ADS)

    Feeler, Ryan; Junghans, Jeremy; Remley, Jennifer; Schnurbusch, Don; Stephens, Ed

    2010-02-01

    Northrop Grumman Cutting Edge Optronics has developed a family of arrays for high-power QCW operation. These arrays are built using CTE-matched heat sinks and hard solder in order to maximize the reliability of the devices. A summary of a recent life test is presented in order to quantify the reliability of QCW arrays and associated laser gain modules. A statistical analysis of the raw lifetime data is presented in order to quantify the data in such a way that is useful for laser system designers. The life tests demonstrate the high level of reliability of these arrays in a number of operating regimes. For single-bar arrays, a MTTF of 19.8 billion shots is predicted. For four-bar samples, a MTTF of 14.6 billion shots is predicted. In addition, data representing a large pump source is analyzed and shown to have an expected lifetime of 13.5 billion shots. This corresponds to an expected operational lifetime of greater than ten thousand hours at repetition rates less than 370 Hz.

  13. Clinical criteria for psychiatric diagnosis and DSM-III.

    PubMed

    Spitzer, R L; Endicott, J; Robins, E

    1975-11-01

    The authors identify the differences in formal inclusion and exclusion criteria used to classify patient data into diagnoses as the largest source of diagnostic unreliability in psychiatry. They describe the efforts that have been made to reduce these differences, particularly the specified criteria approach to defining diagnostic categories, which was developed for research purposes. On the basis of studies showing that the use of specified criteria increases the reliability of diagnostic judgments, they suggest that including such criteria in the next edition of APA's Diagnostic and Statistical Manual of Mental Disorders (DSM-III) would improve the reliability and validity of routine psychiatric diagnosis.

  14. Probabilistic finite elements for fatigue and fracture analysis

    NASA Astrophysics Data System (ADS)

    Belytschko, Ted; Liu, Wing Kam

    Attention is focused on the development of the Probabilistic Finite Element Method (PFEM), which combines the finite element method with statistics and reliability methods, and on its application to linear and nonlinear structural mechanics problems and fracture mechanics problems. A computational tool based on the Stochastic Boundary Element Method is also given for the reliability analysis of curvilinear fatigue crack growth. The existing PFEMs have been applied to solve two types of problems: (1) determination of the response uncertainty in terms of means, variances, and correlation coefficients; and (2) determination of the probability of failure associated with prescribed limit states.

  15. Assessment of Change in Dynamic Psychotherapy

    PubMed Central

    Høglend, Per; Bøgwald, Kjell-Petter; Amlo, Svein; Heyerdahl, Oscar; Sørbye, Øystein; Marble, Alice; Sjaastad, Mary Cosgrove; Bentsen, Håvard

    2000-01-01

    Five scales have been developed to assess changes that are consistent with the therapeutic rationales and procedures of dynamic psychotherapy. Seven raters evaluated 50 patients before and 36 patients again after brief dynamic psychotherapy. A factor analysis indicated that the scales represent a dimension that is discriminable from general symptoms. A summary measure, Dynamic Capacity, was rated with acceptable reliability by a single rater. However, average scores of three raters were needed for good reliability of change ratings. The scales seem to be sufficiently fine-grained to capture statistically and clinically significant changes during brief dynamic psychotherapy. PMID:11069131

  16. Probabilistic finite elements for fatigue and fracture analysis

    NASA Technical Reports Server (NTRS)

    Belytschko, Ted; Liu, Wing Kam

    1992-01-01

    Attention is focused on the development of the Probabilistic Finite Element Method (PFEM), which combines the finite element method with statistics and reliability methods, and on its application to linear and nonlinear structural mechanics problems and fracture mechanics problems. A computational tool based on the Stochastic Boundary Element Method is also given for the reliability analysis of curvilinear fatigue crack growth. The existing PFEMs have been applied to solve two types of problems: (1) determination of the response uncertainty in terms of means, variances, and correlation coefficients; and (2) determination of the probability of failure associated with prescribed limit states.

  17. Reliability of scanning laser acoustic microscopy for detecting internal voids in structural ceramics

    NASA Technical Reports Server (NTRS)

    Roth, D. J.; Baaklini, G. Y.

    1986-01-01

    The reliability of 100 MHz scanning laser acoustic microscopy (SLAM) for detecting internal voids in sintered specimens of silicon nitride and silicon carbide was evaluated. The specimens contained artificially implanted voids positioned at depths of up to 2 mm below the specimen surface. Detection probability of 0.90 at a 0.95 confidence level was determined as a function of material, void diameter, and void depth. The statistical results presented for void detectability indicate some of the strengths and limitations of SLAM as a nondestructive evaluation technique for structural ceramics.
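
    Detection probability at a stated confidence level is conventionally demonstrated with binomial statistics. The sketch below computes the one-sided Clopper-Pearson lower bound on the probability of detection, including the classic 29-of-29 case that establishes 0.90 POD at 95% confidence; this is the standard demonstration method, not necessarily the exact procedure of the study.

    ```python
    from scipy import stats

    def pod_lower_bound(detections, trials, confidence=0.95):
        """One-sided Clopper-Pearson lower bound on probability of detection."""
        if detections == 0:
            return 0.0
        return stats.beta.ppf(1.0 - confidence, detections, trials - detections + 1)

    # The classic benchmark: 29 detections in 29 trials gives POD >= 0.90 at 95% confidence
    print(f"POD lower bound (29/29): {pod_lower_bound(29, 29):.3f}")
    print(f"POD lower bound (28/29): {pod_lower_bound(28, 29):.3f}")
    ```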

  18. Automatic Bone Drilling - More Precise, Reliable and Safe Manipulation in the Orthopaedic Surgery

    NASA Astrophysics Data System (ADS)

    Boiadjiev, George; Kastelov, Rumen; Boiadjiev, Tony; Delchev, Kamen; Zagurski, Kazimir

    2016-06-01

    Bone drilling is a frequent manipulation in orthopaedic surgery. Statistics indicate that about one million people in Europe alone need such an operation every year, in which bone implants are inserted. The drilling is almost always performed by hand, which cannot avoid the influence of subjective factors. The answer to reducing this subjective factor is automatic bone drilling. The specific features and problems of the orthopaedic drilling manipulation are considered in this work. Automatic drilling is presented through the capabilities of the robotized system Orthopaedic Drilling Robot (ODRO) for ensuring the accuracy, precision, reliability, and safety of the manipulation.

  19. Electric propulsion reliability: Statistical analysis of on-orbit anomalies and comparative analysis of electric versus chemical propulsion failure rates

    NASA Astrophysics Data System (ADS)

    Saleh, Joseph Homer; Geng, Fan; Ku, Michelle; Walker, Mitchell L. R.

    2017-10-01

    With a few hundred spacecraft launched to date with electric propulsion (EP), it is possible to conduct an epidemiological study of EP's on orbit reliability. The first objective of the present work was to undertake such a study and analyze EP's track record of on orbit anomalies and failures by different covariates. The second objective was to provide a comparative analysis of EP's failure rates with those of chemical propulsion. Satellite operators, manufacturers, and insurers will make reliability- and risk-informed decisions regarding the adoption and promotion of EP on board spacecraft. This work provides evidence-based support for such decisions. After a thorough data collection, 162 EP-equipped satellites launched between January 1997 and December 2015 were included in our dataset for analysis. Several statistical analyses were conducted, at the aggregate level and then with the data stratified by severity of the anomaly, by orbit type, and by EP technology. Mean Time To Anomaly (MTTA) and the distribution of the time to (minor/major) anomaly were investigated, as well as anomaly rates. The important findings in this work include the following: (1) Post-2005, EP's reliability has outperformed that of chemical propulsion; (2) Hall thrusters have robustly outperformed chemical propulsion, and they maintain a small but shrinking reliability advantage over gridded ion engines. Other results were also provided, for example the differentials in MTTA of minor and major anomalies for gridded ion engines and Hall thrusters. It was shown that: (3) Hall thrusters exhibit minor anomalies very early on orbit, which might be indicative of infant anomalies, and thus would benefit from better ground testing and acceptance procedures; (4) Strong evidence exists that EP anomalies (onset and likelihood) and orbit type are dependent, a dependence likely mediated by either the space environment or differences in thrusters duty cycles; (5) Gridded ion thrusters exhibit both infant and wear-out failures, and thus would benefit from a reliability growth program that addresses both these types of problems.
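
    Anomaly rates of the kind analyzed above are commonly summarized with a point estimate and a chi-square confidence interval under a constant-rate (exponential) assumption; the failure count and exposure below are hypothetical, not the paper's dataset.

    ```python
    from scipy import stats

    def failure_rate_ci(failures, exposure_years, confidence=0.90):
        """Point estimate and two-sided chi-square CI for a constant anomaly rate."""
        a = (1.0 - confidence) / 2.0
        lo = stats.chi2.ppf(a, 2 * failures) / (2.0 * exposure_years)
        hi = stats.chi2.ppf(1.0 - a, 2 * failures + 2) / (2.0 * exposure_years)
        return failures / exposure_years, lo, hi

    # Hypothetical: 12 major EP anomalies over 800 thruster-years on orbit
    rate, lo, hi = failure_rate_ci(12, 800.0)
    print(f"rate = {rate:.4f}/yr (90% CI {lo:.4f}-{hi:.4f}), MTTA ~ {1/rate:.0f} yr")
    ```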

  20. Identifying reliable independent components via split-half comparisons

    PubMed Central

    Groppe, David M.; Makeig, Scott; Kutas, Marta

    2011-01-01

    Independent component analysis (ICA) is a family of unsupervised learning algorithms that have proven useful for the analysis of the electroencephalogram (EEG) and magnetoencephalogram (MEG). ICA decomposes an EEG/MEG data set into a basis of maximally temporally independent components (ICs) that are learned from the data. As with any statistic, a concern with using ICA is the degree to which the estimated ICs are reliable. An IC may not be reliable if ICA was trained on insufficient data, if ICA training was stopped prematurely or at a local minimum (for some algorithms), or if multiple global minima were present. Consequently, evidence of ICA reliability is critical for the credibility of ICA results. In this paper, we present a new algorithm for assessing the reliability of ICs based on applying ICA separately to split-halves of a data set. This algorithm improves upon existing methods in that it considers both IC scalp topographies and activations, uses a probabilistically interpretable threshold for accepting ICs as reliable, and requires applying ICA only three times per data set. As evidence of the method’s validity, we show that the method can perform comparably to more time intensive bootstrap resampling and depends in a reasonable manner on the amount of training data. Finally, using the method we illustrate the importance of checking the reliability of ICs by demonstrating that IC reliability is dramatically increased by removing the mean EEG at each channel for each epoch of data rather than the mean EEG in a prestimulus baseline. PMID:19162199
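
    The split-half idea can be exercised end-to-end on synthetic data: run ICA separately on each half of a recording and match components across halves by the correlation of their scalp maps. The published algorithm also compares activations and applies a probabilistically interpretable acceptance threshold, so treat this only as the core of the approach, sketched with scikit-learn's FastICA.

    ```python
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(5)
    n_ch, n_t = 8, 20000

    # Synthetic "EEG": 8 channels mixing 8 independent (Laplacian) sources
    sources = rng.laplace(size=(8, n_t))
    mixing = rng.normal(size=(n_ch, 8))
    data = mixing @ sources

    # Decompose each half of the recording separately
    halves = [data[:, :n_t // 2], data[:, n_t // 2:]]
    maps = []
    for half in halves:
        ica = FastICA(n_components=8, random_state=0, max_iter=1000)
        ica.fit(half.T)             # samples x channels
        maps.append(ica.mixing_)    # channels x components scalp maps

    # Match components across halves by absolute correlation of scalp maps
    corr = np.abs(np.corrcoef(maps[0].T, maps[1].T)[:8, 8:])
    best = corr.max(axis=1)
    print("per-component best match |r|:", np.round(best, 2))
    print("reliable (|r| > 0.9):", int((best > 0.9).sum()), "of 8")
    ```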
